lamuguo / re2

Automatically exported from code.google.com/p/re2
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

dense_ being resized on construction in sparse_array.h #98

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. http://code.google.com/p/re2/source/browse/util/sparse_array.h#422 
dense_.resize()
2. http://code.google.com/p/re2/source/browse/util/sparse_array.h#420 new 
int[max_size]
3. Both [1] and [2] initialize the memory to their respective initial or empty 
values.

What is the expected output? What do you see instead?

You should not initialize memory till you actually need it as mentioned in 
http://research.swtch.com/sparse
It doesn't seem that we are saving on the initialization cost by using 
sparse_array.

Specifically, this comment at the top of the file seems misleading:

"// Allocating the array is a constant time operation"
"// when memory allocation is a constant time operation."

In this case, allocating the array isn't a constant time operation, since the 
constructor for Value is invoked as part of resizing dense_.

What version of the product are you using? On what operating system?

Doesn't matter (this is in the source code).

Please provide any additional information below.
NOTE: If you have a suggested patch, please see
http://code.google.com/p/re2/wiki/Contribute
for information about sending it in for review.  Thanks.

Original issue reported on code.google.com by dhruvb...@gmail.com on 24 Dec 2013 at 7:34

GoogleCodeExporter commented 9 years ago
This should only matter if you use SparseArray<X> where X has a constructor. We 
use Thread* and int in this source code. If you want to adapt it for use in 
another project, that's fine, but it doesn't seem to matter here.

Original comment by rsc@golang.org on 10 Jan 2014 at 1:42

GoogleCodeExporter commented 9 years ago
Not really. If you follow the calls on dense_.resize(), you get this:

resize()
_M_default_append()
__uninitialized_default_n_a()
_Construct_default_a_impl()
::new((void *)__p) _Tp(std::forward<_Args>(__args)...)

And the call to __uninitialized_default_n_a() includes a for-loop that performs 
O(n) iterations over the new elements to default construct them, so you end up 
paying the cost there. This becomes noticeable when the array is very sparse.

Original comment by dhruvb...@gmail.com on 10 Jan 2014 at 4:03

GoogleCodeExporter commented 9 years ago
But Thread* and int have no default constructor. What work can it possibly be 
doing?

Original comment by rsc@swtch.com on 10 Jan 2014 at 4:18

GoogleCodeExporter commented 9 years ago
Just executing the for loop 'N' times involves O(N) comparisons and O(N) 
integer increments. That's work.

Also, default initialization of pointer and integral types involves zeroing out 
the memory, which means that in a demand paged system, you invoke the demand 
paging code, which ends up assigning real memory to the virtual addresses you 
initialize with zeros.

Original comment by dhruvb...@gmail.com on 10 Jan 2014 at 10:18

GoogleCodeExporter commented 9 years ago
Are you saying that if I have

vector<int> v;
v.resize(10);

that the now 10 entries in v are guaranteed to be zeroed? Because I don't
believe that.

Russ

Original comment by rsc@golang.org on 10 Jan 2014 at 2:37

GoogleCodeExporter commented 9 years ago
It seems that all entries in v are guaranteed to be zeroed.

1. 
http://stackoverflow.com/questions/3866202/vector-resize-automatic-fill/3866422#
3866422
2. http://stackoverflow.com/a/7413968/155951

Original comment by dhruvb...@gmail.com on 10 Jan 2014 at 3:40

GoogleCodeExporter commented 9 years ago
And: http://stackoverflow.com/a/13029377/155951

Original comment by dhruvb...@gmail.com on 10 Jan 2014 at 3:40