pip010 / sparsehash

Automatically exported from code.google.com/p/sparsehash
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

constructor get stuck in an infinite loop for some values on a 32-bit system #45

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. sparse_hash_map<int> sp_Map(x) for any x between 1<<31 and (1<<32)-1
2. wait forever!
3. are we done with step 2?

What is the expected output? What do you see instead?
constructor should not get stuck in an infinite loop

Please use labels and text to provide additional information.

The constructor in sparsehashtable.h, calls min_size(num_elts, min_buckets) and 
tries to find the smallest 
power of 2 greater than num_elts. This is done by initializing sz to a default 
value of 32 and repeatedly 
multiplying sz by 2 until we find a value which can hold buckets (with af fudge 
factor).

if num_elts is greater than 2^31 the result for sz should be 2^32. On 32-bit 
systems sz overflows to 0 
and gets stuck in an infinite loop since sz *= 2 doesn't modify sz.

Theoretically this is also a problem on 64 bit systems, but I doubt anyone 
would want a hashtable of size 
2^63 (yet).

The code for min_size, taken from trunk
  size_type min_size(size_type num_elts, size_type min_buckets_wanted) {
    size_type sz = HT_MIN_BUCKETS;
    while ( sz < min_buckets_wanted || num_elts >= sz * enlarge_resize_percent )
      sz *= 2;
    return sz;
  }

Original issue reported on code.google.com by pmels...@gmail.com on 9 Dec 2009 at 8:27

GoogleCodeExporter commented 9 years ago
Definitely a problem!  But I'm not sure what the 'right' behavior here would 
be.  sz 
can't be 2^32 because that's too big to fit in 32 bits.

But I'm confused how this could occur in practice.  Do you have a machine where 
size_t is 32 bits but you can address more than 32 bits of memory?  If not, how 
are 
youu planning to get more than 2^31 elements in your hashtable?

I'm trying to understand what the right thing to do here is.  I could make sz 
always 
be 64 bits, but that seems wrong -- it should really be size_type.  I guess I 
could 
just make construction fail somehow if you pass in something > |size_type| / 2?

What particular use-case brought this up?

Original comment by csilv...@gmail.com on 10 Dec 2009 at 12:47

GoogleCodeExporter commented 9 years ago
I'm not sure either what the right behaviour should be. When inserting a delete 
key the hashtable fails with an 
assert, perhaps this would also be an option here.

This came up by accident, I was initializing the hashtable with an unspecified 
value (doh!) and it happened to 
be initialized with 3.2 billion. The program was unresponsive and I had no idea 
why until I debugged it, 
getting a crash would have told me where the problem was.

I don't think there is a valid use case for asking for more than 2^31 elements 
in your table, since they would 
have to occupy 4 bytes each (otherwise there aren't enough distinct ones to 
fill the table), in which case you've 
exhausted the memory. So in my opinion, failing with an assert is the best 
thing to do in this situation. 

Original comment by pmels...@gmail.com on 10 Dec 2009 at 5:02

GoogleCodeExporter commented 9 years ago
Gotcha.  Makes sense to me.  I'll add an overflow assert() in the next release, 
so at
least in debug mode you'll die rather than go into an infinite loop.

Original comment by csilv...@google.com on 10 Dec 2009 at 7:18

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 10 Dec 2009 at 7:19

GoogleCodeExporter commented 9 years ago
This should be resolved in sparsehash 1.6, just released.

Original comment by csilv...@gmail.com on 11 Jan 2010 at 10:53