leventov / Koloboke

Java Collections till the last breadcrumb of memory and performance
https://koloboke.com/

Strange HashLongSet behavior #32

Open krolen opened 9 years ago

krolen commented 9 years ago

The description may be a bit confusing and uninformative, but here is what I ran into:

The following code:

HashLongSet lal = lookALikes.getA(); 

LOG.info("singleLookALikesNoScore() - topResults size is {}, adding more {} results", topResults.size(), lal.size());

int cnt = 0;
long[] followers = topFollowers.toLongArray();
System.out.println("followers.length = " + followers.length);
for (int i = 0; i < followers.length; i++) {
  long follower = followers[i];
  lal.add(follower);
  if(cnt++ % 1000000 ==0) {
    System.out.println("Added, last value is " + follower);
  }
}
System.out.println("lal.size() = " + lal.size());

The resulting logs are interesting:

singleLookALikesNoScore() - topResults size is 1739131, adding more 3260870 results
followers.length = 1739131
Added, last value is 2157902531

After that, the process hangs for about 5 minutes. That is, it prints one line (which means the first value was added) and nothing more. After about 5-8 minutes it prints the rest of the output. Strangely, before that I was easily able to add 5 million and 8 million values to other HashLongSets, yet this one stalls at 3 million.

If I switch from HashLongSet to a regular set, everything works fine.

Ideas?

leventov commented 9 years ago

The given input is insufficient to investigate the problem. Please either share the data so that I can debug it, or run JVisualVM during that 5-8 minute freeze to see what is actually consuming the time. That would narrow the scope of the bug search.
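If attaching JVisualVM is not possible in a restricted environment, a rough alternative is to have the process print its own thread stacks while it is stalled. The following is a minimal sketch using only the standard `java.lang.management` API; the class name `StallDump`, the thread name `stall-watchdog`, and the dump period are illustrative assumptions, not part of Koloboke or of the reporter's code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: prints stack traces of all live threads, similar to a jstack snapshot.
class StallDump {

    /** Builds a textual dump of every live thread's name, state, and stack frames. */
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock/monitor details, which keeps the dump cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Starts a daemon thread that prints a dump to stderr every periodMillis. */
    static Thread startWatchdog(long periodMillis) {
        Thread watchdog = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(periodMillis);
                    System.err.println(dumpAllThreads());
                }
            } catch (InterruptedException e) {
                // Interrupted: stop dumping and let the thread exit.
            }
        }, "stall-watchdog");
        watchdog.setDaemon(true); // must not keep the JVM alive on its own
        watchdog.start();
        return watchdog;
    }

    public static void main(String[] args) {
        // Assumed period of 30 seconds; start this before the loop that stalls.
        startWatchdog(30_000);
        System.out.println(dumpAllThreads());
    }
}
```

If several consecutive dumps show the thread that runs the `add` loop inside the same method, that method is the likely place where the time goes.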

krolen commented 9 years ago

I do understand that the given input is insufficient. I will try to get more data, but I cannot promise it, since this actually happens in an environment to which I have quite limited access. Meanwhile, I wanted to let you know that this does happen; maybe you can think of something in the code that might cause the issue.

2014-11-28 23:36 GMT-05:00 Roman Leventov notifications@github.com:

Given input is insufficient to investigate the problem. Please either share the data so that I could debug or run JVisualVM during that 5-8 min freeze, to see what actually is consuming time, to narrow the bug searching scope.


leventov commented 9 years ago

Thanks for that. I reviewed the related code and don't see what might cause such a long pause. There are two potentially quite long operations (which should still be thousands of times faster than what you observe): a rehash, and changing the free value (zero by default). If the items of the set ("followers") are always positive long values, you might try constructing the set using HashLongSets.getDefaultFactory().withKeysDomain(0, Long.MAX_VALUE).newUpdatableSet() and see what happens in that case.

If there is an issue with changing the free value or with rehashing, I would also note that this functionality is shared by all hash maps with long keys. So using HashLongIntMap (for example) should show the same "freeze". If you could check this, it would be helpful.
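To make the "free value" idea concrete, here is a toy open-addressing set of long keys. It is a sketch under stated assumptions and not Koloboke's implementation: the class `ToyLongSet`, its hash function, its 0.5 load factor, and its strategy for picking a new free value are all invented for illustration. It only shows why inserting a key equal to the current free value forces a pass over the whole table, and why a rehash is another full pass.

```java
import java.util.Arrays;

// Toy illustration only; NOT Koloboke's code. All names here are hypothetical.
class ToyLongSet {
    private long freeValue = 0L;          // marks empty slots; zero by default
    private long[] table = new long[16];  // power-of-two capacity, initially all free (0)
    private int size;
    int freeValueChanges;                 // counts full-table free-value rewrites

    boolean add(long key) {
        if (key == freeValue) {
            changeFreeValue();            // the key collides with the empty-slot marker
        }
        int mask = table.length - 1;
        int i = index(key, mask);
        while (table[i] != freeValue) {   // linear probing
            if (table[i] == key) return false;
            i = (i + 1) & mask;
        }
        table[i] = key;
        if (++size * 2 > table.length) rehash(); // keep the load factor at or below 0.5
        return true;
    }

    boolean contains(long key) {
        if (key == freeValue) return false; // the marker itself is never stored as a key
        int mask = table.length - 1;
        int i = index(key, mask);
        while (table[i] != freeValue) {
            if (table[i] == key) return true;
            i = (i + 1) & mask;
        }
        return false;
    }

    int size() { return size; }

    private static int index(long key, int mask) {
        long h = key * 0x9E3779B97F4A7C15L;  // simple multiplicative mixing
        return (int) (h >>> 32) & mask;
    }

    // Picks a value that is not a stored key, then rewrites every empty slot: O(capacity).
    private void changeFreeValue() {
        long candidate = freeValue;
        do {
            candidate++;                  // naive search, chosen for clarity
        } while (contains(candidate));
        for (int i = 0; i < table.length; i++) {
            if (table[i] == freeValue) table[i] = candidate;
        }
        freeValue = candidate;
        freeValueChanges++;
    }

    // Doubles the table and reinserts every key: also O(capacity).
    private void rehash() {
        long[] old = table;
        table = new long[old.length * 2];
        Arrays.fill(table, freeValue);
        int mask = table.length - 1;
        for (long key : old) {
            if (key == freeValue) continue;
            int i = index(key, mask);
            while (table[i] != freeValue) i = (i + 1) & mask;
            table[i] = key;
        }
    }
}
```

In this toy, a free value chosen outside the range of keys that are ever inserted never has to change. That is presumably the benefit of giving the factory a keys domain such as (0, Long.MAX_VALUE), though the details of how the library uses the hint are an assumption here.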