Closed indigox4 closed 7 years ago
Hi there,
Thank you using sparsepp, and for for taking the time to report this issue. Indeed iterators may be invalidated when an item is inserted into sparsepp, and that is expected behavior, but I should have stated it clearly in the readme.md (which I will update).
Please note that inserting an item into a std::unordered_map may also invalidate iterators, albeit only if this causes a resize. As stated in the doc for std::unordered_map:
"If rehashing occurs due to the insertion, all iterators are invalidated. Otherwise iterators are not affected. References are not invalidated. Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count()."
Using the temp variable is a good solution. I think insert would also work as the parameters would be evaluated first:
map.insert({keyB, (*iter).second});
Ah ok, that makes sense. I was replacing unordered_map in some third-party code and it never occurred to me the existing code could be mis-using the STL.
Just by changing a few unordered_maps to sparse_hash_maps and fixing that insertion bug, I've managed to speed up our code by about 1.5x, and reduce our memory consumption by 20% as well, so thanks for that!
Well, the invalidation of iterators when inserting will occur more often with sparsepp, so I certainly should mention it in the readme. Really glad to hear that you are happy with sparsepp. Do you use it on windows or linux?
Sounds good, thanks again.
We release our app on Windows, OSX, and Linux, but all of the performance testing I did was on Windows 7.
-------- Original message -------- From: Gregory Popovitch notifications@github.com Date: 2016-12-13 11:27 AM (GMT-05:00) To: greg7mdp/sparsepp sparsepp@noreply.github.com Cc: Stephen indigox3@hotmail.com, Author author@noreply.github.com Subject: Re: [greg7mdp/sparsepp] Possible iterator bug? (#19)
Well, the invalidation of iterators when inserting will occur more often with sparsepp, so I certainly should mention it in the readme. Really glad to hear that you are happy with sparsepp. Do you use it on windows or linux?
- You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/greg7mdp/sparsepp/issues/19#issuecomment-266786041, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ASm-Mv3eam3GkhbwDxWFJF1nyJKafZEFks5rHsdBgaJpZM4LK5gg.
OK. Then you may want to watch for a new version in a couple weeks. I have noticed that on windows, the memory saving is not as good as it should be because the windows default allocator (malloc) fragments the heap and has a very high overhead... so I'm currently implementing a special allocator for sparsepp which will solve this issue. With that new version, the memory savings will be much higher on Windows, and on par with what is currently seen on linux.
Hi and thanks for sharing the code,
I wonder if there has been any progress on this:
I'm currently implementing a special allocator for sparsepp which will solve this issue. With that new version, the memory savings will be much higher on Windows, and on par with what is currently seen on linux
Thanks
Michael
Working on it. Sorry it is taking longer than I had forecast, there are lots of little details to get right. The allocator is working great. I'm now integrating it with sparsepp but that probably will still take a couple weeks.
@mlmsft , Just curious, are you using sparsepp already or is this for a future project?
We are experimenting with it as an alternative to unordered_map, which is a true memory hog. At the moment, we are observing about 20% improvement due to sparsepp, which is already good, but a bit behind original expectations. So,we are looking forward to see sparsepp Windows performance catching up with the Linux version. Incidentally, sparsepp is somewhat difficult to debug in VS, but it will be a small price to pay if it cuts memory requirements.
Thanks
Michael
Great, I think you'll be very pleased with the memory usage with the custom allocator. Why do you say it is difficult to debug in VS? Is it because the code is somewhat cryptic and hard to follow, or is there a specific issue? I myself debug it in VS all the time.
The issue with debugging I was talking about is how VS displays hash contents. Attached is a snapshot showing contents of unordered_map and , below it, sparsepp: In the first case, we see an associative array, in the second case, the view is a bit cryptic.
Oh, thanks for explaining the issue, clearly this is not helpful at all. I'll look into creating a natvis file, so that the debugger can display sparsepp objects in a more useful fashion. Hopefully it wont be too difficult.
Terrific. Looking forward to seeing all those changes in the library. Thanks for being so responsive!
Michael
Hi Michael,
I just added a spp.natvis file which allows user-friendly visualization of sparse_hash_map/set. I Have tested it with Visual Studio 2015, but it may work for 2012/2013.
Yes, visualization works exactly like the built-in implementation now. It might be a small addition, but probably very helpful for VS developers using your library. I have VS 2015, just like you do, so I can only confirm that it works there. Thank you!
Michael
You are welcome!
Hi Greg,
Any updates on memory allocator for Windows?
Thanks
Michael
I am still working on the memory allocator. One thing you could try is to download the file "malloc.c" from http://g.oswego.edu/dl/html/malloc.html, and just add it to your project. It is the excellent allocator by Doug Lea. I would be very curious to hear jow it impacts the memory usage of your app.
@mlmsft , just checked in the version with the new allocator. It is used by default on Windows.Give it a try and let me know what you think!
Hi Greg, Thanks for checking in the update. I didn't get to see its performance yet, mostly due to compilation issues. Few things that caught my (and Visual Studio's) attention:
in spp_dlalloc.h which is likely to collide with preprocessor definitions, especially because many people use #ifdef to see if they are in debug mode. Maybe give the macro a different name?
Best
Michael
Michael, thank you for the quick feedback!
Hi @mlmsft , I just checked in a fix for the first two issues. Hopefully it will build for you now.
I'm replacing some usage of
std::unordered_map
withsparse_hash_map
and I've encountered an occasional issue with iterators.With this code, keyB will occasionally map to some garbage values (0xdddddddd, indicating deleted memory), and not the same value as keyA, which is what I would expect. The same code works fine with
std::unordered_map
.If I store keyA's value in a temp variable everything works as expected:
(Unfortunately, I can't reproduce this issue with a simple test case, only our internal code.)
Using the temp variable is fine for my usage, just curious if what I'm seeing is a bug, or expected behaviour. I saw in the readme that calls to
erase()
can invalidate iterators. Is the same true with insertions as well?