Closed bpmckinnon closed 2 years ago
Just curious, why do you need a lock bypassing version of lazy_emplace?
I'm reconstructing a lookup table that maps a hash of binary data to a location in a buffer. I know each entry is unique, so this one would actually be better with an emplace, but the emplace seems a bit more complicated to modify. The remap is done at load time, so I'm single threaded, but after that the hashmap is used from multiple threads.
I'm wondering if it would not be better to be able to turn off/on all locking in a container by calling a function, which would set a bool, which would be checked in LockableBaseImpl.
Or maybe you could use map[key] = value, which doesn't lock.
That is definitely the easiest way to do it. I didn't realize that didn't use the lock... using that would remove the need to have any of those _unsafe functions.
This can't use the lock because map[key] returns a reference.
I've checked this, and operator[] calls into parallel_hash_map::try_emplace_impl, which does take the lock.
Oh, my bad. I think the best way forward is to add a flag to the phmap which, when set, would disable internal locking. I can do it, but probably not before next weekend.
Would it make sense to make a hard split between functions that are thread-safe and functions that are not? So anything that returns a reference or iterator does not use the lock, anything that works within the functors is done behind a lock.
I don't think it would be very useful, since in a multi-threaded context you wouldn't want to use these functions, whether they use a lock or not, as they are intrinsically unsafe.
I've posted a proof of concept for adding a bypass_lock function to the parallel_flat_hash_map. I'm not sure that it's better than the bool, but I'm also not totally sure about making the phmaps stateful without some kind of scoping mechanism. Let me know what you think, and if it looks like a good idea I can finalize it. https://github.com/bpmckinnon/parallel-hashmap/tree/implement_bypass_lock
Just wanted to follow up on what you thought about this change?
Hey @bpmckinnon, sorry for not responding for so long. This is interesting, but I would have to review it carefully. I'm also not sure it is better than the bool (for which we could provide a RAII class to temporarily disable locking). Did you use your solution yourself successfully?
Also maybe measure whether the savings (of disabling locking) are significant. It may not make a big difference.
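For illustration, the "bool plus RAII class" idea could look roughly like the sketch below. This is a minimal toy under stated assumptions, not phmap code: `ToyLockedMap`, `ScopedLockBypass` and `load_single_threaded` are hypothetical names invented here, and the flag is a plain `bool`, so it may only be toggled while a single thread owns the container.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical toy container (NOT phmap's API): every insert takes the
// internal mutex unless locking has been switched off by the caller.
class ToyLockedMap {
public:
    void set_locking(bool enabled) { locking_enabled_ = enabled; }
    bool locking_enabled() const { return locking_enabled_; }

    void insert(int key, int value) {
        if (locking_enabled_) {
            std::lock_guard<std::mutex> guard(mutex_);
            data_[key] = value;
        } else {
            // Caller has promised that only one thread touches the map.
            data_[key] = value;
        }
    }

    std::size_t size() const { return data_.size(); }

private:
    std::unordered_map<int, int> data_;
    std::mutex mutex_;
    bool locking_enabled_ = true;  // plain bool: toggle only while single-threaded
};

// Hypothetical RAII helper: disables locking for the lifetime of the guard
// and restores the previous setting on scope exit, even if an exception is thrown.
class ScopedLockBypass {
public:
    explicit ScopedLockBypass(ToyLockedMap& map)
        : map_(map), previous_(map.locking_enabled()) {
        map_.set_locking(false);
    }
    ~ScopedLockBypass() { map_.set_locking(previous_); }

    ScopedLockBypass(const ScopedLockBypass&) = delete;
    ScopedLockBypass& operator=(const ScopedLockBypass&) = delete;

private:
    ToyLockedMap& map_;
    bool previous_;
};

// Single-threaded load phase: fill the map without paying for the mutex.
inline std::size_t load_single_threaded(ToyLockedMap& map, int count) {
    ScopedLockBypass bypass(map);
    for (int i = 0; i < count; ++i) {
        map.insert(i, i * 2);
    }
    return map.size();
}  // locking is back on here, before other threads are started
```

The scoping is what addresses the "stateful container" worry: the unlocked state cannot leak past the block that asked for it.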
It does work. I'll verify the timings.
I ran it on my biggest test set a bunch of times and the best results were 13.2 sec for the locking version and 10.7 sec for the bypass lock version, with a range of around +0.5 sec for both. So roughly a 20% improvement.
OK, so it is a worthwhile endeavor, thanks for doing the measurements! I thought about the best way to support your use case and I think I came up with a much simpler solution. What about making swap() a template so you can swap two parallel_hash_set containers which have different mutex types?
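To make the idea concrete, here is a minimal sketch of a swap between containers that differ only in their mutex type. It is a toy, not the phmap implementation: `ToyMap`, `NoOpMutex` and `load_into` are hypothetical names, and the real containers carry more template parameters. In phmap terms this would presumably correspond to filling a map whose mutex parameter is `phmap::NullMutex` and swapping it into one that uses `std::mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a "null" mutex: satisfies the Lockable
// requirements but does nothing.
struct NoOpMutex {
    void lock() {}
    bool try_lock() { return true; }
    void unlock() {}
};

// Toy map parameterized on its mutex type (NOT phmap's class).
template <class Mutex>
class ToyMap {
public:
    void insert(int key, int value) {
        std::lock_guard<Mutex> guard(mutex_);
        data_[key] = value;
    }

    // Unlocked for brevity; call only when no writer is active.
    std::size_t size() const { return data_.size(); }

    // Swap contents with a map that uses a different mutex type.
    template <class OtherMutex>
    void swap(ToyMap<OtherMutex>& other) {
        if (static_cast<const void*>(this) == static_cast<const void*>(&other)) {
            return;  // self-swap: nothing to do, and locking twice would deadlock
        }
        std::lock(mutex_, other.mutex_);  // acquire both without deadlock
        std::lock_guard<Mutex> mine(mutex_, std::adopt_lock);
        std::lock_guard<OtherMutex> theirs(other.mutex_, std::adopt_lock);
        data_.swap(other.data_);
    }

private:
    template <class> friend class ToyMap;  // lets swap() reach other.data_
    std::unordered_map<int, int> data_;
    Mutex mutex_;
};

// Build without locking overhead, then hand the contents to the shared map.
inline void load_into(ToyMap<std::mutex>& shared, int count) {
    ToyMap<NoOpMutex> staging;
    for (int i = 0; i < count; ++i) {
        staging.insert(i, i);
    }
    shared.swap(staging);
}
```

The appeal of this design is that the container itself stays stateless: the choice between locking and not locking remains a compile-time property of each map, and only the handoff point needs care.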
It looks good to me
Cool, I'll merge it then. Thanks for pushing me to make the parallel containers more usable.
Hey Greg, would it be possible to get a new release with these changes?
Sure, will do it later today.
Hey @bpmckinnon release is out https://github.com/greg7mdp/parallel-hashmap/releases/tag/1.36!
Awesome! Thanks!
Are the package releases to conan.io done automatically?
Ha, no, I'll have to do that. I just did vcpkg but I had a weird issue building the package on windows (one gtest undefined symbol).
Awesome! I'm ready to roll! Thanks!
@jrcavani This is still not thread-safe, as the reference returned by operator[]() will be invalidated if, for example, another thread inserts a value in the phmap, causing it to resize.
I see... It's pretty subtle, but there are two steps: 1. [] returns a reference, 2. the reference gets used somehow, which could break down for both reads and writes. So even if 1 is OK, 2 is not.
Sometimes it's hard to see how simple operations like map[key]++ or map[key] = val can cause a segfault, and it's hard to reproduce, but it's good to remember the fine print.
From an uninitiated perspective (mine before digging in), I would think as long as threads are not writing to overlapping keys, the references would be fine, and locking was really for those overlapping key cases, but apparently any modification to the map could invalidate other references, is that true?
With any unprotected operation in a multi-threaded environment there is a risk of working with invalid data. It could be deleted or modified by another thread. The biggest risk is that the buffer is resized on a new insertion. The parallel map makes this less likely but it does still happen.
yes @bpmckinnon is correct.
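The usual way around this is to never let a reference escape the lock, and instead pass the operation in as a functor that runs while the lock is held. Below is a minimal sketch of that pattern. `ToyCounterMap`, `with_value`, `get` and `count_hits` are hypothetical names for illustration only, not phmap functions, and the demo is single-threaded to keep it short.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_map>

// Hypothetical toy wrapper (NOT phmap's API).
class ToyCounterMap {
public:
    // Runs fn(value) while the lock is held. The reference handed to fn
    // must not be stored or used after fn returns.
    template <class Fn>
    void with_value(int key, Fn&& fn) {
        std::lock_guard<std::mutex> guard(mutex_);
        fn(data_[key]);  // default-constructs the value if the key is missing
    }

    // Returns a copy, so nothing dangles once the lock is released.
    int get(int key) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = data_.find(key);
        return it == data_.end() ? 0 : it->second;
    }

private:
    std::unordered_map<int, int> data_;
    std::mutex mutex_;
};

// Equivalent of `map[key]++`, with lookup and increment under one lock.
inline int count_hits(ToyCounterMap& map, int key, int hits) {
    for (int i = 0; i < hits; ++i) {
        map.with_value(key, [](int& value) { ++value; });
    }
    return map.get(key);
}
```

With this shape, both steps described above (obtaining the reference and using it) happen inside the same critical section, so a concurrent insert cannot resize the storage in between.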
Thank you both for the clarification. Is this unrelated to pointer stability of the data inside each value? E.g. if I use a phmap::parallel_node_hash_map() in a multithreaded environment and, with a lock, take out a pointer to the data inside the value part, is it guaranteed that the pointer stays valid as long as no other thread has erased or overwritten that very key?
@jrcavani this is true only if no other thread inserts new values into the hash map.
Oh... That's rather confusing.
The flat hash maps will move the keys and values in memory. So if you keep a pointer to something inside a flat hash map, this pointer may become invalid when the map is mutated. The node hash maps don't, and should be used instead if this is a problem.
What's different from this statement? Is it the difference between insertion and all other mutations? And does the parallel_node_hash_map have the same level of pointer stability guarantee as std::unordered_map?
No, std::unordered_map guarantees pointer stability. None of the flat phmap containers do, because they use open addressing, which is faster but requires that the stored values be moved to a different memory location when the container resizes.
Yeah I understand that. How about the node hash maps?
The values don't move in memory, at the cost of an additional indirection. It is fine if the inserted values are large.
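The flat versus node distinction can be observed with standard containers alone. In the sketch below, `std::vector` is only a stand-in for contiguous "flat" storage and `std::unordered_map` for node-based storage; neither is phmap code, and both function names are made up for this example. The vector's address is captured as an integer before growth so that no invalidated pointer is ever read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Flat storage: elements live in one buffer, so growing the buffer
// relocates every element. Returns true if the first element moved.
inline bool flat_storage_moves() {
    std::vector<int> flat;
    flat.push_back(42);
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(flat.data());
    const std::size_t old_capacity = flat.capacity();
    while (flat.capacity() == old_capacity) {
        flat.push_back(0);  // keep inserting until a reallocation happens
    }
    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(flat.data());
    return before != after;
}

// Node storage: each element is allocated separately, so a rehash only
// rearranges the bucket array. Returns true if the element moved.
inline bool node_storage_moves() {
    std::unordered_map<int, int> nodes;
    nodes[1] = 42;
    const int* before = &nodes[1];
    for (int i = 2; i < 10000; ++i) {
        nodes[i] = i;  // triggers several rehashes
    }
    return before != &nodes[1];
}
```

The stable pointer in the node case is paid for with one extra allocation and one extra indirection per element, which is the trade-off described above.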
OK. Thank you! Sorry for hijacking the closed thread for the pointer question @bpmckinnon. It's too fascinating/critical :).
Hello, I'm looking to add a lock bypassing version of lazy_emplace but I'm seeing a few ways that it's been done for the other functions so I wanted to clarify the preferred way of doing this. I can either make an _unsafe version of the function, or a _impl version of the function that exposes the lock as a template parameter. Which do you prefer?