Add _unsafe version of lazy_emplace - Githubissues

greg7mdp / parallel-hashmap

A family of header-only, very fast and memory-friendly hashmap and btree containers.

https://greg7mdp.github.io/parallel-hashmap/

Apache License 2.0

2.53k stars 239 forks source link

Add _unsafe version of lazy_emplace #164

Closed bpmckinnon closed 2 years ago

bpmckinnon commented 2 years ago

Hello, I'm looking to add a lock bypassing version of lazy_emplace but I'm seeing a few ways that it's been done for the other functions so I wanted to clarify the preferred way of doing this. I can either make an _unsafe version of the function, or a _impl version of the function that exposes the lock as a template parameter. Which do you prefer?

greg7mdp commented 2 years ago

Just curious, why do you need a lock bypassing version of lazy_emplace?

bpmckinnon commented 2 years ago

I'm reconstructing a lookup table that maps a hash of binary data to a location in a buffer. I know each entry is unique, so this one would actually be better with an emplace, but the emplace seems a bit more complicated to modify. The remap is done at load time, so I'm single threaded, but after that the hashmap is used from multiple threads.

greg7mdp commented 2 years ago

I'm wondering if it would not be better to be able to turn off/on all locking in a container by calling a function, which would set a bool, which would be checked in LockableBaseImpl.

greg7mdp commented 2 years ago

or maybe you could use map[key] = value which doesn't lock.

bpmckinnon commented 2 years ago

That is definitely easiest way to do it. I didn't realize that didn't use the lock... using the would remove the need to have any of those _unsafe functions

greg7mdp commented 2 years ago

This can't use the lock because map[key] returns a reference.

bpmckinnon commented 2 years ago

I've check this, and operator[] is calling into parallel_hash_map::try_emplace_impl which is locking the lock.

greg7mdp commented 2 years ago

Oh, my bad. I think the best way forward is to add a flag to the phmap which, when set, would disable internal locking. I can do it, but probably not before next weekend.

bpmckinnon commented 2 years ago

Would it make sense to make a hard split between functions that are thread-safe and functions that are not? So anything that returns a reference or iterator does not use the lock, anything that works within the functors is done behind a lock.

greg7mdp commented 2 years ago

I don't think it would be very useful, since in a multi-threaded context you wouldn't want to use these functions, whether they use a lock or not, as they are intrinsically unsafe.

bpmckinnon commented 2 years ago

I've posted a proof concept for adding a bypass_lock function to the parallel_flat_hash_map. I'm not sure that it's better than the bool, but I'm also not total sure about making the phmap's stateful without some kind of scoping mechanism. Let me know what you think, and if it looks like a good idea I can finalize it. https://github.com/bpmckinnon/parallel-hashmap/tree/implement_bypass_lock

bpmckinnon commented 2 years ago

Just wanted to follow up on what you thought about this change?

greg7mdp commented 2 years ago

Hey @bpmckinnon , sorry for not responding for so long. This is interesting, but I would have to review it carefully. I'm also not sure it is better than the bool (for which we could provide a raii class temporarily disable locking). Did you use you solution yourself successfully?

greg7mdp commented 2 years ago

Also maybe measure whether the savings (of disabling locking) are significant. It may not make a big difference.

bpmckinnon commented 2 years ago

It does work. I'll verify the timings.

bpmckinnon commented 2 years ago

I ran it on my biggest test set a bunch of times and the best results were 13.2sec for the locking version and 10.7sec for the bypass lock version with a range of around +.5sec for both. So a 20%-ish improvement.

greg7mdp commented 2 years ago

OK, so it is a worthwhile endeavor, thanks for doing the measurements! I thought about the best way to support your use case and I think I came up with a much simpler solution. Why about making swap() a template so you can swap two parallel_hash_set containers which have different mutex types?

greg7mdp commented 2 years ago

Made the changes in this branch... what do you think?

bpmckinnon commented 2 years ago

It looks good to me

greg7mdp commented 2 years ago

Cool, I'll merge it then. Thanks for pushing me to make the parallel containers more usable.

bpmckinnon commented 2 years ago

Hey Greg, would it be possible to get a new release with these changes?

greg7mdp commented 2 years ago

Sure, will do it later today.

greg7mdp commented 2 years ago

Hey @bpmckinnon release is out https://github.com/greg7mdp/parallel-hashmap/releases/tag/1.36!

bpmckinnon commented 2 years ago

Awesome! Thanks!

bpmckinnon commented 2 years ago

Are the package releases to conan.io done automatically?

greg7mdp commented 2 years ago

Ha, no, I'll have to do that. I just did vcpkg but I had a weird issue building the package on windows (one gtest undefined symbol).

greg7mdp commented 2 years ago

https://github.com/conan-io/conan-center-index/pull/13104

greg7mdp commented 2 years ago

https://github.com/conan-io/conan-center-index/pull/13104

bpmckinnon commented 2 years ago

Awesome! I'm ready to roll! Thanks!

jrcavani commented 1 year ago

Hi @greg7mdp , does this mean the pattern

map[key] = val;

actually is thread-safe? I noticed your answer from 2020 mentioning the same thread-unsafeness but wonder if this has changed and = assignment is now thread-safe.

greg7mdp commented 1 year ago

@jrcavani This is still not thread-safe as the reference returned by operator[]() will be invalid for example if another thread inserts a value in the phmap causing it to resize.

jrcavani commented 1 year ago

I see... It's pretty subtle but there is a difference: two steps: 1. [] returns reference, 2. reference gets used somehow, which could break down for reads and writes. So even 1 is ok, 2 is not.

Sometimes it's hard to see how simple operations like map[key]++ or map[key] = val can cause a segfault and it's hard to reproduce, but it's good to remember the fine prints.

From an uninitiated perspective (mine before digging in), I would think as long as threads are not writing to overlapping keys, the references would be fine, and locking was really for those overlapping key cases, but apparently any modification to the map could invalidate other references, is that true?

bpmckinnon commented 1 year ago

With any unprotected operation in a multi-threaded environment there is a risk of working with invalid data. It could be deleted or modified by another thread. The biggest risk is that the buffer is resized on a new insertion. The parallel map makes this less likely but it does still happen.

greg7mdp commented 1 year ago

yes @bpmckinnon is correct.

jrcavani commented 1 year ago

Thank you both for the clarification. This is unrelated with pointer stability of the data inside each value? e.g. if I use a phmap::parallel_node_hash_map() in a multithreaded environment, and with a lock, take out a pointer to the data inside the value part, it is guaranteed that, if no other threads have erased or overwritten that very key, the pointer is valid.

greg7mdp commented 1 year ago

@jrcavani this is true only if no other thread inserts new values into the hash map.

jrcavani commented 1 year ago

Oh... That's rather confusing.

The flat hash maps will move the keys and values in memory. So if you keep a pointer to something inside a flat hash map, this pointer may become invalid when the map is mutated. The node hash maps don't, and should be used instead if this is a problem.

What's different from this statement? Is it the difference between insertion and all other mutations? And does the parallel_node_hash_map have the same level of pointer stability guarantee as std::unordered_map?

greg7mdp commented 1 year ago

No, std::unordered_map guarantees pointer stability. None of the flat phmap containers do (because they use open addressing which is faster), but requires that the values stored are moved to a different memory location when the container resizes.

jrcavani commented 1 year ago

Yeah I understand that. How about the node hash maps?

greg7mdp commented 1 year ago

The values don't move in memory, at the cost of an additional indirection. It is fine if the inserted values are large.

jrcavani commented 1 year ago

OK. Thank you! Sorry for hijacking the closed thread for the pointer question @bpmckinnon. It's too fascinating/critical :).