greg7mdp / parallel-hashmap

A family of header-only, very fast and memory-friendly hashmap and btree containers.
https://greg7mdp.github.io/parallel-hashmap/
Apache License 2.0
2.53k stars 239 forks source link

Suggestion: lazy_emplace_l without last argument #207

Closed JonathanFry3 closed 1 year ago

JonathanFry3 commented 1 year ago

Surely most users have no need of the third argument to lazy_emplace_l, a lambda called when the search key is not matched to construct the contained object. I certainly did not, and spent a few hours figuring out how to fake it.

I suggest a separate call without that argument that just refers to lazy_emplace_l with "[&](const constructor & ctor){ctor(key);}" as its third argument.

I further suggest giving it the name "emplace_l", since without that third argument, the "lazy" part makes no sense.

I have all this built and tested in my clone.

greg7mdp commented 1 year ago

Hi @JonathanFry3 , thanks for the feedback! Would modify_if work for your use case?

JonathanFry3 commented 1 year ago

Unfortunately, no. If I can't insert or modify depending on whether the key is new in one call, some other thread will get there before my second call.

greg7mdp commented 1 year ago

Wait! So you need the third argument of lazy_emplace_l, otherwise how do you insert?

JonathanFry3 commented 1 year ago

That's why the implementation I suggested calls lazy_emplace_l using to use whatever constructor works with the key as its sole argument (the copy constructor, in my case). This whole suggestion is mostly a convenience improvement, not a functional one. The name is suggested for clarity.

JonathanFry3 commented 1 year ago

Actually, I think insert_l might be a better name, since emplace() take a parameter pack as its argument and this function can't do that. I think the "_l" suffix means "the call you want to use in a multithread environment if you may need to modify an object that is already in the table." Otherwise, this function (the two-argument one) is just insert(key), or it would be if it returned an iterator. Since an iterator is useless in that environment, I can stomach a different return type.

greg7mdp commented 1 year ago

Thanks, got it. I don't think it is very useful as you most likely want to construct an entry that makes sense, not one with the default mapped_value. I already feel that the number of extension functions is getting out of hand, and I don't want to add new ones for special cases.

JonathanFry3 commented 1 year ago

The code constructs an entry by copying the key. Won't that usually be a fully constructed map_value?

greg7mdp commented 1 year ago

For the set version, you are right that the key_type and value_type are the same. For the map version, there is a key_type and a mapped_type, and the mapped_type would have to be default constructed as it is not passed to lazy_emplace_l.

JonathanFry3 commented 1 year ago

But lazy_emplace_l is a parallel_flat_hash_set function. I haven't tried it; does it even work with a map?

greg7mdp commented 1 year ago

Yes, it does work with a map, see the example lazy_emplace_l.cc. You are correct that, for a set, that third argument is less useful, although it sometimes is.

JonathanFry3 commented 1 year ago

Agreed. And if the user has never had need of it, it can be a bit baffling.

greg7mdp commented 1 year ago

I do appreciate the suggestion, but I'm not sure adding yet another API would be a good thing, so I'll close this for now.