greg7mdp / sparsepp

A fast, memory efficient hash map for C++
Other
1.24k stars 123 forks source link

feedback please! #17

Closed greg7mdp closed 3 years ago

greg7mdp commented 7 years ago

This is not a real issue, but just a space to provide feedback on sparsepp, good or bad. I'd love to hear what you think. Thank you in advance.

anthony-royal-bisimulations commented 5 years ago

Is there a possibility of updating your iterator class to remove the warning from VS2017? They are deprecating inheriting from the std:: iterator class.

greg7mdp commented 5 years ago

Is there a possibility of updating your iterator class to remove the warning from VS2017? They are deprecating inheriting from the std:: iterator class.

@anthony-royal-bisimulations , what warning do you get? I don't see it with VS2017 (version 15.9.4)

greg7mdp commented 5 years ago

Is there a way to use this from golang? thank you

Not that I know, sorry.

anthony-royal-bisimulations commented 5 years ago

what warning do you get? I don't see it with VS2017 (version 15.9.4)

Warning C4996 'std::iterator<iter_type,T,ptrdiff_t,_Ty *,_Ty &>::iterator_category': warning STL4015: The std::iterator class template (used as a base class to provide typedefs) is deprecated in C++17. (The header is NOT deprecated.) ...

It says to use a define to ignore the deprecation warnings which we are doing for now since we compile with warnings as errors.

We are using version 1.22 with the CPP17 preprocessor which enables more c++ 17 features. We are using VS2017 15.6.4.

anthony-royal-bisimulations commented 5 years ago

Its specifically pointing to class Two_d_iterator : public std::iterator<iter_type, T>

greg7mdp commented 5 years ago

I don't know how to reproduce the issue, but I should be able to fix it anyways - will try to get to it today. In the meanwhile you could add #pragma warning(disable : 4996). in your file.

greg7mdp commented 5 years ago

@anthony-royal-bisimulations - warning should be fixed. Thanks for reporting the issue.

ekg commented 5 years ago

I'm interested on your thoughts on https://attractivechaos.wordpress.com/2018/01/13/revisiting-hash-table-performance/. This benchmark suggests that klib's khash is much faster than sparsepp with lower memory usage. If this benchmark isn't flawed, how could sparsepp be improved to match khash's performance?

greg7mdp commented 5 years ago

Hum, maybe I am misunderstanding the graphs, but sparsepp seems to have about the same memory usage as khash.

One thing where sparsepp shines is not the memory usage itself, but the memory high water mark when the table resizes (when inserting more items). In my experience this is often what limits the number of items a table can hold, if you don't know the final size in advance.

In any case, I have found that there is not much point in considering a single specific benchmark. This particular case uses very small keys (one int). Does it measure how fast it is to iterate over the gash table, to delete an element, etc...

The best thing to do is to actually try different hash tables with your own use case. You may be surprised by the results (which don't always match posted benchmarks).

SlowPokeInTexas commented 5 years ago

"The best thing to do is to actually try different hash tables with your own use case. You may be surprised by the results (which don't always match posted benchmarks)."

I would agree with this wholeheartedly. Have multiple arrows in your quiver and be prepared to use the right one for a given project.

On Mon, Feb 11, 2019 at 8:57 AM Gregory Popovitch notifications@github.com wrote:

Hum, maybe I am misunderstanding the graphs, but sparsepp seems to have about the same memory usage as khash.

One thing where sparsepp shines is not the memory usage itself, but the memory high water mark when the table resizes (when inserting more items). In my experience this is often what limits the number of items a table can hold, if you don't know the final size in advance.

In any case, I have found that there is not much point in considering a single specific benchmark. This particular case uses very small keys (one int). Does it measure how fast it is to iterate over the gash table, to delete an element, etc...

The best thing to do is to actually try different hash tables with your own use case. You may be surprised by the results (which don't always match posted benchmarks).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/greg7mdp/sparsepp/issues/17#issuecomment-462357405, or mute the thread https://github.com/notifications/unsubscribe-auth/ASpl2MorQwtsDWV6lhOC4_lgTRrtwTARks5vMYTfgaJpZM4Kvvz5 .

-- "The best time to plant a tree is 20 years ago. The second best time is today."

murias commented 5 years ago

I've noticed you are shipping your own version of dlmalloc with sparsepp. In the light of other malloc implementations that might be installed system wide (like tcmalloc from gperftools), this seems kind of redundant to me. What influenced your decision to includ dlmalloc? Did you encounter bad performance with the default system malloc? And finally, in your benchmarks, shouldn't all maps use the same allocator to get a more "fair" comparison?

greg7mdp commented 5 years ago

Indeed I ship a dlmalloc based allocator because the default allocator on windows sometimes has a very high memory usage with sparsepp's pattern of allocation. There is some explanation in the README.

in your benchmarks, shouldn't all maps use the same allocator to get a more "fair" comparison?

They do. The custom allocator is not used by default. You have to define SPP_USE_SPP_ALLOC.

greg7mdp commented 3 years ago

Really appreciate the great feedback. Closing the issue as my focus is more on phmap now.