martinus / robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
https://gitter.im/martinus/robin-hood-hashing
MIT License
1.5k stars 142 forks source link

Segmentation fault: 11 with std::this_thread::get_id(); as key in C++ #150

Closed ksahlin closed 2 years ago

ksahlin commented 2 years ago

Hi,

Thanks for a great implementation. I've been using it with success in my projects.

I may have found a bug (Segmentation fault) which happens when inserting std::this_thread::get_id() as key in a robin_hood::unordered_map. It only failes for some runs, but never fails when I switch robin_hood::unordered_map to std::unordered_map. This leads me to believe that there are some thread_ids that cause an unexpected address lookup in the robin_hood version.

Some relevant lines of code are below.

#include <thread>
#include "robin_hood.h"

void some_func(robin_hood::unordered_map<std::thread::id, logging_variables> &log_stats_vec){
    // work
    auto thread_id = std::this_thread::get_id();
    if (log_stats_vec.find(thread_id) == log_stats_vec.end()) { //  Not initialized
        logging_variables log_vars;
        log_stats_vec[thread_id] = log_vars;  <- this initialization fails (BAD_ACCCES in debug mode)
    }
    //  more work
}
slavenf commented 2 years ago

This is not thread safe library. You are writing from multiple threads in log_stats_vec.

martinus commented 2 years ago

As @slavenf says, if you call some_func from multiple threads without guarding with a mutex you have a problem. Note that std::unordered_map is also not thread safe.

ksahlin commented 2 years ago

Yes, that makes perfect sense. I have been struggling with a multithreading buserror for quite some time. Initially had vectors with logging stats per position in the vector (as hinted from variable names) then changed to hash tables in some desperate attempt.

Thanks for the quick reply!

ksahlin commented 2 years ago

Although, @martinus - if I:

(i) reserve enough slots in the table to prevent resizing of the table before threads start working, and (ii) each thread writes to its respective slot (which is the idea to remove the overhead of locks/critical sections),

would (i) and (ii) still pose a problem even if the implementation is not thread-safe? It is sort of discussed here (but in the context of writing to different slots in an array). In the attempt that generated my reported error, I have a hash table of thread_id : struct as key-value pairs.

martinus commented 2 years ago

that won't help because elements are shuffled around each time you insert/remove. Use a mutex.