Closed mdumitrean closed 4 months ago
First, I am not sure what you are doing with your thread-pool. DataFrame was meant to work only with its own thread-pool. Second, I don't know what kind of machine/computer you have. But you are creating 264k threads. You must have a NASA-sized machine to tolerate that many threads :-)
Please read the DataFrame documentation from beginning to end, with emphasis on the multithreading section; it is not that long.
After that, if you still have a problem, send me the stack trace. I might be able to give you a hint.
I did read the documentation. I must have missed something or am not understanding something. These are 24 independent threads created by my threadpool. When I am debugging this I see only the 24 worker threads that I’m launching, but I am creating 11k individual dataframes; I have not enabled multithreaded behavior inside any dataframe object. The threads represent database connections reading data (24 connections at once). The backtrace is long and deep inside DataFrame. The code should compile and crash. Confirmed with g++-14 and g++-13.
Best, Marius
I've attached the backtrace and the 24 threads being created.
The ThreadPool class is included for convenience; it is the widely used one from here: https://github.com/progschj/ThreadPool
OK, I suggest you read the multithreading section in the documentation again. You are using DataFrame in a multithreaded environment. You must call set_lock() and provide a spin lock for DataFrame. Read the docs and look at the code samples provided.
Dear Hossein, I guess I didn't provide the spin lock version of the code in the example. I had read about it and tried adding a spin lock in the code. Calling set_lock with a spin lock around the data load into the dataframe didn't help at all.
Best, Marius
What I can surmise from the stack trace you provided is that the threads are overwriting each other in the static members of the hetero vectors.
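To picture the point about static members, here is a minimal sketch. `ToyFrame`, `add_value`, `shared_count` and `fill_from_threads` are hypothetical names made up for illustration; this is not DataFrame's real layout, only an assumption about the general shape of the problem. Each thread owns its own object, yet every object writes into one static vector, so the writes have to be serialized.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical container, NOT DataFrame's actual implementation.
// Every instance appends to one static vector shared by the whole class.
class ToyFrame {
public:
    void add_value(int v) {
        // Without this guard, concurrent push_back calls on the shared
        // vector would be a data race (undefined behavior).
        std::lock_guard<std::mutex> guard(shared_mutex_);
        shared_values_.push_back(v);
    }

    static std::size_t shared_count() {
        std::lock_guard<std::mutex> guard(shared_mutex_);
        return shared_values_.size();
    }

private:
    static std::vector<int> shared_values_;
    static std::mutex shared_mutex_;
};

std::vector<int> ToyFrame::shared_values_;
std::mutex ToyFrame::shared_mutex_;

// Launch n_threads workers; each one builds its OWN ToyFrame object,
// but all of them end up touching the same static storage.
std::size_t fill_from_threads(std::size_t n_threads, std::size_t per_thread) {
    assert(n_threads > 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([per_thread] {
            ToyFrame frame;  // one object per thread
            for (std::size_t i = 0; i < per_thread; ++i) {
                frame.add_value(static_cast<int>(i));
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return ToyFrame::shared_count();
}
```

Calling `fill_from_threads(8, 1000)` once leaves 8000 entries in the shared vector, even though no two threads ever touched the same `ToyFrame` object.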
Maybe provide a snippet of the code where you set the lock and how you define the lock.
class MinuteObservationFetch {
private:
    hmdf::SpinLock spinLock;
public:
    void createRandom(string t) {
        ....
        hmdf::StdDataFrame<uint64_t>::set_lock(&spinLock);
        MinuteObservationFetchTicker ticker(std::move(t));
        ticker.loadData(std::move(id_vec), std::move(name_vec), std::move(unix_vec), std::move(date_vec), std::move(symbol_vec),
                        std::move(open_vec), std::move(high_vec), std::move(low_vec), std::move(close_vec), std::move(volume_vec),
                        std::move(volume_usd_vec));
        insertInMap(t, std::move(ticker));
        hmdf::StdDataFrame<uint64_t>::remove_lock();
        ....
Makes no difference. Tried many variations.
Oh wow.
I think I got it. It's finally no longer dying. The lock must not be set around each individual dataframe creation and write; it has to encompass the entire thread creation.
void createRandom(string t) {
    vector<uint64_t> id_vec;
    vector<string> name_vec;
    vector<uint64_t> unix_vec;
    vector<string> date_vec;
    vector<string> symbol_vec;
    vector<double> open_vec;
    vector<double> high_vec;
    vector<double> low_vec;
    vector<double> close_vec;
    vector<double> volume_vec;
    vector<double> volume_usd_vec;
    for (int j = 0; j < 20000; j++) {
        id_vec.push_back(j);
        name_vec.emplace_back("NAME" + to_string(j));
        unix_vec.push_back(j);
        date_vec.emplace_back("DATE" + to_string(j));
        symbol_vec.emplace_back("SYMBOL" + to_string(j));
        open_vec.push_back(j);
        high_vec.push_back(j);
        low_vec.push_back(j);
        close_vec.push_back(j);
        volume_vec.push_back(j);
        volume_usd_vec.push_back(j);
    }
    MinuteObservationFetchTicker ticker(std::move(t));
    ticker.loadData(std::move(id_vec), std::move(name_vec), std::move(unix_vec), std::move(date_vec), std::move(symbol_vec),
                    std::move(open_vec), std::move(high_vec), std::move(low_vec), std::move(close_vec), std::move(volume_vec),
                    std::move(volume_usd_vec));
    insertInMap(t, std::move(ticker));
}

void test1() {
    cout << "Test1" << endl;
    MinuteObservationFetch fetch;
    hmdf::SpinLock spinLock;
    hmdf::StdDataFrame<uint64_t>::set_lock(&spinLock);
    fetch.populate(128);
    hmdf::StdDataFrame<uint64_t>::remove_lock();
    cout << "Test1 done" << endl;
}
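A rough sketch of why the placement of the lock calls might matter, under the assumption that set_lock()/remove_lock() install and clear one pointer shared by the whole type (they are called as static members above). If every worker sets and removes that pointer itself, one worker's remove can clear it while another worker is still loading data. `ToySpinLock`, `ToyFrame`, `ScopedFrameLock` and `populate` below are hypothetical stand-ins, not hmdf code; the sketch only shows the install-once, remove-after-join pattern and does not claim to resolve the crashes discussed here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical spin lock; not the hmdf::SpinLock implementation.
class ToySpinLock {
public:
    void lock() noexcept {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // back off while another thread holds it
        }
    }
    void unlock() noexcept { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical stand-in with a type-wide lock pointer and shared storage.
class ToyFrame {
public:
    static void set_lock(ToySpinLock* l) { lock_ = l; }
    static void remove_lock() { lock_ = nullptr; }

    void append(int v) {
        ToySpinLock* l = lock_;  // whatever is installed right now
        if (l) l->lock();
        shared_.push_back(v);
        if (l) l->unlock();
    }
    static std::size_t shared_size() { return shared_.size(); }

private:
    static ToySpinLock* lock_;
    static std::vector<int> shared_;
};

ToySpinLock* ToyFrame::lock_ = nullptr;
std::vector<int> ToyFrame::shared_;

// Installs the lock on construction and removes it on destruction.
class ScopedFrameLock {
public:
    explicit ScopedFrameLock(ToySpinLock& l) { ToyFrame::set_lock(&l); }
    ~ScopedFrameLock() { ToyFrame::remove_lock(); }
    ScopedFrameLock(const ScopedFrameLock&) = delete;
    ScopedFrameLock& operator=(const ScopedFrameLock&) = delete;
};

std::size_t populate(std::size_t n_threads, std::size_t per_thread) {
    assert(n_threads > 0);
    ToySpinLock lock;
    ScopedFrameLock installed(lock);  // set once, before any worker exists
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([per_thread] {
            ToyFrame frame;
            for (std::size_t i = 0; i < per_thread; ++i) {
                frame.append(static_cast<int>(i));
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // every worker finishes while the lock is still installed
    }
    return ToyFrame::shared_size();
}  // the lock is removed here, after all workers have joined
```

The guard object is only a convenience: it keeps the remove call from being skipped on an early return, and it makes the lifetime of the installed lock visibly wider than the lifetime of the workers.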
I spoke too soon. The dataframe still eventually got corrupted (just a lot less often). I have gone ahead and used plain vectors instead and have no more issues. I wish I could have used the DataFrame library, but it's just too sensitive, and I couldn't figure out how to get it to work right in a high-performance, many-thread environment (128 threads).
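For readers wondering what the plain-vector fallback could look like, here is a minimal sketch under my own assumptions. `MinuteBars`, `BarStore`, `make_bars` and `populate_store` are hypothetical names, and only three of the columns from the code above are kept for brevity. Each worker builds its vectors locally, and the only shared state is the map, which is guarded by a `std::mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical plain-vector record holding a few of the columns.
struct MinuteBars {
    std::vector<std::uint64_t> id;
    std::vector<double> open;
    std::vector<double> close;
};

// Map from symbol to bars; the mutex guards every access to the map.
class BarStore {
public:
    void insert(std::string symbol, MinuteBars bars) {
        std::lock_guard<std::mutex> guard(mutex_);
        map_.emplace(std::move(symbol), std::move(bars));
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, MinuteBars> map_;
};

// Builds one record using only thread-local data.
MinuteBars make_bars(std::size_t rows) {
    MinuteBars bars;
    for (std::size_t j = 0; j < rows; ++j) {
        bars.id.push_back(j);
        bars.open.push_back(static_cast<double>(j));
        bars.close.push_back(static_cast<double>(j));
    }
    return bars;
}

std::size_t populate_store(std::size_t n_threads, std::size_t per_thread,
                           std::size_t rows) {
    assert(n_threads > 0);
    BarStore store;
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([&store, t, per_thread, rows] {
            for (std::size_t i = 0; i < per_thread; ++i) {
                // Unique key per record; nothing is shared until insert().
                std::string symbol = "SYM" + std::to_string(t * per_thread + i);
                store.insert(std::move(symbol), make_bars(rows));
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return store.size();
}
```

With 4 threads inserting 25 records each, `populate_store(4, 25, 100)` ends with 100 entries in the map.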
If you happen to have any of those stack traces, please post them here.
I am creating 11000 DataFrames. The dataframes are populated by a threadpool of 24 threads and stored in a dictionary.
I am getting a SIGSEGV or an abort deep inside DataFrame. I don't understand what's going on here and would really appreciate it if you could help me out.
DataFrameConcurrencyTests.txt