hosseinmoein / DataFrame

C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
https://hosseinmoein.github.io/DataFrame/
BSD 3-Clause "New" or "Revised" License

crashed but can't find where is the error #173

Closed xkungfu closed 2 years ago

xkungfu commented 2 years ago

Because the code is very complex, I have extracted the most closely related part:

        ULDataFrame resultdf;
        for(auto el : current_board_contains.keys())
        {
            std::string ticker_str = el.toStdString();

            auto functor1 = [ticker_str](const unsigned long &, const std::string &val)-> bool { return (val == ticker_str); };
            ULDataFrame tempdf = sec_realtime_df.get_data_by_sel<std::string, decltype(functor1), double, std::string>("ticker", functor1);

            resultdf = resultdf.concat<decltype(tempdf), double, std::string>(tempdf);
        }

The header of securities_realtime_df is:

INDEX:0:<ulong>,id:0:<double>,ticker:0:<string>,name:0:<string>,close_price:0:<double>,change_percentage:0:<double>,change_amount:0:<double>,turnover_vol:0:<double>,turnover_amount:0:<double>,amplitude:0:<double>,highest_price:0:<double>,lowest_price:0:<double>,open_price:0:<double>,last_close:0:<double>,turnover_vol_ratio:0:<double>,turnover_rate:0:<double>,ma5:0:<double>,ma10:0:<double>,ma20:0:<double>,ma30:0:<double>,ma60:0:<double>,trade_date:0:<string>

Qt debugger:


1  std::equal_to<hmdf::HeteroVector const *>::operator()                                                                                                                                                                                                                                                                                                                                                                                                                                                          stl_function.h                 356  0x7ff77c70e9bb 
2  std::__detail::_Equal_helper<hmdf::HeteroVector const *, std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>, std::__detail::_Select1st, std::equal_to<hmdf::HeteroVector const *>, unsigned long long, false>::_S_equals                                                                                                                                                                                                                                                                    hashtable_policy.h             1461 0x7ff77c770589 
3  std::__detail::_Hashtable_base<hmdf::HeteroVector const *, std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>, std::__detail::_Select1st, std::equal_to<hmdf::HeteroVector const *>, std::hash<hmdf::HeteroVector const *>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Hashtable_traits<false, false, true>>::_M_equals                                                                                                                        hashtable_policy.h             1834 0x7ff77c70dcd2 
4  std::_Hashtable<hmdf::HeteroVector const *, std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>, std::allocator<std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>>, std::__detail::_Select1st, std::equal_to<hmdf::HeteroVector const *>, std::hash<hmdf::HeteroVector const *>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::_M_find_before_node hashtable.h                    1545 0x7ff77c702a5b 
5  std::_Hashtable<hmdf::HeteroVector const *, std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>, std::allocator<std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>>, std::__detail::_Select1st, std::equal_to<hmdf::HeteroVector const *>, std::hash<hmdf::HeteroVector const *>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::_M_find_node        hashtable.h                    655  0x7ff77c702946 
6  std::_Hashtable<hmdf::HeteroVector const *, std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>, std::allocator<std::pair<hmdf::HeteroVector const * const, std::vector<std::string>>>, std::__detail::_Select1st, std::equal_to<hmdf::HeteroVector const *>, std::hash<hmdf::HeteroVector const *>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::find                hashtable.h                    1419 0x7ff77c713a1d 
7  std::unordered_map<hmdf::HeteroVector const *, std::vector<std::string>>::find                                                                                                                                                                                                                                                                                                                                                                                                                                 unordered_map.h                921  0x7ff77c728ae0 
8  hmdf::HeteroVector::get_vector<std::string>                                                                                                                                                                                                                                                                                                                                                                                                                                                                    HeteroVector.tcc               43   0x7ff77c668f9a 
9  hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::create_column<std::string>                                                                                                                                                                                                                                                                                                                                                                                                                                 DataFrame_set.tcc              63   0x7ff77c67290f 
10 hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::load_column<std::string>                                                                                                                                                                                                                                                                                                                                                                                                                                   DataFrame_set.tcc              452  0x7ff77c66fdf9 
11 hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::sel_load_functor_<unsigned long long, double, std::string>::operator()<std::vector<std::string>>                                                                                                                                                                                                                                                                                                                                                           DataFrame_misc.tcc             607  0x7ff77c67596d 
12 hmdf::HeteroVector::change_impl_help_<hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::sel_load_functor_<unsigned long long, double, std::string>&, std::string>                                                                                                                                                                                                                                                                                                                                            HeteroVector.tcc               174  0x7ff77c6f007f 
13 hmdf::HeteroVector::change_impl_<hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::sel_load_functor_<unsigned long long, double, std::string>&, hmdf::HeteroVector::type_list, double, std::string>                                                                                                                                                                                                                                                                                                          HeteroVector.tcc               221  0x7ff77c6efa3e 
14 hmdf::HeteroVector::change<hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::sel_load_functor_<unsigned long long, double, std::string>&>                                                                                                                                                                                                                                                                                                                                                                    HeteroVector.h                 196  0x7ff77c6f0620 
15 hmdf::DataFrame<unsigned long, hmdf::HeteroVector>::get_data_by_sel<double, BoardSnapshot::dynamic_update_snapshot(DongfancaifuRealtimData)::<lambda(long unsigned int const&, double const&)>, double, std::string>(const char *, BoardSnapshot::<lambda(long unsigned int const&, double const&)> &) const                                                                                                                                                                                                   DataFrame_get.tcc              637  0x7ff77c558764 
16 BoardSnapshot::dynamic_update_snapshot                                                                                                                                                                                                                                                                                                                                                                                                                                                                         boardsnapshot.cpp              226  0x7ff77c55473b 
17 BoardSnapshot::qt_static_metacall                                                                                                                                                                                                                                                                                                                                                                                                                                                                              moc_boardsnapshot.cpp          84   0x7ff77c4f3ec1 
18 void doActivate<false>(QObject *, int, void * *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   0x68b978bb     
19 MainWindow::update_boardsnapshot_frommainwindow                                                                                                                                                                                                                                                                                                                                                                                                                                                                moc_mainwindow.cpp             154  0x7ff77c4f4b30 
20 MainWindow::qt_static_metacall                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 moc_mainwindow.cpp             83   0x7ff77c4f4886 
21 QObject::event(QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           0x68a91565     
22 QWidget::event(QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           0xdf40d0       
23 QMainWindow::event(QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       0xeebff3       
24 QApplicationPrivate::notify_helper(QObject *, QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            0xdb790e       
25 QApplication::notify(QObject *, QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          0xdbe3e3       
26 QCoreApplication::notifyInternal2(QObject *, QEvent *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             0x68a656ea     
27 QCoreApplicationPrivate::sendPostedEvents(QObject *, int, QThreadData *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                           0x68a6c745     
28 QWindowsGuiEventDispatcher::sendPostedEvents                                                                                                                                                                                                                                                                                                                                                                                                                                                                   qwindowsguieventdispatcher.cpp 80   0x6a90376e     
29 QEventDispatcherWin32::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)                                                                                                                                                                                                                                                                                                                                                                                                                                                                        0x68abc0b0     
30 QWindowsGuiEventDispatcher::processEvents                                                                                                                                                                                                                                                                                                                                                                                                                                                                      qwindowsguieventdispatcher.cpp 73   0x6a903755     
31 QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            0x68a64405     
32 QCoreApplication::exec()                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           0x68a6d765     
33 main                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           main.cpp                       22   0x7ff77c50454a 

This code runs in a worker thread and emits a signal when it finishes. The main thread updates the UI after receiving the signal.

The process repeats continuously. If I do nothing else, it works fine.

But if I create a new widget, the app sometimes crashes while the widget's data is being calculated. It does not crash every time.

Can you find any clues in the debugger output?

hosseinmoein commented 2 years ago

It would be close to impossible for me to debug a multithreaded app just by looking at the code. But I suspect you have not set the spin lock. See the multithreading section of the docs and look at set_lock().

xkungfu commented 2 years ago

Is the function below, from "HeteroVector.tcc", at risk of a stack overflow?

template<typename T>
std::vector<T> &HeteroVector::get_vector()  {

    auto    iter = vectors_<T>.find (this);

    // don't have it yet, so create functions for copying and destroying
    if (iter == vectors_<T>.end())  {
        clear_functions_.emplace_back (
            [](HeteroVector &hv) { vectors_<T>.erase(&hv); });

        // if someone copies me, they need to call each
        // copy_function and pass themself
        copy_functions_.emplace_back (
            [](const HeteroVector &from, HeteroVector &to)  {
                vectors_<T>[&to] = vectors_<T>[&from];
            });

        move_functions_.emplace_back (
            [](HeteroVector &from, HeteroVector &to)  {
                vectors_<T>[&to] = std::move(vectors_<T>[&from]);
            });

        iter = vectors_<T>.emplace (this, std::vector<T>()).first;
    }

    return (iter->second);
}

The function is called by the code below:

auto functor = [tickerstr](const unsigned long &, const std::string &val)-> bool { return (val == tickerstr); };
ULDataFrame ticker_new_df = realtimedf.get_data_by_sel<std::string, decltype(functor), double, std::string>("ticker", functor);

or: auto row = dailydf.get_data_by_idx<double, std::string>(std::vector<ULDataFrame::IndexType> { i });

The crashes often happen while running get_data_by_sel or get_data_by_idx.

So, please investigate "get_vector" in HeteroVector.tcc.

Thanks!

hosseinmoein commented 2 years ago

No there is no stack overflow. Your problem is described in my original message above. Did you read it?
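For readers following along: the get_vector() listing quoted above contains no recursion. It looks up `this` in a static, per-type map and creates an entry on first use. The sketch below reproduces only that storage pattern with a hypothetical `MiniHetero` class (not the library's HeteroVector) to show that separate instances end up in the same static table, which is why static data is involved even when threads use different objects.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the storage pattern; not the library's class.
class MiniHetero {
    // One table per element type T, shared by every MiniHetero instance.
    template<typename T>
    static std::unordered_map<const MiniHetero *, std::vector<T>> &storage() {
        static std::unordered_map<const MiniHetero *, std::vector<T>> table;
        return table;
    }

    // Callbacks that remove this instance from each table it was added to.
    std::vector<void (*)(MiniHetero &)> clear_functions_;

public:
    MiniHetero() = default;
    MiniHetero(const MiniHetero &) = delete;
    MiniHetero &operator=(const MiniHetero &) = delete;
    ~MiniHetero() { for (auto fn : clear_functions_) fn(*this); }

    // Returns this instance's vector of T, creating it on first use.
    template<typename T>
    std::vector<T> &get_vector() {
        auto &table = storage<T>();
        auto iter = table.find(this);
        if (iter == table.end()) {
            clear_functions_.push_back(
                [](MiniHetero &hv) { storage<T>().erase(&hv); });
            iter = table.emplace(this, std::vector<T>()).first;
        }
        return iter->second;
    }

    // Number of live instances that currently own a vector of T.
    template<typename T>
    static std::size_t instances_with() { return storage<T>().size(); }
};

// Two unrelated instances register themselves in the same static table.
std::size_t shared_table_demo() {
    MiniHetero a, b;
    a.get_vector<double>().push_back(1.5);
    b.get_vector<double>().push_back(2.5);
    return MiniHetero::instances_with<double>();
}
```

Because the table is a plain unordered map, two threads inserting into it at the same time without synchronization would be a data race, regardless of which instance each thread owns.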

xkungfu commented 2 years ago

> No there is no stack overflow. Your problem is described in my original message above. Did you read it?

Yes, I read your answer, and I now use only one background thread, set with the code below:

QThreadPool::globalInstance()->setMaxThreadCount(1);

I have two workflows. The first updates the UI:

1. Generate data in a background thread.
2. Pass the data to the main thread via a signal.
3. The main thread receives the data from the signal and triggers the slot function.
4. The slot function handles the data and updates the GUI.

These steps repeat continuously; the UI is updated every 5 seconds.

The second workflow creates a new UI:

When the mouse is clicked somewhere, a new window opens and shows some plots.

Both the sub-thread and the main thread handle DataFrames. They are not the same DataFrame; I think I have avoided sharing code or variables across threads, but I am not sure I wrote everything correctly.

I will investigate further.

Thank you very much!

hosseinmoein commented 2 years ago

I am repeating and clarifying what I said above. I will emphasize it in two points:

  1. You do not need to be single-threaded. You could be multithreaded and it should work perfectly. But you need to use set_lock(). See documentation for set_lock()
  2. Even if you don't use the same DataFrame in multiple threads, in a multithreaded environment you still need to use set_lock() because DataFrame has static data. Again, see the multithreading section of documentation and set_lock()

Documentation is your friend
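As a rough illustration of the second point, the sketch below uses hypothetical stand-ins (`DemoSpinLock` and `DemoFrame`, neither of which is the library's code) to show a class whose instances all touch the same static data. Each thread owns its own instance, yet the shared counter is only correct because a lock is registered before the threads start and removed after they have been joined.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical minimal spin lock; the library ships its own SpinLock.
class DemoSpinLock {
public:
    void lock() { while (flag_.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical class with static data shared by all of its instances.
class DemoFrame {
public:
    static void set_lock(DemoSpinLock *l) { lock_ = l; }
    static void remove_lock() { lock_ = nullptr; }

    // Updates the static counter under the registered lock, if there is one.
    void touch_static_data() {
        DemoSpinLock *l = lock_;
        if (l) l->lock();
        ++shared_counter_;
        if (l) l->unlock();
    }

    static long shared_counter() { return shared_counter_; }

private:
    static inline DemoSpinLock *lock_ = nullptr;
    static inline long shared_counter_ = 0;
};

// Runs two threads, each with its own DemoFrame, and returns the counter.
long run_two_threads(int iterations) {
    DemoSpinLock lock;
    DemoFrame::set_lock(&lock);     // register before any thread starts

    auto work = [iterations] {
        DemoFrame df;               // a separate instance per thread
        for (int i = 0; i < iterations; ++i)
            df.touch_static_data();
    };
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();

    DemoFrame::remove_lock();       // only after every thread has finished
    return DemoFrame::shared_counter();
}
```

If the lock were removed right after the threads were started, the increments would race even though no instance is shared.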

xkungfu commented 2 years ago

I got it. let me have a try about this. thanks!

xkungfu commented 2 years ago
RealtimeMonitor.h
-----
struct  DongfancaifuRealtimData  {
    DongfancaifuRealtimData(){};
    ~DongfancaifuRealtimData(){};
    QList<QPair<QString, double>> qpairlistdata;
    QMap<QString, ULDataFrame> qmapdata;
    bool empty = true;
};

RealtimeMonitor.cpp
-----
int RealtimeMonitor::worker() {
    for(int i=0; i< 20000; i++)
    {
        RealtimeData* RD = new RealtimeData();
        DongfancaifuRealtimData realtimedata = RD->dongfangcaifu_realtime_data_handler();

        emit update_boardsnapshot_fromrealmonitor(realtimedata);

        Sleep(5000);
    }
    return 0;
};

void RealtimeMonitor::dongfangcaifu_realtime_handler_2(bool cancle)
{
    //set only one thread:
    QThreadPool::globalInstance()->setMaxThreadCount(1);

    SpinLock                    lock;
    ULDataFrame::set_lock(&lock);
    m_future = QtConcurrent::run(this, &RealtimeMonitor::worker);
    ULDataFrame::remove_lock();
}

mainwindow.cpp:
-----
    BoardSnapshot *m_boardsnapshotwdt = new BoardSnapshot();

    connect(this, SIGNAL(update_boardsnapshot_frommainwindow(DongfancaifuRealtimData)), m_boardsnapshotwdt, SLOT(dynamic_update_snapshot(DongfancaifuRealtimData)));

void MainWindow::on_actionstart_2_triggered()
{
    realmonitor = new RealtimeMonitor();
    realmonitor->dongfangcaifu_realtime_handler_2(false);
    connect(realmonitor, SIGNAL(update_boardsnapshot_fromrealmonitor(DongfancaifuRealtimData)), this, SIGNAL(update_boardsnapshot_frommainwindow(DongfancaifuRealtimData)));
}

It seems set_lock doesn't work with QtConcurrent.

"BoardSnapshot" is the widget class that receives the DataFrame data and updates the UI. "dongfangcaifu_realtime_handler_2" is the function that starts the worker in a sub-thread, which generates the DataFrame data and sends it by signal.

set_lock was called before QtConcurrent::run, but the app still crashes as if it had not been added.

hosseinmoein commented 2 years ago

Of course it doesn't work. You are setting the lock and removing it immediately.

You need to learn how multithreaded programs in C++ work. How threads work. How mutexes work. What is the concept of a shared resource, ... If you just wing it, that's what you get

xkungfu commented 2 years ago

It looks very hard, but let me try. I learn something new from every answer of yours. Thanks!

xkungfu commented 2 years ago

The set_lock example in the documentation does not work for me. The error message is: terminate called without an active exception Assertion failed: 0, file /include/DataFrame/Utils/ThreadGranularity.h, line 124

hosseinmoein commented 2 years ago

The example for set_lock() works just fine every time. Multithreading is an above intermediate level programming skill. You need to learn C++ and multithreaded programming

xkungfu commented 2 years ago

Yes, I am now studying multithreading. A small question: can I use QMutex instead of set_lock?

xkungfu commented 2 years ago

It really does not work, have a look:

[screenshot: 20220302232604]

hosseinmoein commented 2 years ago

I don't know what you are doing there. It works for me and others every time.

To answer your above question, if you read the multithreading section of documentation, you will see you need two different protections:

  1. To protect internal static data of DataFrame, You must use set_lock() with SpinLock
  2. To protect a single instance of DataFrame in multiple threads, you can use any mutex you want
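
The second kind of protection in the list above can be sketched with a plain std::mutex. Here `GuardedColumn` is a hypothetical wrapper and the `std::vector<double>` inside it is only a stand-in for a DataFrame instance shared between threads; a QMutex or any other mutex could play the same role.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: one shared object plus the mutex that guards it.
struct GuardedColumn {
    std::mutex mtx;
    std::vector<double> data;   // stand-in for a shared DataFrame instance

    // Every access to the shared object takes the same mutex.
    void append(double v) {
        std::lock_guard<std::mutex> guard(mtx);
        data.push_back(v);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> guard(mtx);
        return data.size();
    }
};

// Two threads write to the same instance; the mutex serializes the writes.
std::size_t fill_from_two_threads(std::size_t per_thread) {
    GuardedColumn col;
    auto writer = [&col, per_thread] {
        for (std::size_t i = 0; i < per_thread; ++i)
            col.append(static_cast<double>(i));
    };
    std::thread t1(writer), t2(writer);
    t1.join();
    t2.join();
    return col.size();
}
```

This mutex only covers the one shared object. It does not replace set_lock(), which the maintainer describes as guarding the library's internal static data.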

xkungfu commented 2 years ago

This is very detailed and clear. I will work it out. Thank you very much!

xkungfu commented 2 years ago

This is very confusing: the two thread ids look the same but do not compare equal. Pay attention to the output: [screenshot: 20220303012127]

xkungfu commented 2 years ago

Changing if (thr_id == owner_)
to: if (std::hash<std::thread::id>{}(thr_id) == std::hash<std::thread::id>{}(owner_))

resolved the problem.

hosseinmoein commented 2 years ago

This is not standard C++. You are using a nonstandard compiler. See https://en.cppreference.com/w/cpp/thread/thread/id

You cannot depend on the hash value of the id, because two different ids may have the same hash value. I would change my compiler to a standard c++17 compiler
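
A minimal sketch of the standard behaviour being referred to: `std::thread::id` values are compared directly with `operator==`, while `std::hash<std::thread::id>` exists so that ids can be used as keys in hashed containers. The helper names below are made up for illustration.

```cpp
#include <cassert>
#include <thread>

// True when the calling thread is the recorded owner.
// The ids are compared directly; no hashing is involved.
bool is_owner(std::thread::id owner) {
    return std::this_thread::get_id() == owner;
}

// Runs is_owner() on a freshly started thread and reports what it saw.
bool checked_from_other_thread(std::thread::id owner) {
    bool result = true;
    std::thread t([&result, owner] { result = is_owner(owner); });
    t.join();
    return result;
}
```

On a conforming toolchain, a thread compares equal to its own id, and a second thread running at the same time compares unequal to it.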

xkungfu commented 2 years ago

My compiler: MinGW64V800r1 GCCV9.3.1 QT5.15.2

I also changed assert(0) to assert(0 && "[ThreadGranularity.h] error: thread id not equal");

That is probably of little use.

xkungfu commented 2 years ago

int RealtimeMonitor::worker() {
    for(int i=0; i< 20000; i++)
    {
        if(global.ISREALTIMETASKCANCALED == true)
        {
            global.ISREALTIMETASKCANCALED = false;
            break;
        }
        RealtimeData* RD = new RealtimeData();
        RealtimDataStruct realtimedata = RD->realtime_data_handler();
        emit update_boardsnapshot_fromrealmonitor(realtimedata);
        std::this_thread::sleep_for(std::chrono::seconds(5));
    }
    return 0;
};
        SpinLock lock;
        ULDataFrame::set_lock(&lock);
        std::thread* xx = new std::thread(&RealtimeMonitor::worker, this); 
        xx->detach();
        ULDataFrame::remove_lock();

I resolved the set_lock problem in the thread, but the app still crashes when DataFrames are being calculated in the sub-thread and the main thread at the same time.

According to the documentation, a DataFrame instance is also not thread-safe, but there is no example, so I am confused here.

How do I create an instance of DataFrame? And do I have to create two instances, one for the sub-thread and one for the main thread?

I tried using "StdDataFrame* xdf = new StdDataFrame()" as an instance, but some functions do not work this way, such as "dailydf = dailydf->concat<decltype(ticker_last_day_df), double, std::string>(ticker_last_day_df);".

So, can you show me how to create an instance of DataFrame?

Thanks!

hosseinmoein commented 2 years ago

You are setting the lock and then removing it immediately. You should only remove the lock at the end of the program when everything is done
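
One way to avoid removing the lock too early, especially when worker threads are started and the starting function returns, is to tie the lock's registration to an object that outlives the workers. The sketch below is an assumption-laden illustration: `ScopedStaticLock` and `FakeFrame` are hypothetical, and `FakeFrame` only mimics the static set_lock()/remove_lock() interface mentioned in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical guard: registers a lock on construction, and on destruction
// joins its workers before unregistering the lock.
template<typename Frame, typename Lock>
class ScopedStaticLock {
public:
    ScopedStaticLock() { Frame::set_lock(&lock_); }
    ScopedStaticLock(const ScopedStaticLock &) = delete;
    ScopedStaticLock &operator=(const ScopedStaticLock &) = delete;

    ~ScopedStaticLock() {
        for (auto &t : workers_)
            if (t.joinable()) t.join();   // wait for all work to finish
        Frame::remove_lock();             // only then remove the lock
    }

    // Starts a worker thread owned by this guard (joined, never detached).
    template<typename F>
    void spawn(F &&f) { workers_.emplace_back(std::forward<F>(f)); }

private:
    Lock lock_;
    std::vector<std::thread> workers_;
};

// Hypothetical frame type exposing the same static interface.
struct FakeFrame {
    static inline std::mutex *lock = nullptr;
    static void set_lock(std::mutex *l) { lock = l; }
    static void remove_lock() { lock = nullptr; }
};

// Checks that the worker saw a registered lock and that it is gone afterwards.
bool lock_registered_while_worker_runs() {
    std::atomic<bool> seen{false};
    {
        ScopedStaticLock<FakeFrame, std::mutex> session;
        session.spawn([&seen] { seen = (FakeFrame::lock != nullptr); });
    }   // destructor joins the worker, then removes the lock
    return seen && FakeFrame::lock == nullptr;
}
```

In a real application the guard would live for the whole run, for example in main() or in a long-lived owner object, rather than in the function that launches the worker.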

xkungfu commented 2 years ago

void dftest() {
    for(int i=0; i < 2; i++) {
        ULDataFrame source_df;
        std::vector<unsigned long> idx = { 123450, 123451, 123452, 123453, 123454, 123455, 123456 };
        std::vector<double> d1 = { 1, 2, 3, 4, 5, 6, 7 };
        std::vector<double> d2 = { 8, 9, 10, 11, 12, 13, 14 };
        std::vector<double> d3 = { 15, 16, 17, 18, 19, 20, 21 };
        std::vector<double> d4 = { 22, 23, 24, 25 };
        std::vector<std::string> s1 = { "11", "22", "33", "ee", "ff", "gg", "ll" };

        source_df.load_data(std::move(idx),
                     std::make_pair("col_1", d1),
                     std::make_pair("col_2", d2),
                     std::make_pair("col_3", d3),
                     std::make_pair("col_str", s1));
        source_df.load_column("col_4", std::move(d4), nan_policy::dont_pad_with_nans);

        QList<QString> contains_qvec = {"11", "33", "ff"};

        ULDataFrame resultdf;
        for(auto el : contains_qvec)
        {
            std::string str_value = el.toStdString();

            ULDataFrame tempdf;
            auto functor1 = [str_value](const unsigned long &, const std::string &val)-> bool { return (val == str_value); };
            try  {
                tempdf = source_df.get_data_by_sel<std::string, decltype(functor1), double, std::string>("col_str", functor1);
            } catch (const DataFrameError &ex)  {
                std::cout << "[main>>dftest] [WARNING] get_data_by_sel ERROR! e.what(): " << ex.what() << std::endl;
                return;
            }

            resultdf = resultdf.concat<decltype(tempdf), double, std::string>(tempdf);
        }

        resultdf.sort<double, double, std::string>("col_1", sort_spec::desce);

        unsigned long size2 = resultdf.get_index().size();
        std::vector<unsigned long> ulvec(size2);
        unsigned long index = 0;
        std::generate(ulvec.begin(), ulvec.end(), [&]{ return index++; });

        resultdf.load_column("long_col", std::move(ulvec), nan_policy::dont_pad_with_nans);

        resultdf = resultdf.get_reindexed<unsigned long, double, std::string>("long_col", "OLD_IDX");

        std::this_thread::sleep_for(std::chrono::seconds(5));
    }
    return;
}

int main(int argc, char *argv[]) {
    SpinLock lock;
    ULDataFrame::set_lock(&lock);
    std::thread t1 = std::thread(dftest); // works normally
    t1.join();
    ULDataFrame::remove_lock();
}

Some functions do not work once the lock is set, such as get_data_by_sel and get_reindexed. When execution reaches these functions, the app hangs and does nothing, but it does not crash.

Did I miss something in the documentation again?

Please have a look. Thanks!

hosseinmoein commented 2 years ago

I am closing this ticket. You need to learn programming in general and C++ in particular. I cannot help you anymore on this ticket

xkungfu commented 2 years ago

Very sorry. I learned a lot from this and will continue to study; thank you very much.

For now, I have temporarily changed if (thr_id == owner_) to if (std::hash<std::thread::id>{}(thr_id) == std::hash<std::thread::id>{}(owner_)).

I will investigate this further later.

xkungfu commented 2 years ago

@hosseinmoein After debugging for two days, I finally found that this problem is caused by a conflict between PostgreSQL and MinGW64. The thread id comparison operator stops working after PostgreSQL is loaded. Changing find_package to find_library resolved the problem. Thanks again!