ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

How to use tkdnn with multithread? #208

Closed mochechan closed 2 years ago

mochechan commented 3 years ago

I could not get tkDNN to work in a multithreaded program: it crashes with a segmentation fault. Using gdb, I traced the crash to the call to detNN->update.

detNN->update(batch_dnn_input,1,false, &times, false); // not support std::async

It still fails even when the call is guarded with a unique_lock.

std::unique_lock<std::mutex> lck (mtx,std::defer_lock);
lck.lock();
detNN->update(batch_dnn_input,1,false, &times, false); // not support std::async
lck.unlock();

gdb shows the following output.

Thread 36 "y4rspl" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fc3bffff700 (LWP 210105)]
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:257
257 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) where
#0  0x00007fc4eccc6a5f in __memmove_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:257
#1  0x00007fc4f18c45e3 in tk::dnn::Yolo3Detection::preprocess(cv::Mat&, int) (this=0x55fca6bda600 <yolo>, frame=..., bi=0) at /sdd1/workout/tk/tkDNN/src/Yolo3Detection.cpp:88
#2  0x000055fca689afdf in tk::dnn::DetectionNN::update(std::vector<cv::Mat, std::allocator<cv::Mat> >&, int, bool, std::basic_ofstream<char, std::char_traits<char> >*, bool) (this=0x55fca6bda600 <yolo>, frames=std::vector of length 1, capacity 1 = {...}, cur_batches=1, save_times=false, times=0x55fca6bdc8c0 <times>, mAP=false) at /sdd1/workout/tk/tkDNN/include/tkDNN/DetectionNN.h:114

How can I use tkDNN with multiple threads or std::async?

mive93 commented 3 years ago

Hi @mochechan, are you using different batches in the different threads? Also, is it the same mutex in all the threads? With the lock in place, I don't see why it shouldn't work. In any case, this should not segfault, but one thread could overwrite the results of another. You should copy the detections before unlocking.
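
The two points above (one mutex shared by all threads, and copying the detections before unlocking) can be sketched roughly as follows. This is only an illustration, not tkDNN code: `FakeDetector`, `det_mtx`, `infer` and `run_all` are hypothetical names, and `FakeDetector::update` merely stands in for a detector that overwrites its own result buffer on every call. It does not reproduce or diagnose the crash in the backtrace.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a detector (NOT the tkDNN API): update()
// overwrites an internal result buffer that is reused on every call.
struct FakeDetector {
    std::vector<int> detections;         // shared storage, overwritten each call
    void update(int frame_id) {
        detections.assign(3, frame_id);  // pretend these are 3 boxes for this frame
    }
};

// A single mutex shared by every thread that touches the detector.
// A mutex created separately in each thread would protect nothing.
std::mutex det_mtx;

// Run inference and copy the results out while the lock is still held.
std::vector<int> infer(FakeDetector& det, int frame_id) {
    std::lock_guard<std::mutex> lock(det_mtx);
    det.update(frame_id);
    return det.detections;  // the copy is made before the lock is released
}

// Launch one std::async task per frame and collect each task's private copy.
std::vector<std::vector<int>> run_all(int n_frames) {
    assert(n_frames >= 0);
    FakeDetector det;
    std::vector<std::future<std::vector<int>>> jobs;
    for (int i = 0; i < n_frames; ++i) {
        jobs.push_back(std::async(std::launch::async,
                                  [&det, i] { return infer(det, i); }));
    }
    std::vector<std::vector<int>> out;
    for (auto& job : jobs) {
        out.push_back(job.get());  // blocks until that task has finished
    }
    return out;
}
```

Calling `run_all(8)` should return eight vectors, each holding only the values written for its own frame, because every task copies the shared buffer before another task is allowed to overwrite it.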

mive93 commented 2 years ago

Closing for inactivity. Feel free to reopen.