Open MinhazPalasara opened 6 years ago
@jgong5 @hshen14 any ideas on this issue?
Is it supposed to be safe to use Intel Caffe w/ MKLDNN in a multi-threaded application? We did not find any docs definitively saying one way or the other. Any pointers you can give us would be helpful.
@matt-ny The problem most likely lies in the global stream handler, which is now a singleton.
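A common way to work around a shared singleton that is not thread-safe is to serialize every call that reaches it behind one mutex. The sketch below illustrates only that pattern. `SharedStream`, `StubClassifier`, `g_inference_mutex`, and `run_serialized` are hypothetical names invented for illustration, not Intel Caffe or MKLDNN types, and nothing in this thread confirms that the approach resolves the crash.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a process-wide singleton such as the stream
// handler mentioned above. This is NOT the Intel Caffe type.
struct SharedStream {
    long submitted = 0;  // unsynchronized state shared by all callers

    static SharedStream& instance() {
        static SharedStream s;  // initialization is thread-safe since C++11
        return s;
    }

    void submit() { ++submitted; }  // not safe to call concurrently
};

// Hypothetical classifier: each instance is private to one thread, but
// every call still goes through the shared singleton.
struct StubClassifier {
    void Classify() { SharedStream::instance().submit(); }
};

std::mutex g_inference_mutex;  // one lock guarding all inference calls

// Starts `threads` workers, each with its own classifier, and serializes
// the calls that reach the singleton. Returns the total submission count.
long run_serialized(int threads, int calls_per_thread) {
    SharedStream::instance().submitted = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([calls_per_thread] {
            StubClassifier classifier;  // per-thread instance
            for (int i = 0; i < calls_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(g_inference_mutex);
                classifier.Classify();
            }
        });
    }
    for (auto& worker : pool) worker.join();
    return SharedStream::instance().submitted;
}
```

The trade-off is that inference no longer overlaps across threads, so this buys safety at the cost of parallel throughput.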
Thank you @jgong5 for your answer. However, I'm not sure I understand: are you saying that with the latest MKLDNN code, we should not invoke Classifier.Classify(img); in multiple threads, even for different classifiers?
Could you share a link to the definition in the code of the global stream handler singleton you mentioned?
Finally, is this also true of Intel Caffe + MKL2017? In our experience, BVLC Caffe w/ MKL has not had any problem with re-entrancy as long as the Net objects were different. Are there design docs that specify the differences between BVLC / Intel Caffe + MKL / Intel Caffe + MKLDNN?
Thanks so much!
@matt-ny Please check "src/caffe/mkldnn_base.cpp". I am not sure about MKL2017 though.
@jgong5 I may have run into the same problem. I compiled MKLDNN with MKLDNN_ENABLE_CONCURRENT_EXEC = ON.
======= Backtrace: =========
/lib64/libc.so.6(+0x75366)[0x7f6fe2f2d366]
/home/xxx/intelcaffe/caffe/external/mkldnn/install/lib/libmkldnn.so.0(+0x62d9a)[0x7f6fe691bd9a]
/home/xxx/intelcaffe/caffe/external/mkldnn/install/lib/libmkldnn.so.0(+0x619ef)[0x7f6fe691a9ef]
/home/xxx/intelcaffe/caffe/external/mkldnn/install/lib/libmkldnn.so.0(mkldnn_stream_submit+0xe0)[0x7f6fe691ab30]
/home/xxx/intelcaffe/caffe/.build_release/examples/cpp_classification/../../lib/libcaffe.so.1.1.2(_ZN5caffe15MKLDNNPrimitiveIfE6submitEv+0x4f7)[0x7f6fe65011d7]
So is this caused by the same issue? Can you tell me how I can fix it? Thanks.
Hi,
I have been trying to use Intel Caffe for a project that serves a deep model to multiple users in parallel. We have multiple copies of the model loaded from the same .caffemodel file, in case it matters. Each copy is used by a single thread at a time. I have been getting a segmentation fault with this setup.
To reproduce this issue and rule out any problems with my own project, I modified examples/cpp_classification/classification.cpp. I have two threads that create separate network instances and run classification on an image in parallel. Intel Caffe is compiled for a single node with the MKLDNN engine.
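The setup described above has roughly the following shape. This is a minimal sketch, not the actual modified classification.cpp: `StubClassifier` is a hypothetical, simplified stand-in for the example's Classifier class (the real one takes more constructor arguments and operates on images), and `classify_in_parallel` is an illustrative helper name.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for the Classifier class in
// classification.cpp; it only echoes its inputs.
struct StubClassifier {
    explicit StubClassifier(const std::string& model) : model_(model) {}

    std::string Classify(const std::string& image) const {
        return model_ + ":" + image;
    }

    std::string model_;
};

// Starts one thread per image. Each thread builds its own classifier from
// the same model file and classifies one image, so no classifier instance
// is shared between threads.
template <typename ClassifierT>
std::vector<std::string> classify_in_parallel(
        const std::string& model_file,
        const std::vector<std::string>& images) {
    std::vector<std::string> results(images.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < images.size(); ++i) {
        workers.emplace_back([&, i] {
            ClassifierT classifier(model_file);            // per-thread instance
            results[i] = classifier.Classify(images[i]);   // distinct slot per thread
        });
    }
    for (auto& worker : workers) worker.join();
    return results;
}
```

With the stub, `classify_in_parallel<StubClassifier>("model", {"a.jpg", "b.jpg"})` completes without sharing any state between threads. The reported crash arises when the real classifier is used in this arrangement.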
On running classification I am getting.
Maybe I am missing some design details; any help would be appreciated.