BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Segfault on OpenCV 3.3 libopencv_dnn #5968

Open halt9 opened 6 years ago

halt9 commented 6 years ago

Using Ubuntu 16.04, OpenCV 3.3, CUDA 8.0.

While running make runtest, I found that the caffe command itself segfaults. I rebuilt with DEBUG enabled, ran caffe under gdb, and got:

Thread 1 "caffe" received signal SIGSEGV, Segmentation fault.
0x00007ffff6686a66 in google::protobuf::Arena::AllocateAligned(std::type_info const*, unsigned long) () from /usr/local/lib/libopencv_dnn.so.3.3
(gdb)

I can't tell which line causes this. From what I found while searching, it doesn't seem to be an OpenCV issue specifically, but I may be wrong. Does anyone know of a fix? With some guidance I can also provide more information.

jukkaho commented 6 years ago

I have seen this problem before with the OpenCV 3.2 contrib DNN module, which was moved into the main distribution during OpenCV 3.3 development.

AFAIK, the problem comes from initializing the same protobuf object in different libraries (or in the main program and a library). The OpenCV DNN component supports loading Caffe models, so it contains code that uses Google's Protobuf to parse Caffe's protobuf network definitions. When the same parser is initialized more than once for the same protobuf object (once in OpenCV, once in Caffe), the protobuf library itself crashes. This was a known limitation of the Protobuf library the last time I checked.
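One way to check whether a library carries its own protobuf code, as the gdb frame inside libopencv_dnn.so.3.3 suggests, is to count the protobuf symbols it exports. This is only a rough sketch: the helper name protobuf_symbols is made up for illustration, and it assumes binutils (nm, c++filt) is installed and that the library sits at the path shown in the backtrace.

```shell
# Hypothetical helper: count exported symbols in the google::protobuf
# namespace for one shared library. Prints 0 if none are found or if
# the file cannot be read.
protobuf_symbols() {
    nm -D --defined-only "$1" 2>/dev/null | c++filt | grep -c 'google::protobuf'
}

# Path taken from the gdb output in this thread; adjust if yours differs.
lib=/usr/local/lib/libopencv_dnn.so.3.3
if [ -e "$lib" ]; then
    echo "$lib exports $(protobuf_symbols "$lib") protobuf symbols"
fi
```

A non-zero count alongside a separately loaded libprotobuf would be consistent with the double-initialization explanation above, though it does not prove it.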

Previously you could simply leave the contrib DNN module out and avoid the whole issue. I don't know of an easy way to disable it in OpenCV 3.3.

The easiest workaround might be to use OpenCV 3.2 and leave the contrib DNN module out.

halt9 commented 6 years ago

That's a shame. Is there any other way around this that doesn't involve reinstalling opencv? I have too many other things that depend on it.

jukkaho commented 6 years ago

At least I don't see a way without recompiling some version of OpenCV. Of course, if you disable the OpenCV DNN module at the source level (which may or may not be simple; I haven't tried), you can just reinstall it, and it should remain binary compatible with everything you have already linked against that version of the library, as long as nothing depends on DNN functionality.
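To get a rough idea of whether anything already installed depends on the DNN library, you could list the shared libraries each binary resolves. A minimal sketch, assuming a Linux system with ldd; the helper name needs_dnn and the example path are hypothetical.

```shell
# Hypothetical helper: succeeds if the given binary or shared object
# resolves libopencv_dnn, directly or through another library.
needs_dnn() {
    ldd "$1" 2>/dev/null | grep -q 'libopencv_dnn'
}

# Example path only; replace it with the binaries you care about.
for bin in /usr/local/bin/caffe; do
    if [ -e "$bin" ] && needs_dnn "$bin"; then
        echo "$bin loads libopencv_dnn"
    fi
done
```

Note that ldd also reports indirect dependencies, so a match does not necessarily mean the program calls any DNN functions itself.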

zingmars commented 6 years ago

> Of course, if you disable OpenCV DNN module on source level (which may or may not be a simple thing, I haven't tried to do that)

It should be a simple matter of adding -DBUILD_opencv_dnn="no" to your cmake command. Oddly enough, DNN does not appear in cmake's module list, but I tried it, and with the flag set the dnn library is nowhere to be found.
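As a rough sketch, the reconfigure step might look like the following. The source path and build directory are hypothetical, and any other options you normally pass to cmake would need to be added. OFF is used here as the conventional spelling; CMake treats it and "no" as the same false value.

```shell
# Sketch only: OPENCV_SRC and the build directory name are hypothetical.
OPENCV_SRC="${OPENCV_SRC:-$HOME/src/opencv}"
BUILD_DIR="$OPENCV_SRC/build"

# The flag from this comment; OFF and "no" are both false values to CMake.
DNN_FLAG="-DBUILD_opencv_dnn=OFF"

if [ -d "$OPENCV_SRC" ]; then
    mkdir -p "$BUILD_DIR"
    # Re-run the configure step with your usual options plus the flag.
    (cd "$BUILD_DIR" && cmake "$DNN_FLAG" "$OPENCV_SRC")
    # Then rebuild and reinstall as you normally would, for example:
    #   make -j"$(nproc)" && sudo make install
else
    echo "OpenCV source tree not found at $OPENCV_SRC" >&2
fi
```

One thing worth checking afterwards: reinstalling probably does not delete a libopencv_dnn.so left over from the earlier install, so the old file may still be on disk and may need removing by hand.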

There's also a modern-dnn module in contrib now that wraps tiny-dnn (which also seems to use Protobuf). Could it cause a similar issue?