Closed EileenSchreiber closed 6 years ago
This seems to be an OpenCV GPU usage issue, not a Caffe one. I searched online and found this thread. Try setting the following environment variable to an empty value:
OPENCV_OPENCL_RUNTIME=
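If training is launched from a Python script rather than directly from a shell, the same setting can be applied to the child process only. This is a minimal sketch: only the variable name comes from the advice above, and the helper names and the command in the comment are hypothetical.

```python
import os
import subprocess


def env_without_opencl(base_env=None):
    """Return a copy of the environment with OPENCV_OPENCL_RUNTIME set to empty."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENCV_OPENCL_RUNTIME"] = ""
    return env


def run_with_opencl_disabled(command):
    """Run `command` (a list of strings) with the modified environment."""
    return subprocess.run(command, env=env_without_opencl(), check=False)


# Hypothetical usage; substitute the actual training command and paths:
# run_with_opencl_disabled(["./build/tools/caffe", "train", "--solver=solver.prototxt"])
```

Copying the environment instead of editing os.environ keeps the change local to the one process being launched.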
If that doesn't work, try running make runtest after make all. This will verify that OpenCV 3 was compiled correctly with Caffe.
Overall, this is an OpenCV issue. Incompatibilities between OpenCV and Caffe can get very complicated. Try different versions of OpenCV and cuDNN, and run make clean before you run make again.
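The rebuild order suggested here (make clean, then make all, then make runtest) can be scripted so that a failing step stops the sequence. A rough sketch, assuming it is pointed at the Caffe source directory; the function name and the dry_run flag are made up for illustration.

```python
import subprocess

# The three make targets mentioned in this thread, in the order they should run.
REBUILD_STEPS = [
    ["make", "clean"],
    ["make", "all"],
    ["make", "runtest"],
]


def rebuild(caffe_dir, dry_run=False):
    """Run the rebuild steps in order inside caffe_dir.

    A failing step raises subprocess.CalledProcessError, so later steps
    are skipped. Returns the list of steps that were run (or that would
    be run when dry_run is True).
    """
    done = []
    for step in REBUILD_STEPS:
        if not dry_run:
            subprocess.run(step, cwd=caffe_dir, check=True)
        done.append(step)
    return done
```

Calling rebuild with dry_run=True only lists the steps, which is a cheap way to check the order before starting a long build.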
Thank you for your fast reply. I tried running make runtest, but now I get a lot of different errors.
No.1:
[----------] 5 tests from ImageDataLayerTest/1, where TypeParam = caffe::CPUDevice
No.2:
[----------] 1 test from LayerFactoryTest/1, where TypeParam = caffe::CPUDevice
No.3:
[----------] 5 tests from MemoryDataLayerTest/0, where TypeParam = caffe::CPUDevice
No.4:
[----------] 1 test from HardSpatialTransformerLayerTest/1, where TypeParam = caffe::CPUDevice
Do you think all of them have to do with OpenCV3 not really being compiled?
Thank you for your help.
I have encountered error 4 before; it is fine, and the code still runs without any problem. Error 2 is about the ldb package; maybe you can reinstall ldb.
The first and third errors are indeed related to OpenCV 3. I think there is a problem with how you compiled OpenCV 3 with GPU support. Just try make clean and compile OpenCV again.
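When make runtest reports many failures, it can help to group them by test suite before deciding which ones point at OpenCV. The sketch below assumes the output was saved to a text file and uses the standard Google Test "[  FAILED  ]" line format; the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as "[  FAILED  ] SuiteName/1.TestName, where TypeParam = ..."
# The summary line "[  FAILED  ] 3 tests, listed below:" does not match,
# because it has no "Suite.Test" pair.
FAILED_LINE = re.compile(r"^\[\s*FAILED\s*\]\s+([\w/]+)\.([\w/]+)")


def failed_tests_by_suite(log_text):
    """Map each failing suite name to a sorted list of its failing tests."""
    suites = defaultdict(set)  # a set, because gtest lists each failure twice
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            suites[match.group(1)].add(match.group(2))
    return {suite: sorted(tests) for suite, tests in suites.items()}
```

Suites such as ImageDataLayerTest and MemoryDataLayerTest showing up in the result would be consistent with the OpenCV diagnosis above.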
Thanks again for all your help.
It is finally working, up to the point where an out-of-memory error occurs.
So I tried to reduce the batch_size in Hidden-Two-Stream/models/ucf101_split1_unsup_end/end_train_val.prototxt, but it tells me: "Mask batch size not the same as input batch size".
So I was wondering, where else do I have to change the batch size?
Thank you in advance.
Hi, great to hear that the code finally works.
You also need to change the batch size here, from line 241 to line 444. For each mask I defined, I hard-coded the batch size to 8. Since you have the out-of-memory issue, you need to change these parameters as well. Sorry for the hard-coding; I didn't know how to generate a .prototxt with Python at that time. Hope this helps.
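Editing some two hundred lines by hand is error-prone, so the change could be scripted. The sketch below restricts a regular-expression substitution to a line range. The range 241 to 444 comes from the reply above, but the field pattern in the usage comment ("dim: 8") is only a guess at how the masks are written; check the file and adjust the pattern before using it.

```python
import re


def replace_in_line_range(text, first_line, last_line, pattern, replacement):
    """Apply re.sub(pattern, replacement) only to lines first_line..last_line.

    Line numbers are 1-indexed and inclusive; lines outside the range
    are returned unchanged.
    """
    lines = text.splitlines(keepends=True)
    for i in range(first_line - 1, min(last_line, len(lines))):
        lines[i] = re.sub(pattern, replacement, lines[i])
    return "".join(lines)


# Hypothetical usage; the pattern is an assumption about the mask definitions:
# with open("end_train_val.prototxt") as f:
#     old_text = f.read()
# new_text = replace_in_line_range(old_text, 241, 444, r"\bdim: 8\b", "dim: 4")
```

A pattern like this also rewrites any non-batch dimension in the range that happens to equal 8, so it is worth diffing the result against the original file.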
Thank you so much for all your help.
So far it's training. :+1:
The error occurs when starting the training.
Training data: UCF 101
System: CUDA 8.0, cuDNN 5.1, Ubuntu 16.04, OpenCV 3.3, GPU = 2 x NVIDIA 1060
layer {
  name: "FlowDeltasUClean6"
  type: "Concat"
  bottom: "FlowDeltasUClean6_0"
  bottom: "FlowDeltasUClean6_1"
  bottom: "FlowDeltasUClean6_2"
  bottom: "FlowDeltasUClean6_3"
  bottom: "FlowDeltasUClean6_4"
  bottom:
I0314 16:56:57.780658 12026 layer_factory.hpp:77] Creating layer data
I0314 16:56:57.780689 12026 net.cpp:91] Creating Layer data
I0314 16:56:57.780700 12026 net.cpp:400] data -> data
I0314 16:56:57.780725 12026 net.cpp:400] data -> label
I0314 16:56:57.780798 12026 multi_frame_data_layer.cpp:33] Opening file: ./train_rgb_split1.txt
I0314 16:56:57.785921 12026 multi_frame_datalayer.cpp:49] A total of 9537 videos.
Aborted at 1521043017 (unix time) try "date -d @1521043017" if you are using GNU date
PC: @ 0x7fc548430e67 cv::findDecoder()
SIGSEGV (@0x49) received by PID 12026 (TID 0x7fc558e71b00) from PID 73; stack trace:
@ 0x7fc55274e4b0 (unknown)
@ 0x7fc548430e67 cv::findDecoder()
@ 0x7fc548431a01 cv::imread()
@ 0x7fc548433e03 cv::imread()
@ 0x7fc557d4ef31 caffe::ReadSegmentMultiRGBToDatum()
@ 0x7fc557bcb49e caffe::MultiFrameDataLayer<>::DataLayerSetUp()
@ 0x7fc557b42753 caffe::BasePrefetchingDataLayer<>::LayerSetUp()
@ 0x7fc557afaac2 caffe::Net<>::Init()
@ 0x7fc557afc2e1 caffe::Net<>::Net()
@ 0x7fc557adbc3a caffe::Solver<>::InitTrainNet()
@ 0x7fc557adcf77 caffe::Solver<>::Init()
@ 0x7fc557add31a caffe::Solver<>::Solver()
@ 0x7fc557d06183 caffe::Creator_SGDSolver<>()
@ 0x40a728 train()
@ 0x4075e8 main
@ 0x7fc552739830 __libc_start_main
@ 0x407d59 _start
@ 0x0 (unknown)
Segmentation fault
Thanks in advance