zhaoweicai / mscnn

Caffe implementation of our multi-scale object detection framework
404 stars 211 forks

Testing demo crashed at conv1_1 -> conv1_1 cudnn v5 #43

Open junwenchen opened 7 years ago

junwenchen commented 7 years ago

Environment: Caffe, Ubuntu 14.04, CUDA 8.0, GTX 1080, cuDNN v5. I know cuDNN v5 is not recommended here, but v3 is not widely used anymore.

In this environment, I replaced the cuDNN-related .hpp, .cu and .cpp files, following the same modification that was made in faster-rcnn. After that, both make and make matcaffe succeeded.

However, when I test the demo, I get the following error.

I0117 19:48:49.108386 26508 net.cpp:150] Setting up label_4_5x5_data_7_split
I0117 19:48:49.108397 26508 net.cpp:157] Top shape: 4 6 9 12 (2592)
I0117 19:48:49.108407 26508 net.cpp:157] Top shape: 4 6 9 12 (2592)
I0117 19:48:49.108414 26508 net.cpp:165] Memory required for data: 26490268
I0117 19:48:49.108422 26508 layer_factory.hpp:76] Creating layer conv1_1
I0117 19:48:49.108443 26508 net.cpp:106] Creating Layer conv1_1
I0117 19:48:49.108451 26508 net.cpp:454] conv1_1 <- data
I0117 19:48:49.108464 26508 net.cpp:411] conv1_1 -> conv1_1
Aborted at 1484653729 (unix time) try "date -d @1484653729" if you are using GNU date
PC: @ 0x7f2a6b55eefe caffe::CuDNNConvolutionLayer<>::LayerSetUp()
SIGFPE (@0x7f2a6b55eefe) received by PID 26508 (TID 0x7f2a6be5a7c0) from PID 1800793854; stack trace:
@ 0x7f2a5a37c330 (unknown)
@ 0x7f2a6b55eefe caffe::CuDNNConvolutionLayer<>::LayerSetUp()
@ 0x7f2a6b58d85c caffe::Net<>::Init()
@ 0x7f2a6b58e905 caffe::Net<>::Net()
@ 0x7f2a6b4238ca caffe::Solver<>::InitTrainNet()
@ 0x7f2a6b4249dc caffe::Solver<>::Init()
@ 0x7f2a6b424ce9 caffe::Solver<>::Solver()
@ 0x7f2a6b418823 caffe::Creator_SGDSolver<>()
@ 0x40ef9e caffe::SolverRegistry<>::CreateSolver()
@ 0x4082db train()
@ 0x405f41 main
@ 0x7f2a59fc8f45 (unknown)
@ 0x4066fd (unknown)
@ 0x0 (unknown)
Floating point exception (core dumped)

I tried both the original MATLAB version and the Python version. They both crashed at the same place. I also tried training, and it reported the same error.

Has anybody succeeded with cuDNN v5? Can you give me some advice?

Thanks a lot!

baristahell commented 7 years ago

I think, if I recall correctly, that I had this kind of issue and had to switch to cuDNN v3. But I'm not sure, it has been a while. Wait for someone a bit more enlightened than me to answer, I guess.

zhaoweicai commented 7 years ago

Hi @junwenchen @baristahell, the code has been merged with the latest Caffe. cuDNN v5 can be used now.
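
For anyone checking which cuDNN version a build will actually pick up before rebuilding, here is a minimal sketch in Python. It is not part of this repository: the function name parse_cudnn_version is hypothetical, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as cudnn.h does for the v3 to v5 releases discussed here. The header location depends on the install and has to be supplied by the reader.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h.

    Returns None if any of the three version macros is missing.
    Assumes the macros are plain integer #defines, one per line.
    """
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 5".
        pattern = r"^\s*#\s*define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)
```

To use it, read the header that the Caffe build includes (for example the cudnn.h under the CUDA include directory, if that is where it was installed) and pass its contents to parse_cudnn_version. A result whose first element is 5 would indicate that cuDNN v5 is the version being compiled against.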