rbgirshick / py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

sgd_solver.cu:19] Check failed: error == cudaSuccess (11 vs. 0) invalid argument #726

Open longgang123 opened 6 years ago

longgang123 commented 6 years ago

I am using Ubuntu 14.04 and a GTX 1080 as my platform, with CUDA 8.0:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

The cuDNN version is 5.1. I tried to train Faster R-CNN on VOC2007 with this command:

sudo python ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name ZF --weights data/imagenet_models/ZF.v2.caffemodel --imdb voc_2007_trainval --cfg experiments/cfgs/faster_rcnn_alt_opt.yml

and I get an error at the solving stage:

Solving...
I1109 10:03:10.649351 11058 solver.cpp:229] Iteration 0, loss = 0.839667
I1109 10:03:10.649390 11058 solver.cpp:245] Train net output #0: rpn_cls_loss = 0.483599 (* 1 = 0.483599 loss)
I1109 10:03:10.649396 11058 solver.cpp:245] Train net output #1: rpn_loss_bbox = 0.356067 (* 1 = 0.356067 loss)
I1109 10:03:10.649405 11058 sgd_solver.cpp:106] Iteration 0, lr = 0.001
F1109 10:03:10.651279 11058 sgd_solver.cu:19] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***

I have no idea how to solve this problem. Has anybody encountered this error before? The data seems to be fine for training, but when the net begins the backward computation, the SGD step hits the error in Caffe's sgd_solver.cu. How can I solve this error?
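For anyone comparing logs across setups, here is a minimal sketch of pulling the useful fields out of the fatal line above. "invalid argument" is the string CUDA reports for `cudaErrorInvalidValue`, so the log only says that some CUDA call received a value it rejected; it does not say which one. The function name `parse_check_failure` and the regular expression are illustrative assumptions, not part of Caffe or py-faster-rcnn.

```python
import re

# Hypothetical helper: matches glog fatal lines of the form seen in this thread, e.g.
# F1109 10:03:10.651279 11058 sgd_solver.cu:19] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
CHECK_FAILED = re.compile(
    r"^F(?P<date>\d{4}) (?P<time>[\d:.]+)\s+(?P<thread>\d+) "
    r"(?P<file>[\w.]+):(?P<line>\d+)\] Check failed: error == cudaSuccess "
    r"\((?P<code>\d+) vs\. 0\)\s+(?P<message>.*)$"
)


def parse_check_failure(log_line):
    """Return the source file, line, CUDA error code, and message, or None if no match."""
    match = CHECK_FAILED.match(log_line.strip())
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "code": int(match.group("code")),
        "message": match.group("message").strip(),
    }
```

Running it over each report in a thread like this one makes it easy to see whether people are failing at the same source location (`sgd_solver.cu:19`) or somewhere else with the same error code.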

mc-nya commented 6 years ago

I have the same problem. However, when compiling Caffe, I merged the latest version of Caffe into caffe-fast-rcnn. Could that merge be the cause?

longgang123 commented 6 years ago

I don't know. But my Caffe works fine when training other neural networks, for example AlexNet, so I think py-faster-rcnn may not support CUDA 8.0.

YanShuang17 commented 6 years ago

@longgang123 @mikuxworld Hello! How did you solve this error?
When training the end-to-end VGG16 version, I hit the same problem. Some information about my setup: CUDA 8.0.61 + cuDNN v6 + GTX 1080 Ti.

F0122 10:50:04.063930 28593 sgd_solver.cu:19] Check failed: error == cudaSuccess (11 vs. 0)  invalid argument
*** Check failure stack trace: ***
Aborted (core dumped)

Thanks.

Elasine commented 6 years ago

I ran into the same problem. Some information about my setup: CUDA 8.0.44 + cuDNN v5 + GTX 1050 Ti. Please help me if you know why this happens and how to solve it. Thank you very much.

YanShuang17 commented 6 years ago

It is probably a problem with the CUDA/cuDNN version. I solved this issue by changing to CUDA 8.0.61 / cuDNN v5.1, but I still don't understand the real reason behind it. Good luck to you, @Elasine.
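Since the reports here differ only in the exact CUDA and cuDNN versions, it helps to state the toolkit release precisely. Below is a rough sketch that extracts it from the text printed by `nvcc --version`; the helper name `parse_nvcc_release` is made up for illustration, and the pattern assumes the output format quoted in the first post.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract the CUDA release and full version from `nvcc --version` text.

    Hypothetical helper; expects a line such as
    'Cuda compilation tools, release 8.0, V8.0.61'.
    Returns None if no such line is found.
    """
    match = re.search(r"release (\d+\.\d+), V(\d+(?:\.\d+)+)", nvcc_output)
    if match is None:
        return None
    return {"release": match.group(1), "version": match.group(2)}
```

The text can be pasted in by hand or captured with `subprocess.check_output(["nvcc", "--version"])`. Note that this reports only the CUDA toolkit; the cuDNN version has to be checked separately.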

Elasine commented 6 years ago

I have come across the problem again. If you know the real reason, please help me understand it. @longgang123, @YanShuang17 Thank you very much.

Nhanyu commented 5 years ago

I have the same problem. I can compile Caffe successfully in both 32-bit and 64-bit Windows GPU environments, but when I run the Caffe test project in the 32-bit Windows GPU environment, the problem arises: "F0905 16:19:17.685425 2196 math_functions.cu:79] Check failed: error == cudaSuccess (11 vs. 0) invalid argument". @YanShuang17 @longgang123, I hope I can get help from you. Thank you very much.