BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

CUDNN_STATUS_EXECUTION_FAILED on make runtest #6969

Open za13 opened 4 years ago

za13 commented 4 years ago

I ran all of the commands below to compile:

cp Makefile.config.example Makefile.config
make all
make test
make runtest

They all work except make runtest, which fails with this error:

F0911 17:48:42.539149 17301 cudnn_softmax_layer.cu:20] Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0)  CUDNN_STATUS_EXECUTION_FAILED
*** Check failure stack trace: ***
    @     0x7fe3945815cd  google::LogMessage::Fail()
    @     0x7fe394583433  google::LogMessage::SendToLog()
    @     0x7fe39458115b  google::LogMessage::Flush()
    @     0x7fe394583e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fe392f4ee06  caffe::CuDNNSoftmaxLayer<>::Forward_gpu()
    @     0x7fe392f3c5da  caffe::SoftmaxWithLossLayer<>::Forward_gpu()
    @     0x7fe392dc4201  caffe::Net<>::ForwardFromTo()
    @     0x7fe392dc4307  caffe::Net<>::Forward()
    @           0x771b77  caffe::NetTest_TestBackwardWithAccuracyLayer_Test<>::TestBody()
    @           0x956663  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x94fc5a  testing::Test::Run()
    @           0x94fda8  testing::TestInfo::Run()
    @           0x94fe85  testing::TestCase::Run()
    @           0x95115f  testing::internal::UnitTestImpl::RunAllTests()
    @           0x951483  testing::UnitTest::Run()
    @           0x470fcd  main
    @     0x7fe3920d2840  __libc_start_main
    @           0x478f79  _start
    @              (nil)  (unknown)
Makefile:542: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)

All of the steps above worked properly before. I then removed my previous GPU, installed a new one, and repeated the steps. This time make runtest failed with the error above.

Can anyone help?
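
Since the failure appeared only after the GPU swap, one thing that may be worth checking (a guess, not something confirmed in this thread) is whether the CUDA_ARCH list in Makefile.config includes code for the new card's compute capability. The sketch below is a minimal, assumption-laden helper: SAMPLE_CONFIG is a hypothetical excerpt in the style of Makefile.config.example, the function names are made up for illustration, and the check is simplified to an exact sm_XX match or PTX for an equal-or-older architecture. An old cuDNN release that predates the new GPU could produce the same error and is not covered here.

```python
import re

# Hypothetical excerpt in the style of Caffe's Makefile.config.example.
# The file on your machine may list different architectures.
SAMPLE_CONFIG = r"""
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
    -gencode arch=compute_35,code=sm_35 \
    -gencode arch=compute_50,code=sm_50 \
    -gencode arch=compute_61,code=compute_61
"""


def compiled_targets(config_text):
    """Return (sm, ptx): sets of architecture numbers found in -gencode flags.

    Lines that start with '#' are treated as comments and skipped.
    """
    active = "\n".join(
        line for line in config_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    sm = set(re.findall(r"code=sm_(\d+)", active))        # native binaries
    ptx = set(re.findall(r"code=compute_(\d+)", active))  # PTX for JIT
    return sm, ptx


def covers(config_text, capability):
    """Rough check that the build has code usable on `capability` (e.g. "61").

    Simplification: accepts an exact sm_XX match, or PTX built for an
    equal-or-older architecture that the driver could JIT-compile.
    Binary compatibility within a major version is ignored.
    """
    sm, ptx = compiled_targets(config_text)
    if capability in sm:
        return True
    return any(int(p) <= int(capability) for p in ptx)


if __name__ == "__main__":
    # "75" is only an example value; substitute your own card's capability.
    print(covers(SAMPLE_CONFIG, "75"))
```

To use it on a real build, read Makefile.config into a string and pass the new card's compute capability as digits without the dot (for example "61" for 6.1). If the check fails, adding the matching -gencode line and rebuilding from a clean tree would be the next thing to try.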

wei0wei0 commented 1 year ago

I have encountered the same problem. Have you solved it?