BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

DeepLab v2 runtest error : BatchNormLayerTest/1.TestGradient #5917

Open xiaozai opened 7 years ago

xiaozai commented 7 years ago

When I compile DeepLab v2, make all and make test pass, but make runtest fails.

The output is below:

[----------] 3 tests from BatchNormLayerTest/1, where TypeParam = caffe::CPUDevice
[ RUN ] BatchNormLayerTest/1.TestGradient
F0912 14:10:19.481916 15692 blob.cpp:163] Check failed: data_
Check failure stack trace:
    @ 0x2b746d778daa (unknown)
    @ 0x2b746d778ce4 (unknown)
    @ 0x2b746d7786e6 (unknown)
    @ 0x2b746d77b687 (unknown)
    @ 0x2b746f9ff5d9 caffe::Blob<>::mutable_cpu_data()
    @ 0x2b746fb58058 caffe::BatchNormLayer<>::Forward_cpu()
    @ 0x48618d caffe::Layer<>::Forward()
    @ 0x4b7b52 caffe::GradientChecker<>::CheckGradientSingle()
    @ 0x4babe3 caffe::GradientChecker<>::CheckGradientExhaustive()
    @ 0x80b34f caffe::BatchNormLayerTest_TestGradient_Test<>::TestBody()
    @ 0x9040fa testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x8f97d9 testing::Test::Run()
    @ 0x8f98b7 testing::TestInfo::Run()
    @ 0x8f99f5 testing::TestCase::Run()
    @ 0x8f9c6d testing::internal::UnitTestImpl::RunAllTests()
    @ 0x903c7a testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x8f9031 testing::UnitTest::Run()
    @ 0x471167 main
    @ 0x2b7470af9f45 (unknown)
    @ 0x478b49 (unknown)
    @ (nil) (unknown)
make: *** [runtest] Aborted (core dumped)
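
For anyone reading a trace like this one, it can help to pull out only the caffe:: frames to see which layer hit the failed check. The following is a minimal Python sketch, not part of Caffe: the helper name caffe_frames is hypothetical, and it assumes each frame has the "@ address symbol" shape shown in the output above.

```python
def caffe_frames(trace):
    """Return the caffe:: symbols in a glog stack trace, innermost first.

    Assumption: frames look like "@ <address> <symbol>", as in the trace
    in this thread. Works whether the trace is on one line or many.
    """
    symbols = []
    # Text before the first "@" is the log header, so skip it.
    for chunk in trace.split("@")[1:]:
        fields = chunk.split(None, 1)  # address, then the rest of the frame
        if len(fields) == 2 and fields[1].startswith("caffe::"):
            symbols.append(fields[1].strip())
    return symbols
```

Applied to the trace above, the first two symbols returned would be caffe::Blob<>::mutable_cpu_data() and caffe::BatchNormLayer<>::Forward_cpu(), which shows the failed data_ check is reached from the CPU forward pass of the batch norm layer.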

I tried compiling it on two different machines:

1) Titan Xp, CUDA 8.0, cuDNN 5.0, OpenCV 3.1
2) Nvidia 1050 Ti, CUDA 8.0, cuDNN 4.0, OpenCV 2.4

Both give the same error.

How can I solve this? Please help! Thanks

zanbri commented 6 years ago

I'm running into a similar issue -- any luck finding a solution?

xiaozai commented 6 years ago

Hi, @zanbri

I already solved it. It turned out to be a version mismatch between the GPU driver and CUDA. In the end I used GPU driver 375.26, with CUDA installed from cuda_8.0.61_375.26_linux.run and cuda_8.0.61.2_linux.run. After that everything worked.

If you hit an error like "cudaSuccess (30 vs. 0)" or something similar, it may be a version mismatch between the GPU driver and CUDA. Check that first.
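
One way to do that check is to compare the installed driver version against the one embedded in the CUDA runfile name. This is only a rough Python sketch: the function names are hypothetical, the cuda_<toolkit>_<driver>_linux.run pattern is taken from the file names in this thread, and treating "installed driver at least as new as the bundled one" as acceptable is an assumption, not an official compatibility rule.

```python
import re


def parse_version(text):
    """Turn a dotted version string such as '375.26' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def bundled_driver_version(runfile_name):
    """Extract the driver version embedded in a CUDA runfile name.

    Assumes the pattern cuda_<toolkit>_<driver>_linux.run; returns None
    when the name carries no driver field (e.g. a patch runfile).
    """
    match = re.fullmatch(r"cuda_([\d.]+)_([\d.]+)_linux\.run", runfile_name)
    return match.group(2) if match else None


def driver_is_new_enough(installed, runfile_name):
    """True if the installed driver is at least the runfile's bundled one."""
    bundled = bundled_driver_version(runfile_name)
    if bundled is None:
        raise ValueError(f"no driver version found in {runfile_name!r}")
    return parse_version(installed) >= parse_version(bundled)
```

For example, driver_is_new_enough("375.26", "cuda_8.0.61_375.26_linux.run") returns True. The installed version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader. Passing this check does not rule out other causes of the failure.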

Sometimes it is also okay if runtest does not pass; you can still use DeepLab as normal.

Good luck!

shaibagon commented 6 years ago

@xiaozai If this issue is indeed solved by fixing the GPU driver, can you close this issue please?

Thank you very much, -Shai

ThienAnh commented 6 years ago

Hi @xiaozai. I'm using an Nvidia 1050 Ti with 4 GB, CUDA 8.0, and GPU driver 375.82, and I have the same issue:

BatchNormLayerTest/0.TestForwardInplace
F1011 16:52:46.495434 22841 blob.cpp:163] Check failed: data_
Check failure stack trace:
    @ 0x7f513fa915cd google::LogMessage::Fail()
    @ 0x7f513fa93433 google::LogMessage::SendToLog()
    @ 0x7f513fa9115b google::LogMessage::Flush()
    @ 0x7f513fa93e1e google::LogMessageFatal::~LogMessageFatal()
    @ 0x7f513af4910b caffe::Blob<>::mutable_cpu_data()
    @ 0x7f513b046a47 caffe::BatchNormLayer<>::Forward_cpu()
    @ 0x47a9f2 caffe::Layer<>::Forward()
    @ 0x6333c7 caffe::BatchNormLayerTest_TestForwardInplace_Test<>::TestBody()
    @ 0x8b6e73 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x8b048a testing::Test::Run()
    @ 0x8b05d8 testing::TestInfo::Run()
    @ 0x8b06b5 testing::TestCase::Run()
    @ 0x8b198f testing::internal::UnitTestImpl::RunAllTests()
    @ 0x8b1cb3 testing::UnitTest::Run()
    @ 0x46649d main
    @ 0x7f513a255830 __libc_start_main
    @ 0x46d829 _start
    @ (nil) (unknown)
Makefile:526: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)

Do you have a solution?

Thanks

vhik4596 commented 5 years ago

Sorry, I am facing the same problem here. I am using a 1080 Ti, on which I can't install GPU driver 375.26. Does anyone know how to solve this?