Open bxiong97 opened 2 years ago
NEW UPDATE:
I tried CUDA_VISIBLE_DEVICES=0 make runtest
instead of make runtest.
More test cases passed, but I still get the same issue: the CUDA device count is reported as 0.
(base) b***@DESKTOP-****:/mnt/c/Users/bx/caffe-1.0$ CUDA_VISIBLE_DEVICES=0 make runtest
.build_release/tools/caffe
caffe: command line brew
usage: caffe <command> <args>
commands:
train train or finetune a model
test score a model
device_query show GPU diagnostic information
time benchmark model execution time
Flags from tools/caffe.cpp:
-gpu (Optional; run in GPU mode on given device IDs separated by ','.Use
'-gpu all' to run on all available GPUs. The effective training batch
size is multiplied by the number of devices.) type: string default: ""
-iterations (The number of iterations to run.) type: int32 default: 50
-level (Optional; network level.) type: int32 default: 0
-model (The model definition protocol buffer text file.) type: string
default: ""
-phase (Optional; network phase (TRAIN or TEST). Only used for 'time'.)
type: string default: ""
-sighup_effect (Optional; action to take when a SIGHUP signal is received:
snapshot, stop or none.) type: string default: "snapshot"
-sigint_effect (Optional; action to take when a SIGINT signal is received:
snapshot, stop or none.) type: string default: "stop"
-snapshot (Optional; the snapshot solver state to resume training.)
type: string default: ""
-solver (The solver definition protocol buffer text file.) type: string
default: ""
-stage (Optional; network stages (not to be confused with phase), separated
by ','.) type: string default: ""
-weights (Optional; the pretrained weights to initialize finetuning,
separated by ','. Cannot be set simultaneously with snapshot.)
type: string default: ""
.build_release/test/test_all.testbin 0 --gtest_shuffle
Cuda number of devices: 0
Setting to use device 0
Current device id: 0
Current device name:
Note: Randomizing tests' orders with a seed of 54062 .
[==========] Running 2101 tests from 277 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from MSRAFillerTest/0, where TypeParam = float
[ RUN ] MSRAFillerTest/0.TestFillFanIn
E0308 00:11:50.928552 340 common.cpp:114] Cannot create Cublas handle. Cublas won't be available.
E0308 00:11:50.970577 340 common.cpp:121] Cannot create Curand generator. Curand won't be available.
[ OK ] MSRAFillerTest/0.TestFillFanIn (95 ms)
[ RUN ] MSRAFillerTest/0.TestFillAverage
[ OK ] MSRAFillerTest/0.TestFillAverage (0 ms)
[ RUN ] MSRAFillerTest/0.TestFillFanOut
[ OK ] MSRAFillerTest/0.TestFillFanOut (1 ms)
[----------] 3 tests from MSRAFillerTest/0 (96 ms total)
[----------] 12 tests from ArgMaxLayerTest/1, where TypeParam = double
[ RUN ] ArgMaxLayerTest/1.TestCPUMaxValTopK
E0308 00:11:50.980310 340 common.cpp:141] Curand not available. Skipping setting the curand seed.
[ OK ] ArgMaxLayerTest/1.TestCPUMaxValTopK (3 ms)
[ RUN ] ArgMaxLayerTest/1.TestSetupAxisMaxVal
[ OK ] ArgMaxLayerTest/1.TestSetupAxisMaxVal (2 ms)
[ RUN ] ArgMaxLayerTest/1.TestSetupMaxVal
[ OK ] ArgMaxLayerTest/1.TestSetupMaxVal (1 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPUAxisMaxValTopK
[ OK ] ArgMaxLayerTest/1.TestCPUAxisMaxValTopK (18 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPUAxis
[ OK ] ArgMaxLayerTest/1.TestCPUAxis (4 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPUTopK
[ OK ] ArgMaxLayerTest/1.TestCPUTopK (1 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPU
[ OK ] ArgMaxLayerTest/1.TestCPU (1 ms)
[ RUN ] ArgMaxLayerTest/1.TestSetupAxis
[ OK ] ArgMaxLayerTest/1.TestSetupAxis (0 ms)
[ RUN ] ArgMaxLayerTest/1.TestSetupAxisNegativeIndexing
[ OK ] ArgMaxLayerTest/1.TestSetupAxisNegativeIndexing (1 ms)
[ RUN ] ArgMaxLayerTest/1.TestSetup
[ OK ] ArgMaxLayerTest/1.TestSetup (0 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPUMaxVal
[ OK ] ArgMaxLayerTest/1.TestCPUMaxVal (1 ms)
[ RUN ] ArgMaxLayerTest/1.TestCPUAxisTopK
[ OK ] ArgMaxLayerTest/1.TestCPUAxisTopK (19 ms)
[----------] 12 tests from ArgMaxLayerTest/1 (51 ms total)
[----------] 27 tests from ReductionLayerTest/1, where TypeParam = caffe::CPUDevice<double>
[ RUN ] ReductionLayerTest/1.TestMeanCoeff
[ OK ] ReductionLayerTest/1.TestMeanCoeff (7 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSumCoeffAxis1
[ OK ] ReductionLayerTest/1.TestAbsSumCoeffAxis1 (0 ms)
[ RUN ] ReductionLayerTest/1.TestMeanCoeffGradient
[ OK ] ReductionLayerTest/1.TestMeanCoeffGradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSumCoeffGradient
[ OK ] ReductionLayerTest/1.TestAbsSumCoeffGradient (0 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSumGradient
[ OK ] ReductionLayerTest/1.TestAbsSumGradient (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquaresCoeff
[ OK ] ReductionLayerTest/1.TestSumOfSquaresCoeff (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumCoeff
[ OK ] ReductionLayerTest/1.TestSumCoeff (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquaresGradient
[ OK ] ReductionLayerTest/1.TestSumOfSquaresGradient (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumCoeffAxis1
[ OK ] ReductionLayerTest/1.TestSumCoeffAxis1 (0 ms)
[ RUN ] ReductionLayerTest/1.TestSetUpWithAxis2
[ OK ] ReductionLayerTest/1.TestSetUpWithAxis2 (0 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSumCoeffAxis1Gradient
[ OK ] ReductionLayerTest/1.TestAbsSumCoeffAxis1Gradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSum
[ OK ] ReductionLayerTest/1.TestAbsSum (0 ms)
[ RUN ] ReductionLayerTest/1.TestAbsSumCoeff
[ OK ] ReductionLayerTest/1.TestAbsSumCoeff (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumGradient
[ OK ] ReductionLayerTest/1.TestSumGradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquaresCoeffAxis1Gradient
[ OK ] ReductionLayerTest/1.TestSumOfSquaresCoeffAxis1Gradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquaresCoeffGradient
[ OK ] ReductionLayerTest/1.TestSumOfSquaresCoeffGradient (0 ms)
[ RUN ] ReductionLayerTest/1.TestSum
[ OK ] ReductionLayerTest/1.TestSum (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquares
[ OK ] ReductionLayerTest/1.TestSumOfSquares (0 ms)
[ RUN ] ReductionLayerTest/1.TestSetUpWithAxis1
[ OK ] ReductionLayerTest/1.TestSetUpWithAxis1 (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumOfSquaresCoeffAxis1
[ OK ] ReductionLayerTest/1.TestSumOfSquaresCoeffAxis1 (0 ms)
[ RUN ] ReductionLayerTest/1.TestMeanCoeffGradientAxis1
[ OK ] ReductionLayerTest/1.TestMeanCoeffGradientAxis1 (1 ms)
[ RUN ] ReductionLayerTest/1.TestMeanGradient
[ OK ] ReductionLayerTest/1.TestMeanGradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestSumCoeffGradient
[ OK ] ReductionLayerTest/1.TestSumCoeffGradient (0 ms)
[ RUN ] ReductionLayerTest/1.TestSumCoeffAxis1Gradient
[ OK ] ReductionLayerTest/1.TestSumCoeffAxis1Gradient (1 ms)
[ RUN ] ReductionLayerTest/1.TestMeanCoeffAxis1
[ OK ] ReductionLayerTest/1.TestMeanCoeffAxis1 (0 ms)
[ RUN ] ReductionLayerTest/1.TestSetUp
[ OK ] ReductionLayerTest/1.TestSetUp (0 ms)
[ RUN ] ReductionLayerTest/1.TestMean
[ OK ] ReductionLayerTest/1.TestMean (0 ms)
[----------] 27 tests from ReductionLayerTest/1 (15 ms total)
[----------] 2 tests from InfogainLossLayerTest/0, where TypeParam = caffe::CPUDevice<float>
[ RUN ] InfogainLossLayerTest/0.TestGradient
F0308 00:11:51.111351 340 cudnn_softmax_layer.cpp:15] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
@ 0x7fadd223e1c3 google::LogMessage::Fail()
@ 0x7fadd224325b google::LogMessage::SendToLog()
@ 0x7fadd223debf google::LogMessage::Flush()
@ 0x7fadd223e6ef google::LogMessageFatal::~LogMessageFatal()
@ 0x7fadd0cad050 caffe::CuDNNSoftmaxLayer<>::LayerSetUp()
@ 0x7fadd0cf96b6 caffe::InfogainLossLayer<>::LayerSetUp()
@ 0x555a5f4d2b35 caffe::GradientChecker<>::CheckGradientExhaustive()
@ 0x555a5f6e1dfb caffe::InfogainLossLayerTest_TestGradient_Test<>::TestBody()
@ 0x555a5f9ca211 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x555a5f9c204d testing::Test::Run()
@ 0x555a5f9c2188 testing::TestInfo::Run()
@ 0x555a5f9c2265 testing::TestCase::Run()
@ 0x555a5f9c278c testing::internal::UnitTestImpl::RunAllTests()
@ 0x555a5f9c2857 testing::UnitTest::Run()
@ 0x555a5f49d217 main
@ 0x7fadd07440b3 __libc_start_main
@ 0x555a5f4a4d9e _start
make: *** [Makefile:534: runtest] Aborted
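To make the log above easier to scan, here is a minimal sketch that pulls out only the glog error (E) and fatal (F) lines. It assumes the usual glog prefix layout (severity letter, month and day, time, thread id, file:line, then the message). The function name errors_and_fatals and the sample text are just for illustration and are not part of Caffe.

```python
import re

# Usual glog prefix, e.g.:
#   E0308 00:11:50.928552 340 common.cpp:114] message
GLOG_LINE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([^\]]+):(\d+)\] (.*)$"
)


def errors_and_fatals(log_text):
    """Return (severity, file, line, message) for every E or F glog line."""
    found = []
    for raw in log_text.splitlines():
        match = GLOG_LINE.match(raw.strip())
        if match and match.group(1) in ("E", "F"):
            found.append(
                (match.group(1), match.group(5), int(match.group(6)), match.group(7))
            )
    return found


if __name__ == "__main__":
    # Two lines copied from the log above, plus one gtest line that is ignored.
    sample = "\n".join([
        "[ RUN ] MSRAFillerTest/0.TestFillFanIn",
        "E0308 00:11:50.928552 340 common.cpp:114] Cannot create Cublas handle. "
        "Cublas won't be available.",
        "F0308 00:11:51.111351 340 cudnn_softmax_layer.cpp:15] Check failed: "
        "status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED",
    ])
    for severity, source, line_no, message in errors_and_fatals(sample):
        print(severity, f"{source}:{line_no}", message)
```

Run over the full output, this would list the cuBLAS and cuRAND warnings first and the cuDNN check failure last, which is the one that aborts the test run.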
Issue summary
Hi,
I'm installing Caffe 1.0 on WSL2 (Ubuntu 20.04). I already managed to get
make all
make test
to run without errors. However, when I run
make runtest
I get a bunch of errors.
NO CUDA issue
It cannot find CUDA, even though I have CUDA and the driver installed. Verified with:
nvcc --version
and
nvidia-smi
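Note that nvcc --version only shows that the toolkit is installed, and nvidia-smi queries the driver through its own management library, so neither proves that a process can enumerate a GPU through CUDA. A minimal sketch of a more direct check is below. It calls the CUDA driver API functions cuInit and cuDeviceGetCount through ctypes. The library name libcuda.so.1 and the assumption that it is on the loader path are mine, and the helper name cuda_driver_device_count is made up for this example.

```python
import ctypes


def cuda_driver_device_count(lib_name="libcuda.so.1"):
    """Return (count, message); count is None if the driver API is unusable.

    lib_name is an assumption: adjust it if the driver library has a
    different name or is not on the loader path on your system.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError as exc:
        return None, f"could not load {lib_name}: {exc}"

    # cuInit(0) must succeed before any other driver API call.
    rc = lib.cuInit(0)
    if rc != 0:
        return None, f"cuInit failed with CUresult {rc}"

    count = ctypes.c_int(0)
    rc = lib.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return None, f"cuDeviceGetCount failed with CUresult {rc}"
    return count.value, "ok"


if __name__ == "__main__":
    count, message = cuda_driver_device_count()
    print("CUDA devices visible to this process:", count, "-", message)
```

If this also reports no devices (or fails to load the library), the problem is likely in the WSL2 driver setup rather than in the Caffe build itself. If it reports one device while the Caffe tests still print "Cuda number of devices: 0", the toolkit the tests were linked against would be the next thing to look at.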
Makefile.config
System configuration
Could someone please help me with this? I have already tried most of the solutions I found on the internet, with no luck. Many thanks!