ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0

cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 7 - when apollo.sh test #9221

Open zzq1016 opened 5 years ago

zzq1016 commented 5 years ago

We appreciate you going through the Apollo documentation and searching previous issues before creating a new one. If neither of these sources helped with your issue, please report it using the following form. Please note that missing information can delay the response time.

System information

OS Platform and Distribution (e.g., Linux Ubuntu 14.04):
Apollo installed from (source or binary): docker
Apollo version (1.0, 1.5, 2.0, 2.5, 3.0): 5.0

hardware: RTX2080

GPU-driver: nvidia-418.34

Steps to reproduce the issue:

Please use bullet points and include as much detail as possible:

After entering the Docker container, I successfully ran: bash apollo.sh build_gpu

Then I ran bash apollo.sh test and got some errors:

//cyber/common:file_test FAILED in 0.1s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/cyber/common/file_test/test.log
//modules/perception/camera/test:camera_app_obstacle_camera_perception_test FAILED in 46.8s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_app_obstacle_camera_perception_test/test.log
//modules/perception/camera/test:camera_lib_lane_detector_darkscnn_lane_detector_test FAILED in 80.2s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_lane_detector_darkscnn_lane_detector_test/test.log
//modules/perception/camera/test:camera_lib_lane_detector_denseline_lane_detector_test FAILED in 19.0s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_lane_detector_denseline_lane_detector_test/test.log
//modules/perception/camera/test:camera_lib_lane_postprocessor_darkscnn_lane_postprocessor_test FAILED in 66.0s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_lane_postprocessor_darkscnn_lane_postprocessor_test/test.log
//modules/perception/camera/test:camera_lib_lane_postprocessor_denseline_lane_postprocessor_test FAILED in 25.6s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_lane_postprocessor_denseline_lane_postprocessor_test/test.log
//modules/perception/camera/test:camera_lib_obstacle_detector_yolo_yolo_obstacle_detector_test FAILED in 30.0s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_obstacle_detector_yolo_yolo_obstacle_detector_test/test.log
//modules/perception/camera/test:camera_lib_obstacle_transformer_multicue_multicue_obstacle_transformer_test FAILED in 30.4s
  /root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_obstacle_transformer_multicue_multicue_obstacle_transformer_test/test.log

Executed 526 out of 526 tests: 518 tests pass and 8 fail locally.

[ERROR] Test failed!
[INFO] Took 142 seconds
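The failure summary above is easier to triage once it is split into one record per test. The following is a minimal sketch, not part of Apollo or Bazel: the helper names `parse_failed_tests` and `group_by_package` are hypothetical, and the regular expression only assumes the "target FAILED in Ns path/test.log" shape seen in this report.

```python
import re

# Matches "<bazel target> FAILED in <seconds>s <path ending in /test.log>".
# The pattern is an assumption based on the summary text in this issue.
FAILED_RE = re.compile(r"(//\S+)\s+FAILED in ([\d.]+)s\s+(\S+/test\.log)")


def parse_failed_tests(summary):
    """Return a list of (target, seconds, log_path) for each FAILED entry."""
    return [(t, float(s), p) for t, s, p in FAILED_RE.findall(summary)]


def group_by_package(failures):
    """Count failures per Bazel package (the label part before ':')."""
    counts = {}
    for target, _seconds, _log_path in failures:
        package = target.split(":", 1)[0]
        counts[package] = counts.get(package, 0) + 1
    return counts


# Two entries copied from the summary in this issue.
LOG_ROOT = (
    "/root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/"
    "execroot/apollo/bazel-out/local-dbg/testlogs"
)
SAMPLE = (
    "//cyber/common:file_test FAILED in 0.1s "
    + LOG_ROOT + "/cyber/common/file_test/test.log "
    "//modules/perception/camera/test:"
    "camera_lib_obstacle_detector_yolo_yolo_obstacle_detector_test "
    "FAILED in 30.0s "
    + LOG_ROOT + "/modules/perception/camera/test/"
    "camera_lib_obstacle_detector_yolo_yolo_obstacle_detector_test/test.log"
)

if __name__ == "__main__":
    failures = parse_failed_tests(SAMPLE)
    for target, seconds, log_path in failures:
        print(target, seconds, log_path)
    print(group_by_package(failures))
```

Grouping by package makes it visible at a glance that seven of the eight failures sit under //modules/perception/camera/test, while the //cyber/common:file_test failure is separate.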

The causes of 7 of the 8 failures are similar:

/root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_lane_postprocessor_darkscnn_lane_postprocessor_test/test.log

shows:

I0724 10:26:04.858739 30215 visualizer.cc:187] [] p_fov4 =[1919, 1079]
I0724 10:26:04.858742 30215 visualizer.cc:202] [] vanishing point 1:0001024 486.037
I0724 10:26:04.858748 30215 visualizer.cc:203] [] vanishing point 2:00001024 -486.037
I0724 10:26:04.858804 30215 camera_lib_lane_postprocessor_darkscnn_lane_postprocessor_test.cc:196] [] Initilize visualizer finished!
I0724 10:26:21.421674 30215 data_provider.cc:146] [] Fill in GPU mode ...
I0724 10:26:21.421737 30215 data_provider.cc:186] [] Done! (1)
I0724 10:26:21.421804 30215 data_provider.cc:241] [] GetImage ...
I0724 10:26:21.421808 30215 data_provider.cc:268] [] cropping ...
I0724 10:26:21.421813 30215 data_provider.cc:271] [] Done!
F0724 10:26:21.423812 30215 cudnn_conv_layer.cu:28] Check failed: status == CUDNN_STATUS_SUCCESS (7 vs. 0) CUDNN_STATUS_MAPPING_ERROR
Check failure stack trace:
external/bazel_tools/tools/test/test-setup.sh: line 169: 30215 Aborted "${TEST_PATH}" "$@"


/root/.cache/bazel/_bazel_root/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/local-dbg/testlogs/modules/perception/camera/test/camera_lib_obstacle_detector_yolo_yolo_obstacle_detector_test/test.log

shows:

I0724 10:25:26.465013 30132 rt_net.cc:708] [] Erase output: detect3_ori_pred
I0724 10:25:26.465019 30132 rt_net.cc:708] [] Erase output: lof_pred
I0724 10:25:26.465023 30132 rt_net.cc:708] [] Erase output: lor_pred
I0724 10:25:26.465382 30132 rt_net.cc:643] [] Device Works on FP32 Mode.
I0724 10:25:34.005255 30132 rt_net.cc:30] [] cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 7
I0724 10:25:34.010552 30132 rt_net.cc:30] [] cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 7
external/bazel_tools/tools/test/test-setup.sh: line 169: 30132 Segmentation fault "${TEST_PATH}" "$@"
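Both log excerpts above end in status code 7 from a cuDNN convolution layer, once through a glog check in cudnn_conv_layer.cu and once through a message logged from rt_net.cc. To check whether the remaining test.log files fail the same way, a small scanner along these lines may help. It is only a sketch: the function name `classify_log` and both regular expressions are assumptions derived from the two messages in this issue, not from any Apollo, Caffe, or TensorRT API. The status name is taken from the log line itself when present; the second message format does not name its code, so none is guessed.

```python
import re

# glog check failure as seen in cudnn_conv_layer.cu in this issue.
CHECK_RE = re.compile(
    r"Check failed: status == CUDNN_STATUS_SUCCESS "
    r"\((\d+) vs\. 0\)\s+(CUDNN_STATUS_\w+)"
)
# Message logged via rt_net.cc in this issue:
# "<file>.cpp (<line>) - Cuda Error in <stage>: <code>".
EXEC_RE = re.compile(r"(\w+\.cpp) \((\d+)\) - Cuda Error in (\w+): (\d+)")


def classify_log(text):
    """Return one dict per cuDNN-related failure message found in text."""
    findings = []
    for code, name in CHECK_RE.findall(text):
        findings.append({"kind": "check", "code": int(code), "name": name})
    for src_file, line, stage, code in EXEC_RE.findall(text):
        findings.append({
            "kind": "execute",
            "file": src_file,
            "line": int(line),
            "stage": stage,
            "code": int(code),
            "name": None,  # this message format does not name the status
        })
    return findings


# Lines copied from the two logs quoted in this issue.
CHECK_SAMPLE = (
    "F0724 10:26:21.423812 30215 cudnn_conv_layer.cu:28] Check failed: "
    "status == CUDNN_STATUS_SUCCESS (7 vs. 0) CUDNN_STATUS_MAPPING_ERROR"
)
EXEC_SAMPLE = (
    "I0724 10:25:34.005255 30132 rt_net.cc:30] [] "
    "cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 7 "
    "I0724 10:25:34.010552 30132 rt_net.cc:30] [] "
    "cudnnConvolutionLayer.cpp (254) - Cuda Error in execute: 7"
)

if __name__ == "__main__":
    for finding in classify_log(CHECK_SAMPLE + "\n" + EXEC_SAMPLE):
        print(finding)
```

In practice each test.log path from the failure summary would be read and passed to `classify_log`; a log that yields no findings (for example, the one for //cyber/common:file_test) likely points to an unrelated failure.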

Thanks for your help!

zhouyapengzi commented 4 years ago

I ran into the same problem. Can anyone help?

daohu527 commented 3 years ago

I am hitting the same problem and would like to find the cause.

https://github.com/NVIDIA/TensorRT/issues/851