datamachines / cuda_tensorflow_opencv

DockerFile with GPU support for TensorFlow and OpenCV
Apache License 2.0

Build failed - make cudnn_tensorflow_opencv-10.2_2.4.1_4.5.2 #18

Closed jmkfr06 closed 3 years ago

jmkfr06 commented 3 years ago

Hi, I would like to build an image with CUDA 10.2, TensorFlow 2.4.1 and OpenCV 4.5.2, but the build failed while compiling TensorFlow. Thanks for your help.

[18,609 / 24,292] Compiling tensorflow/core/grappler/optimizers/arithmetic_optimizer.cc [for host]; 16s local ... (12 actions running) [18,875 / 24,292] Compiling tensorflow/core/kernels/conv_2d_gpu_uint8.cu.cc [for host]; 84s local ... (12 actions running) [19,033 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 189s local ... (12 actions running) [19,153 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 636s local ... (12 actions running) [19,154 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 1168s local ... (12 actions, 11 running) ERROR: /usr/local/src/tensorflow/tensorflow/core/kernels/linalg/BUILD:193:18: C++ compilation of rule '//tensorflow/core/kernels/linalg:matrix_square_root_op' failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command (cd /root/.cache/bazel/_bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/execroot/org_tensorflow && \ exec env - \ LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 \ PATH=/root/.cache/bazelisk/downloads/bazelbuild/bazel-3.7.2-linux-x86_64/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \ PWD=/proc/self/cwd \ external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.d '-frandom-seed=bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.o' -DTENSORFLOW_USE_CUSTOM_CONTRACTION_KERNEL -DTENSORFLOW_USE_MKLDNN_CONTRACTION_KERNEL -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DCURL_STATICLIB -DPLATFORM_LINUX -DENABLE_CURL_CLIENT -DOPENSSL_IS_BORINGSSL -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' -DCLANG_SUPPORT_DYN_ANNOTATION -iquote . 
-iquote bazel-out/host/bin -iquote external/com_google_absl -iquote bazel-out/host/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/host/bin/external/nsync -iquote external/eigen_archive -iquote bazel-out/host/bin/external/eigen_archive -iquote external/gif -iquote bazel-out/host/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/host/bin/external/libjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/host/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/host/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/host/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/host/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/host/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/host/bin/external/zlib -iquote external/local_config_cuda -iquote bazel-out/host/bin/external/local_config_cuda -iquote external/local_config_tensorrt -iquote bazel-out/host/bin/external/local_config_tensorrt -iquote external/double_conversion -iquote bazel-out/host/bin/external/double_conversion -iquote external/snappy -iquote bazel-out/host/bin/external/snappy -iquote external/curl -iquote bazel-out/host/bin/external/curl -iquote external/boringssl -iquote bazel-out/host/bin/external/boringssl -iquote external/jsoncpp_git -iquote bazel-out/host/bin/external/jsoncpp_git -iquote external/aws -iquote bazel-out/host/bin/external/aws -iquote external/aws-c-common -iquote bazel-out/host/bin/external/aws-c-common -iquote external/aws-c-event-stream -iquote bazel-out/host/bin/external/aws-c-event-stream -iquote external/aws-checksums -iquote bazel-out/host/bin/external/aws-checksums -iquote external/mkl_dnn -iquote bazel-out/host/bin/external/mkl_dnn -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual 
-Ibazel-out/host/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cudnn_header -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cublas_headers_virtual -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cusolver_headers_virtual -isystem external/nsync/public -isystem bazel-out/host/bin/external/nsync/public -isystem external/eigen_archive -isystem bazel-out/host/bin/external/eigen_archive -isystem external/gif -isystem bazel-out/host/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/host/bin/external/com_google_protobuf/src -isystem external/farmhash_archive/src -isystem bazel-out/host/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/host/bin/external/zlib -isystem external/local_config_cuda/cuda -isystem bazel-out/host/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cuda/include -isystem external/double_conversion -isystem bazel-out/host/bin/external/double_conversion -isystem external/curl/include -isystem bazel-out/host/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/host/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem bazel-out/host/bin/external/jsoncpp_git/include -isystem external/aws/aws-cpp-sdk-core/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-core/include -isystem external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-s3/include -isystem external/aws/aws-cpp-sdk-transfer/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-transfer/include -isystem external/aws-c-common/include -isystem bazel-out/host/bin/external/aws-c-common/include -isystem external/aws-c-event-stream/include -isystem bazel-out/host/bin/external/aws-c-event-stream/include 
-isystem external/aws-checksums/include -isystem bazel-out/host/bin/external/aws-checksums/include -isystem external/local_config_cuda/cuda/cublas/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cublas/include -isystem external/local_config_cuda/cuda/cusolver/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cusolver/include -isystem external/mkl_dnn/include -isystem bazel-out/host/bin/external/mkl_dnn/include -isystem external/mkl_dnn/src -isystem bazel-out/host/bin/external/mkl_dnn/src -isystem external/mkl_dnn/src/common -isystem bazel-out/host/bin/external/mkl_dnn/src/common -isystem external/mkl_dnn/src/cpu -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu -isystem external/mkl_dnn/src/cpu/gemm -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/gemm -isystem external/mkl_dnn/src/cpu/xbyak -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/xbyak -Wno-builtin-macro-redefined '-DDATE="redacted"' '-DTIMESTAMP="redacted"' '-DTIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 -w -Wno-sign-compare -g0 '-std=c++14' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions '-DGOOGLE_CUDA=1' '-DTENSORFLOW_USE_NVCC=1' -msse3 -pthread '-DGOOGLE_CUDA=1' -c tensorflow/core/kernels/linalg/matrix_square_root_op.cc -o bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.o) Execution platform: @local_execution_config_platform//:platform x86_64-linux-gnu-gcc-7: internal compiler error: Killed (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See file:///usr/share/doc/gcc-7/README.Bugs for instructions. 
Target //tensorflow/tools/pip_package:build_pip_package failed to build ERROR: /usr/local/src/tensorflow/tensorflow/tools/pip_package/BUILD:69:10 C++ compilation of rule '//tensorflow/core/kernels/linalg:matrix_square_root_op' failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command (cd /root/.cache/bazel/_bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/execroot/org_tensorflow && \ exec env - \ LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 \ PATH=/root/.cache/bazelisk/downloads/bazelbuild/bazel-3.7.2-linux-x86_64/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \ PWD=/proc/self/cwd \ external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.d '-frandom-seed=bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.o' -DTENSORFLOW_USE_CUSTOM_CONTRACTION_KERNEL -DTENSORFLOW_USE_MKLDNN_CONTRACTION_KERNEL -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DCURL_STATICLIB -DPLATFORM_LINUX -DENABLE_CURL_CLIENT -DOPENSSL_IS_BORINGSSL -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' -DCLANG_SUPPORT_DYN_ANNOTATION -iquote . 
-iquote bazel-out/host/bin -iquote external/com_google_absl -iquote bazel-out/host/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/host/bin/external/nsync -iquote external/eigen_archive -iquote bazel-out/host/bin/external/eigen_archive -iquote external/gif -iquote bazel-out/host/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/host/bin/external/libjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/host/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/host/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/host/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/host/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/host/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/host/bin/external/zlib -iquote external/local_config_cuda -iquote bazel-out/host/bin/external/local_config_cuda -iquote external/local_config_tensorrt -iquote bazel-out/host/bin/external/local_config_tensorrt -iquote external/double_conversion -iquote bazel-out/host/bin/external/double_conversion -iquote external/snappy -iquote bazel-out/host/bin/external/snappy -iquote external/curl -iquote bazel-out/host/bin/external/curl -iquote external/boringssl -iquote bazel-out/host/bin/external/boringssl -iquote external/jsoncpp_git -iquote bazel-out/host/bin/external/jsoncpp_git -iquote external/aws -iquote bazel-out/host/bin/external/aws -iquote external/aws-c-common -iquote bazel-out/host/bin/external/aws-c-common -iquote external/aws-c-event-stream -iquote bazel-out/host/bin/external/aws-c-event-stream -iquote external/aws-checksums -iquote bazel-out/host/bin/external/aws-checksums -iquote external/mkl_dnn -iquote bazel-out/host/bin/external/mkl_dnn -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual 
-Ibazel-out/host/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cudnn_header -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cublas_headers_virtual -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cusolver_headers_virtual -isystem external/nsync/public -isystem bazel-out/host/bin/external/nsync/public -isystem external/eigen_archive -isystem bazel-out/host/bin/external/eigen_archive -isystem external/gif -isystem bazel-out/host/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/host/bin/external/com_google_protobuf/src -isystem external/farmhash_archive/src -isystem bazel-out/host/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/host/bin/external/zlib -isystem external/local_config_cuda/cuda -isystem bazel-out/host/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cuda/include -isystem external/double_conversion -isystem bazel-out/host/bin/external/double_conversion -isystem external/curl/include -isystem bazel-out/host/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/host/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem bazel-out/host/bin/external/jsoncpp_git/include -isystem external/aws/aws-cpp-sdk-core/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-core/include -isystem external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-s3/include -isystem external/aws/aws-cpp-sdk-transfer/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-transfer/include -isystem external/aws-c-common/include -isystem bazel-out/host/bin/external/aws-c-common/include -isystem external/aws-c-event-stream/include -isystem bazel-out/host/bin/external/aws-c-event-stream/include 
-isystem external/aws-checksums/include -isystem bazel-out/host/bin/external/aws-checksums/include -isystem external/local_config_cuda/cuda/cublas/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cublas/include -isystem external/local_config_cuda/cuda/cusolver/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cusolver/include -isystem external/mkl_dnn/include -isystem bazel-out/host/bin/external/mkl_dnn/include -isystem external/mkl_dnn/src -isystem bazel-out/host/bin/external/mkl_dnn/src -isystem external/mkl_dnn/src/common -isystem bazel-out/host/bin/external/mkl_dnn/src/common -isystem external/mkl_dnn/src/cpu -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu -isystem external/mkl_dnn/src/cpu/gemm -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/gemm -isystem external/mkl_dnn/src/cpu/xbyak -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/xbyak -Wno-builtin-macro-redefined '-DDATE="redacted"' '-DTIMESTAMP="redacted"' '-DTIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 -w -Wno-sign-compare -g0 '-std=c++14' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions '-DGOOGLE_CUDA=1' '-DTENSORFLOW_USE_NVCC=1' -msse3 -pthread '-DGOOGLE_CUDA=1' -c tensorflow/core/kernels/linalg/matrix_square_root_op.cc -o bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.o) Execution platform: @local_execution_config_platform//:platform INFO: Elapsed time: 3964.637s, Critical Path: 1182.82s INFO: 19166 processes: 8754 internal, 10412 local. 
FAILED: Build did NOT complete successfully FAILED: Build did NOT complete successfully Command exited with non-zero status 1 0.30user 3.15system 1:06:07elapsed 0%CPU (0avgtext+0avgdata 12632maxresident)k 10141048inputs+40outputs (42403major+19178minor)pagefaults 0swaps

mmartial commented 3 years ago

The error seems to be: ERROR: /usr/local/src/tensorflow/tensorflow/tools/pip_package/BUILD:69:10 C++ compilation of rule '//tensorflow/core/kernels/linalg:matrix_square_root_op' failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command

The "crosstool_wrapper_driver_is_not_gcc failed" message seems strange to me. When I looked it up, some people thought their build was running low on memory, but I am not sure that applies here. You could try a lower number for nproc.
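For context, the "internal compiler error: Killed (program cc1plus)" line in the log is commonly a sign that the kernel's out-of-memory killer stopped the compiler, and the log shows 12 actions running at once. A minimal sketch of how one might pick a lower job count from the memory available follows. The function name `suggest_jobs` and the 2 GiB-per-job figure are assumptions for illustration, not measured values or anything defined by this repository. How the job count is passed to the build depends on the Makefile and is not shown here.

```python
import os
from typing import Optional


def suggest_jobs(available_gib: float,
                 per_job_gib: float = 2.0,
                 cpu_count: Optional[int] = None) -> int:
    """Return a parallel job count that fits in memory (never below 1).

    per_job_gib is a hypothetical per-compile memory budget; heavy
    TensorFlow translation units may need more.
    """
    if per_job_gib <= 0:
        raise ValueError("per_job_gib must be positive")
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # How many jobs fit if each one uses the assumed budget.
    by_memory = int(available_gib // per_job_gib)
    # Never exceed the core count, never go below one job.
    return max(1, min(cpus, by_memory))


if __name__ == "__main__":
    # 16 GB RAM + 2 GB swap and 12 cores, as reported in this thread.
    print(suggest_jobs(18.0, per_job_gib=2.0, cpu_count=12))  # prints 9
```

Capping by memory as well as by core count trades some build time for a lower risk of the compiler being killed partway through.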

Also, looking at your build version, we have a pre-built container with those parameters available on DockerHub. See https://github.com/datamachines/cuda_tensorflow_opencv/blob/master/Builds-DockerHub.md for details on the build parameters for 10.2_2.4.1_4.5.2-20210414

mmartial commented 3 years ago

I will note that I am encountering an error similar to yours (crosstool_wrapper_driver_is_not_gcc) when building TF 2.5.0 with CUDA 10.2. See https://github.com/tensorflow/tensorflow/issues/49983

Can you give me a few more details about your gcc version and the rest of your build environment? I wonder if a particular version of some software package might be causing this.

jmkfr06 commented 3 years ago

Thanks for your reply. I was able to build the image by increasing the amount of swap. My computer was running low on memory (16 GB RAM, 2 GB swap). I increased the swap to 8 GB.
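For anyone who wants to check their own machine before starting the roughly hour-long build, here is a minimal sketch that reads RAM and swap totals from /proc/meminfo on Linux. The helper names `parse_meminfo` and `ram_and_swap_gib` are hypothetical, and the sketch only reports the totals; it does not say how much memory is enough for any given TensorFlow version.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo keys to their numeric values (KiB for sized entries)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip malformed lines; keep only entries with a leading integer.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def ram_and_swap_gib(text: str) -> tuple:
    """Return (RAM, swap) totals in GiB from /proc/meminfo content."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 * 1024
    return (info.get("MemTotal", 0) / kib_per_gib,
            info.get("SwapTotal", 0) / kib_per_gib)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # Linux only
        ram, swap = ram_and_swap_gib(meminfo.read_text())
        print(f"RAM {ram:.1f} GiB, swap {swap:.1f} GiB, total {ram + swap:.1f} GiB")
```

If the machine's combined total is close to the 16 GB RAM plus 2 GB swap on which this build failed, adding swap or lowering the parallel job count before building is worth considering.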