marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

Installation errors - CMake Error at marian_cuda_generated_prod.cu.o.Release.cmake #392

Closed AmitMY closed 2 years ago

AmitMY commented 2 years ago

Bug description

Marian fails to compile.

Context

While trying to install mozilla/firefox-translations-training, I get Marian compilation errors.

GPU: RTX 2080Ti, GTX 1080 (tried both)

How to reproduce

Here's an exact Dockerfile:

FROM nvidia/cuda:11.2.0-devel-ubuntu18.04

RUN apt-get update; exit 0 # For some reason, fails the first time, but is necessary
RUN apt-get update
RUN apt-get install -y git

RUN git clone https://github.com/mozilla/firefox-translations-training.git
WORKDIR firefox-translations-training

RUN chmod +x ./pipeline/setup/ -R
RUN ./pipeline/setup/install-deps.sh

RUN make conda
RUN make snakemake
RUN make git-modules

RUN apt-get install nvidia-cuda-toolkit -y # Marian requires this toolkit
RUN make dry-run

# Marian/NCCL is expecting nvcc in specific location
RUN mkdir /usr/local/cuda/bin/ && ln -s /usr/bin/nvcc /usr/local/cuda/bin/nvcc
# NCCL: unsupported GNU version! gcc versions later than 6 are not supported!
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 60 --slave /usr/bin/g++ g++ /usr/bin/g++-6
# Downloads and compiles additional packages
RUN make test; make test; exit 0

You can build and run it. If you change anything in the running container while debugging, running make test again should recompile Marian.

Logs

cat /data/logs/ru-en/test/compile-marian-dev.log

```
[ 70%] Building CXX object src/3rd_party/sentencepiece/src/CMakeFiles/spm_normalize.dir/spm_normalize_main.cc.o
[ 70%] Building CXX object src/3rd_party/sentencepiece/src/CMakeFiles/spm_train.dir/spm_train_main.cc.o
[ 70%] Building CXX object src/3rd_party/sentencepiece/src/CMakeFiles/spm_decode.dir/spm_decode_main.cc.o
[ 71%] Building CXX object src/3rd_party/sentencepiece/src/CMakeFiles/spm_encode.dir/spm_encode_main.cc.o
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Built target fbgemm
make[3]: Entering directory '/firefox-translations-training/3rd_party/marian-dev/build'
Consolidate compiler generated dependencies of target spm_export_vocab
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
make[3]: Entering directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Building CXX object src/3rd_party/sentencepiece/src/CMakeFiles/spm_export_vocab.dir/spm_export_vocab_main.cc.o
[ 71%] Linking CXX executable ../../../../spm_train
[ 71%] Linking CXX executable ../../../../spm_export_vocab
[ 71%] Linking CXX executable ../../../../spm_normalize
[ 71%] Linking CXX executable ../../../../spm_decode
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Built target spm_export_vocab
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Built target spm_train
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Built target spm_normalize
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
[ 71%] Built target spm_decode
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
[ 71%] Linking CXX executable ../../../../spm_encode
1 error detected in the compilation of "/tmp/tmpxft_00008670_00000000-8_topk.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_topk.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_topk.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:1830: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_topk.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_topk.cu.o] Error 1
make[3]: *** Waiting for unfinished jobs....
/firefox-translations-training/3rd_party/marian-dev/src/functional/operators.h(586): error: no operator "/" matches these operands
            operand types are: __half2 / __half2
make[3]: Leaving directory '/firefox-translations-training/3rd_party/marian-dev/build'
1 error detected in the compilation of "/tmp/tmpxft_000086a3_00000000-8_algorithm.compute_61.cpp1.ii".
[ 71%] Built target spm_encode
CMake Error at marian_cuda_generated_algorithm.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_algorithm.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:846: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_algorithm.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_algorithm.cu.o] Error 1
1 error detected in the compilation of "/tmp/tmpxft_00008696_00000000-8_add.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_add.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_add.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:2639: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_add.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_add.cu.o] Error 1
1 error detected in the compilation of "/tmp/tmpxft_0000867c_00000000-8_element.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_element.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_element.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:2226: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_element.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_element.cu.o] Error 1
1 error detected in the compilation of "/tmp/tmpxft_00008689_00000000-8_add_all.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_add_all.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_add_all.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:3055: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_add_all.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_add_all.cu.o] Error 1
1 error detected in the compilation of "/tmp/tmpxft_000086d8_00000000-8_tensor_operators.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_tensor_operators.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_tensor_operators.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:3646: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_tensor_operators.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_tensor_operators.cu.o] Error 1
1 error detected in the compilation of "/tmp/tmpxft_000086b8_00000000-8_helpers.compute_61.cpp1.ii".
CMake Error at marian_cuda_generated_helpers.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/translator/./marian_cuda_generated_helpers.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:5012: recipe for target 'src/CMakeFiles/marian_cuda.dir/translator/marian_cuda_generated_helpers.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/translator/marian_cuda_generated_helpers.cu.o] Error 1
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(256): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(258): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
2 errors detected in the compilation of "/tmp/tmpxft_00008667_00000000-4_device.cpp4.ii".
CMake Error at marian_cuda_generated_device.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_device.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:435: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_device.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_device.cu.o] Error 1
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(256): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(258): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
2 errors detected in the compilation of "/tmp/tmpxft_00008687_00000000-4_cudnn_wrappers.cpp4.ii".
CMake Error at marian_cuda_generated_cudnn_wrappers.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_cudnn_wrappers.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:4032: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_cudnn_wrappers.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_cudnn_wrappers.cu.o] Error 1
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(256): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(258): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
2 errors detected in the compilation of "/tmp/tmpxft_0000865c_00000000-4_nth_element.cpp4.ii".
CMake Error at marian_cuda_generated_nth_element.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/translator/./marian_cuda_generated_nth_element.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:4419: recipe for target 'src/CMakeFiles/marian_cuda.dir/translator/marian_cuda_generated_nth_element.cu.o' failed
make[3]: *** [src/CMakeFiles/marian_cuda.dir/translator/marian_cuda_generated_nth_element.cu.o] Error 1
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(256): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
/firefox-translations-training/3rd_party/marian-dev/src/common/types.h(258): error: calling a __device__ function("operator float") from a __host__ function("operator<<") is not allowed
2 errors detected in the compilation of "/tmp/tmpxft_000086c0_00000000-4_prod.cpp4.ii".
CMake Error at marian_cuda_generated_prod.cu.o.Release.cmake:276 (message):
  Error generating file
  /firefox-translations-training/3rd_party/marian-dev/build/src/CMakeFiles/marian_cuda.dir/tensors/gpu/./marian_cuda_generated_prod.cu.o
src/CMakeFiles/marian_cuda.dir/build.make:1419: recipe for target 'src/CMakeFiles/marian_cuda.dir/tensors/gpu/marian_cuda_generated_prod.cu.o' failed
```

AmitMY commented 2 years ago

I was using docker rather than nvidia-docker, which caused this.
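
For anyone hitting the same thing, a quick way to tell the two setups apart before starting a long build might look like the sketch below. It assumes that a container started through nvidia-docker exposes a working `nvidia-smi`, while one started with plain docker does not. The `gpu_visible` helper and the `GPU_PROBE` variable are hypothetical names used for illustration only; they are not part of marian or firefox-translations-training.

```shell
#!/bin/sh
# Hypothetical pre-build check, not part of marian or firefox-translations-training.
# GPU_PROBE is an illustrative override; by default the probe command is nvidia-smi.

gpu_visible() {
  # "nvidia-smi -L" lists the GPUs it can see. The call fails when the binary
  # is missing or when it cannot reach the NVIDIA driver.
  "${GPU_PROBE:-nvidia-smi}" -L >/dev/null 2>&1
}

if gpu_visible; then
  echo "GPU visible in this container"
else
  echo "No GPU visible: was the container started with nvidia-docker?" >&2
fi
```

Running this as the first step inside the container gives a quick signal about which runtime is in use, so the problem shows up before the compile logs do.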