chengyangfu / retinamask

RetinaMask
MIT License

Issue with the build process #5

Closed antran89 closed 5 years ago

antran89 commented 5 years ago

🐛 Bug

I am trying to build the code inside an NVIDIA Docker container, and the build fails with the following error:

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -m64 -fPIC -m64 -fPIC -fPIC -DWITH_CUDA -I/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc -I/opt/conda/lib/python3.6/site-packages/torch/lib/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.6m -c /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.6/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc -I/opt/conda/lib/python3.6/site-packages/torch/lib/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.6m -c /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu -o build/temp.linux-x86_64-3.6/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
In file included from /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu:6:0:
/opt/conda/lib/python3.6/site-packages/torch/lib/include/ATen/cuda/CUDAContext.h:12:22: fatal error: cusparse.h: No such file or directory
compilation terminated.
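
The fatal error says that nvcc could not find cusparse.h on any of the include paths passed with -I. One possible cause, not confirmed in this thread, is that the container image does not ship the cuSPARSE development headers. The sketch below checks whether the header exists under the CUDA include directory. It assumes the toolkit lives at /usr/local/cuda (taken from the -I/usr/local/cuda/include flag in the log above) unless the CUDA_HOME environment variable says otherwise; find_header is a hypothetical helper name, not part of this repository or of PyTorch.

```python
import os


def find_header(name, include_dirs):
    """Return the full path of `name` in every directory of include_dirs that contains it."""
    return [
        os.path.join(d, name)
        for d in include_dirs
        if os.path.isfile(os.path.join(d, name))
    ]


if __name__ == "__main__":
    # Assumption: the CUDA toolkit is under /usr/local/cuda, as in the build log,
    # unless CUDA_HOME points somewhere else.
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    hits = find_header("cusparse.h", [os.path.join(cuda_home, "include")])
    if hits:
        print("found:", hits)
    else:
        print("cusparse.h not found under", cuda_home)
```

If the header is reported as missing, the build cannot succeed until the include directory contains it or the correct include path is passed to the compiler.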

To Reproduce

Steps to reproduce the behavior:

  1. python setup.py build develop

Expected behavior

The C++ and CUDA extensions should compile without errors.

Environment

Please copy and paste the output from the environment collection script from PyTorch (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

Collecting environment information...
PyTorch version: 1.0.1.post2
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: GPU 0: Tesla V100-SXM2-16GB
Nvidia driver version: 390.46
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.2.1

Versions of relevant libraries:
[pip] numpy==1.13.3
[pip] torch==1.0.1.post2
[pip] torchsummary==1.5.1
[pip] torchvision==0.2.2.post3
[conda] torch 1.0.1.post2
[conda] torchsummary 1.5.1
[conda] torchvision 0.2.2.post3


chengyangfu commented 5 years ago

Hi, the error appears to occur while compiling SigmoidFocalLoss_cuda.cu. That is odd, because I don't use any special library there. Are you able to build the original upstream project, [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark)?

antran89 commented 5 years ago

Thanks @chengyangfu. I solved the issue.