Closed: manueldiaz96 closed this issue 5 years ago
This is the output of nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
It seems that even though I installed the cudatoolkit-9.2 conda package (which came with cudnn-7.3.1), which I thought would resolve the problem according to this, the server only has CUDA 9.0 installed, and I don't think I have privileges to change the CUDA version.
Does anyone know if what I am thinking is correct or not?
Thanks!
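One quick way to confirm this kind of mismatch is to parse the toolkit version out of the nvcc -V output and compare it against the conda cudatoolkit package version. A minimal sketch, using the output pasted above as a literal string (in a real check you would capture it with subprocess; the conda_cudatoolkit value below is the package version mentioned in this comment):

```python
import re

# The nvcc -V output pasted above; in practice capture it with
# subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout
nvcc_output = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176"""

conda_cudatoolkit = "9.2"  # version of the cudatoolkit conda package installed

# Extract the release number ("9.0") from the last line of the output
match = re.search(r"release (\d+\.\d+)", nvcc_output)
system_cuda = match.group(1)

print(system_cuda)                       # 9.0
print(system_cuda == conda_cudatoolkit)  # False -> the mismatch described above
```

If the two versions disagree, extensions will be compiled by the system nvcc, not the toolkit conda provides, which matches the behavior described in this comment.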
Found a way to compile it using CUDA 9.2 and gcc 6.4.0. Output of gcc --version; nvcc -V:
gcc (GCC) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Wed_Apr_11_23:16:29_CDT_2018
Cuda compilation tools, release 9.2, V9.2.88
But when I try the Inference in a few lines example from the demo directory, this happens when I run from predictor import COCODemo:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/demo/predictor.py", line 6, in <module>
from maskrcnn_benchmark.modeling.detector import build_detection_model
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/__init__.py", line 2, in <module>
from .detectors import build_detection_model
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/detectors.py", line 2, in <module>
from .generalized_rcnn import GeneralizedRCNN
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 11, in <module>
from ..backbone import build_backbone
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/backbone/__init__.py", line 2, in <module>
from .backbone import build_backbone
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/backbone/backbone.py", line 7, in <module>
from maskrcnn_benchmark.modeling.make_layers import conv_with_kaiming_uniform
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/modeling/make_layers.py", line 10, in <module>
from maskrcnn_benchmark.layers import Conv2d
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/layers/__init__.py", line 10, in <module>
from .nms import nms
File "/home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/layers/nms.py", line 3, in <module>
from maskrcnn_benchmark import _C
ImportError: /home/mdiaz/Internship/mrcnn/maskrcnn-benchmark/maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs
Does anyone know why this ImportError (undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs) comes up if I am using CUDA 9.2 with gcc 6.4?
I will delete the whole repo and try to recompile.
I've deleted the repo, re-downloaded it, and recompiled, and I still get the same ImportError.
According to the posts on #226, I tried importing torch first, but nothing changed.
The thing is that I've been using the CPU-compiled maskrcnn_benchmark for a while now and have had no problems at all; I can use it in Spyder without issue.
And I know there isn't any problem between PyTorch and CUDA 9.2, because I've been able to run other models without trouble.
Does anyone know why this might be?
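For what it's worth, the mangled name in the error can be decoded, and it points at a PyTorch/extension build mismatch rather than a CUDA or gcc problem. Demangled (e.g. with c++filt), _ZN3c105ErrorC1ENS_14SourceLocationERKSs is a constructor of c10::Error taking a std::string const&, and the trailing "Ss" is the Itanium-ABI abbreviation for the pre-C++11 std::string. A minimal sketch of that reading (plain string inspection, no demangler required; the interpretation in the comments is an assumption about what the symbol implies):

```python
# Hedged sketch: inspect the mangled name from the ImportError above.
# In the Itanium C++ ABI, "3c10" and "5Error" are length-prefixed name
# components (c10::Error), and the trailing "Ss" abbreviates the pre-C++11
# std::string; the new-ABI string would be mangled via "St7__cxx11" instead.
symbol = "_ZN3c105ErrorC1ENS_14SourceLocationERKSs"

refers_to_c10_error = "3c105Error" in symbol  # symbol lives in c10::Error
old_string_abi = symbol.endswith("Ss")        # pre-C++11 std::string parameter

print(refers_to_c10_error)  # True
print(old_string_abi)       # True -> extension and libtorch likely mismatched
```

If the extension references a c10 symbol that the installed libtorch does not export, the usual causes are compiling against one PyTorch version (or _GLIBCXX_USE_CXX11_ABI setting) and importing under another, which is consistent with the version-pinning fix discussed below.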
Using this answer: https://github.com/facebookresearch/maskrcnn-benchmark/issues/223#issuecomment-497991708, I was able to compile with this configuration:
CUDA: V10.0.130
PyTorch: 1.1.0
torchvision: 0.2.2
@manueldiaz96 did you mean to link to #223?
@cinjon no, I mean the comment where @futureisatyourhand describes the packages used. It was thanks to the information in that comment that I was able to compile.
Sorry for the trouble. We need to make sure the pre-compiled torchvision binaries do not downgrade PyTorch nightly.
❓ Questions and Help
Currently I am trying to install maskrcnn-benchmark on my space in a server where I want to do some tests. I was able to compile the repo on my machine using the CPU and everything went ok.
First I checked if python could see CUDA with
python -c "import torch;from torch.utils.cpp_extension import CUDA_HOME; print(CUDA_HOME); print(torch.cuda.is_available())"
and got the expected output, so I know CUDA is seen by Python. But when I try to compile it with
python setup.py build develop
the build fails. The server has gcc version 6.3.0 20170516. Also, I have been able to run PyTorch training and inference scripts that use the GPUs on this same server, so I think the problem might have something to do with maskrcnn_benchmark itself. Thanks!
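One toolchain detail worth checking here: nvcc enforces a maximum supported host gcc version (roughly gcc <= 6 for CUDA 9.0 and gcc <= 7 for CUDA 9.2, per NVIDIA's support matrix), so parsing the reported gcc version is a quick way to rule the compiler in or out. A minimal sketch, using the version string reported above:

```python
# Hedged sketch: nvcc rejects host compilers newer than its supported range
# (approximately: CUDA 9.0 supports gcc <= 6, CUDA 9.2 supports gcc <= 7).
gcc_version = "6.3.0 20170516"  # version reported by the server above
gcc_major = int(gcc_version.split(".")[0])

print(gcc_major)       # 6
print(gcc_major <= 6)  # True -> within CUDA 9.0's supported host compilers
```

Since gcc 6.3 is within range for CUDA 9.0, the compiler itself is unlikely to be the blocker here, which is consistent with the later finding that the failure was a PyTorch/extension version mismatch.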