Open anatlin opened 6 years ago
Any updates on the issue ?
Please build caffe2 from source. It works on an AWS instance when caffe2 is built from source.
System information
Operating system: Ubuntu 16.04
CUDA version: Cuda compilation tools, release 8.0, V8.0.44
cuDNN version: cudnn-8.0-linux-x64-v7
GPU models (for all devices if they are not all the same): Geforce 1060
PYTHONPATH environment variable: Anaconda2.7
caffe2 binary was installed using: conda install -c caffe2 caffe2-cuda8.0-cudnn7
Detectron$ python2 -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"
Success
Detectron$ python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
1
export PATH=/usr/local/cuda-8.0/bin:$PATH
echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64:/home/majid/softwares/cudnn/8.0-7.1/lib64
@rbgirshick I just experienced the same issue
when I run
python2 $DETECTRON/tests/test_spatial_narrow_as_op.py
I get the following error:
RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true
after running
python2 tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo
python2 tools/train_net.py --cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml OUTPUT_DIR /tmp/detectron-output
I get the following error:
RuntimeError: [enforce fail at context_gpu.h:155] . Encountered CUDA error: invalid device function Error from operator:
input: "gpu_0/conv1" input: "gpu_0/res_conv1_bn_s" input: "gpu_0/res_conv1_bn_b" output: "gpu_0/conv1" name: "" type: "AffineChannel" device_option { device_type: 1 cuda_gpu_id: 0 }
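For anyone comparing failures like the two above across machines, it can help to pull the enforce location, the CUDA error text, and the operator type out of the message. This is only a minimal sketch, assuming the message keeps the layout shown in this thread; `summarize_caffe2_error` is a hypothetical helper name, not part of Caffe2 or Detectron.

```python
import re


def summarize_caffe2_error(message):
    """Return (source location, CUDA error text, operator type) from an error message.

    Hypothetical helper: it assumes the message layout quoted in this thread.
    Any field that cannot be found is returned as None.
    """
    # e.g. "enforce fail at context_gpu.h:155"
    location = re.search(r"enforce fail at ([\w./]+:\d+)", message)
    # Text between "Encountered CUDA error:" and "Error from operator" (or end of text).
    cuda_error = re.search(
        r"Encountered CUDA error: (.+?)\s*(?:Error from operator|$)", message, re.S
    )
    # The operator's own 'type: "..."' field; \b keeps "device_type" from matching.
    op_type = re.search(r'\btype: "([^"]+)"', message)
    return (
        location.group(1) if location else None,
        cuda_error.group(1).strip() if cuda_error else None,
        op_type.group(1) if op_type else None,
    )
```

Passing the text of the caught `RuntimeError` (for example `str(e)`) gives a short tuple that is easier to compare between reports than the full operator dump.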
I have also gotten around this error by building Caffe2 from source
@mfe7, I was able to compile caffe2 from source after many desperate attempts. In the end the solution was not that complicated. I had been using virtualenv and compiling everything locally. Once I installed every package, including caffe2, with sudo permission on Ubuntu, it worked like a charm, and I was able to train on my own custom dataset with amazing results. I am now trying to compile it on another machine where I have no sudo permission. If I manage that, I will try to post an update here. I am also preparing a bash script that you can run easily if you have sudo permission.
I compiled caffe2 from source without sudo, and the error disappeared. https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile
I was encountering a variant of this issue when using the unsupported Python 3 fork. However, I found that I didn't have to build caffe2 from source; installing an older version was enough:
conda install -c caffe2 caffe2-cuda8.0-cudnn7=0.8.dev=py36_2018.05.14
hope this helps someone 😄
Same error:
RuntimeError: [enforce fail at context_gpu.h:181] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
But I use Python 2.7, so I installed the old version built for it, and the problem was solved.
conda remove caffe2-cuda8.0-cudnn7
conda install -c caffe2 caffe2-cuda8.0-cudnn7=0.8.dev=py27_2018.05.14
Thanks to @rowanz for the reply.
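The two pinned installs quoted in this thread differ only in the Python tag of the build string. Here is a rough sketch that picks the matching spec for a given interpreter version. It assumes only the two builds mentioned above (py27 and py36, dated 2018.05.14); whether other builds exist is not known from this thread, and `pinned_caffe2_spec` is a hypothetical helper name.

```python
import sys

# Build strings quoted in this thread; no other builds are assumed to exist.
KNOWN_BUILDS = {
    (2, 7): "py27_2018.05.14",
    (3, 6): "py36_2018.05.14",
}


def pinned_caffe2_spec(version=None):
    """Return the conda spec of the older caffe2 build for a Python version.

    `version` is a tuple such as (2, 7); it defaults to the running interpreter.
    Raises ValueError when the thread mentions no build for that version.
    """
    major, minor = (version or sys.version_info)[:2]
    build = KNOWN_BUILDS.get((major, minor))
    if build is None:
        raise ValueError("no pinned build known for Python %d.%d" % (major, minor))
    return "caffe2-cuda8.0-cudnn7=0.8.dev=" + build


# Example: print the install command for Python 2.7 instead of running it.
print("conda install -c caffe2 " + pinned_caffe2_spec((2, 7)))
```

Printing the command rather than executing it keeps the sketch from changing an environment by accident. As in the comment above, remove the currently installed package first.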
Expected results
This test to pass.
Actual results
Detailed steps to reproduce
System information
PYTHONPATH environment variable: ?
python --version output: Python 2.7.14 :: Anaconda, Inc.