jzbontar / mc-cnn

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
BSD 2-Clause "Simplified" License

Invalid device function #47

Open ranjeethks opened 6 years ago

ranjeethks commented 6 years ago

Hi, I'm using Torch7, OpenCV3, png++ (0.2.9), and libpng1.6 on Ubuntu 16.04.

I get the following error when I run ./main.lua kitti fast -a predict -net_fname net/net_kittifast-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70

kitti fast -a predict -net_fname net/net_kittifast-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70
luajit: /home/ubuntu/mc-cnn/Normalize2.lua:11: invalid device function
stack traceback:
    [C]: in function 'Normalize_forward'
    /home/ubuntu/mc-cnn/Normalize2.lua:11: in function 'updateOutput'
    ./main.lua:911: in function 'forward_free'
    ./main.lua:945: in function 'stereo_predict'
    ./main.lua:1101: in main chunk
    [C]: at 0x00405d50

What could be the issue?

LUCASLLA commented 6 years ago

Hi, I have the same problem. Have you found a solution? Thanks.

hassanisaadi commented 6 years ago

Any luck with this issue? I have the same problem.

LUCASLLA commented 6 years ago

Not yet. I'm trying to reproduce some results from the KITTI Vision Benchmark (http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo), and I get this same 'invalid device function' error with two methods: MC-CNN-acrt (the same error as ranjeethks) and L-ResMatch.

In L-ResMatch, when I run:

scripts/preprocess_kitti.lua -color rgb -storage storage

I get this error:

luajit: scripts/preprocess_kitti.lua:113: invalid device function
stack traceback:
    [C]: in function 'remove_nonvisible'
    scripts/preprocess_kitti.lua:113: in main chunk
    [C]: at 0x00405d50

hassanisaadi commented 6 years ago

I think my problem is related to memory and/or versions. How can we find out which CUDA and Torch versions the author used for this code?

hassanisaadi commented 6 years ago

I solved my problem. Actually, since I use a cluster, I was not submitting my job to the cluster. That's why I got this problem.

LUCASLLA commented 6 years ago

I solved my problem too.

The problem was that the Makefiles of both projects were not set to the CUDA compute capability of my GPU.

My GPU is a Quadro K1100M, which has CUDA compute capability 3.0, so I had to change the parameter sm_35 to sm_30 in the Makefiles of both projects (sm_35 means CUDA compute capability 3.5, sm_30 means 3.0, and so on).

A table of the CUDA compute capability of each GPU can be found here: https://developer.nvidia.com/cuda-gpus
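
As a rough illustration of the naming convention above, the sm_XX value is the compute capability with the dot removed. The Python sketch below is not part of either project, and the function name arch_flag is made up for this example:

```python
def arch_flag(compute_capability: str) -> str:
    """Turn a compute capability such as "3.0" into an arch name such as "sm_30"."""
    # Expect exactly "<major>.<minor>"; anything else raises ValueError.
    major, minor = compute_capability.strip().split(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"not a compute capability: {compute_capability!r}")
    return f"sm_{int(major)}{int(minor)}"


# The Quadro K1100M mentioned above has compute capability 3.0.
print(arch_flag("3.0"))
```

The value to pass in is whatever the NVIDIA table lists for your card.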

To be more precise, these are the lines I changed in the Makefiles of the two projects:

---------- In L-ResMatch project:

libadcensus.so: src/adcensus.cu
	$(CUDA)/bin/nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libadcensus.so --shared src/adcensus.cu $(CFLAGS) $(LDFLAGS_NVCC)

libcuresmatch.so: src/curesmatch.cu
	$(CUDA)/bin/nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libcuresmatch.so --shared src/curesmatch.cu $(CFLAGS) $(LDFLAGS_NVCC)

---------- In MC-CNN-acrt project:

libadcensus.so: adcensus.cu SpatialLogSoftMax.cu
	nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libadcensus.so --shared adcensus.cu $(CFLAGS) $(LDFLAGS_NVCC)

Just change sm_xx to the value that matches your video card. After that, it worked perfectly. (I used CUDA 8 for this.) =]
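
For anyone who would rather script the edit than change each rule by hand, here is a minimal Python sketch of the same substitution. It is only an illustration: the helper name set_arch is hypothetical, and it assumes the flag appears as "-arch sm_NN" or "-arch=sm_NN", as in the rules quoted above.

```python
import re


def set_arch(makefile_text: str, new_arch: str) -> str:
    """Replace every "-arch sm_NN" (or "-arch=sm_NN") with the given arch name."""
    # Group 1 keeps the original "-arch " or "-arch=" prefix untouched.
    pattern = r"(-arch[ =]+)sm_\d+"
    return re.sub(pattern, lambda m: m.group(1) + new_arch, makefile_text)


# Shortened version of the MC-CNN rule quoted above, used only as sample input.
rule = "nvcc -arch sm_35 -O3 -DNDEBUG -o libadcensus.so --shared adcensus.cu"
print(set_arch(rule, "sm_30"))
```

To apply it to a real Makefile, read the file's text, pass it through set_arch, and write the result back. The shared library then has to be rebuilt for the change to take effect.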