Closed geekvc closed 8 years ago
I have not encountered this issue before either. It seems there is a problem with your GPU architecture and driver. I found a similar problem report at the following link:
https://bitbucket.org/rodrigob/doppia/issues/85/opencv-error-gpu-api-call-invalid-device
Perhaps you could try this.
Thank you very much! I tried the code and got this:
CUDA Device Query...
There are 4 CUDA devices.
CUDA Device #0
Major revision number: 3
Minor revision number: 5
Name: Tesla K40c
Total global memory: 4294770688
Total shared memory per block: 49152
Total registers per block: 65536
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 2147483647
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 875500
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 15
Kernel execution timeout: No
CUDA Device #1
Major revision number: 3
Minor revision number: 5
Name: Tesla K40c
Total global memory: 4294770688
Total shared memory per block: 49152
Total registers per block: 65536
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 2147483647
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 875500
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 15
Kernel execution timeout: No
CUDA Device #2
Major revision number: 3
Minor revision number: 5
Name: Tesla K40c
Total global memory: 4294770688
Total shared memory per block: 49152
Total registers per block: 65536
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 2147483647
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 875500
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 15
Kernel execution timeout: No
CUDA Device #3
Major revision number: 3
Minor revision number: 5
Name: Tesla K40c
Total global memory: 4294770688
Total shared memory per block: 49152
Total registers per block: 65536
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 2147483647
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 875500
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 15
Kernel execution timeout: No
I then added this line to CMakeLists.txt:
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_35 -code sm_35)
and ran make clean and make, but the error is still the same as above.
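For readers matching the flag to their own card: the compute_XY / sm_XY suffix is just the major and minor revision numbers from the device query output concatenated. The sketch below is only an illustration of that mapping; the helper names `nvcc_arch_flags` and `capabilities_from_query` are made up for this example and are not part of dense_flow, OpenCV, or CUDA.

```python
import re
from typing import List, Tuple


def nvcc_arch_flags(major: int, minor: int) -> str:
    """Build the -arch/-code pair used in this thread for one compute capability."""
    cc = f"{major}{minor}"
    return f"-arch compute_{cc} -code sm_{cc}"


def capabilities_from_query(text: str) -> List[Tuple[int, int]]:
    """Extract (major, minor) pairs from device query output like the listing above.

    Assumes every device block prints both a 'Major revision number' line
    and a 'Minor revision number' line, in that order.
    """
    majors = re.findall(r"Major revision number:\s*(\d+)", text)
    minors = re.findall(r"Minor revision number:\s*(\d+)", text)
    return [(int(a), int(b)) for a, b in zip(majors, minors)]


# A shortened copy of one device block from the output above.
sample = """CUDA Device #0
Major revision number: 3
Minor revision number: 5
Name: Tesla K40c
"""

for major, minor in capabilities_from_query(sample):
    print(nvcc_arch_flags(major, minor))  # -arch compute_35 -code sm_35
```

For a card that reports major 6 and minor 1, the same helper gives -arch compute_61 -code sm_61.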
Maybe you could find other solutions on Google. I did not encounter this problem before.
Thank you all the same! I am trying it on another type of GPU.
On the Tesla K20, the error disappeared. Thank you.
I solved this problem by reinstalling CUDA from the xxx.deb file.
@KnightOfTheMoonlight I am also hitting the same issue:
OpenCV Error: Gpu API call (invalid device function) in call, file /home/fzy/install/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp, line 361
terminate called after throwing an instance of 'cv::Exception'
what(): /home/fzy/install/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp:361: error: (-217) invalid device function in function call
How can I solve this without changing the GPU? I use CUDA 8.0 + OpenCV 2.4.10.
I solved this problem the way @geekvc described. Add this line to the CMakeLists.txt file:
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_52 -code sm_52)
because my compute capability is 5.2. Then delete the generated build files by running
rm -r ./build
to ensure no CMake cache file remains (it did not work for @geekvc, maybe because not all of the CMake cache files were deleted). Then run
make
sudo make install
and it works!
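The important part of the steps above is that the old build directory is removed before rebuilding, so that a stale CMake cache cannot keep the previous flags. A minimal sketch of that sequence follows, assuming the build directory is ./build under the project root and that the project is configured out of source with cmake .. (the thread only shows rm -r ./build and make, so the cmake step is an assumption). The function name `clean_rebuild` is hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def clean_rebuild(project_dir, run=subprocess.run):
    """Delete the build directory so no CMake cache survives, then rebuild.

    `run` is injectable so the command sequence can be inspected without
    actually invoking cmake or make.
    """
    build = Path(project_dir) / "build"
    if build.exists():
        shutil.rmtree(build)  # same effect as: rm -r ./build
    build.mkdir(parents=True)
    # Assumed out-of-source configure step, followed by the build itself.
    for cmd in (["cmake", ".."], ["make"]):
        run(cmd, cwd=build, check=True)
    return build
```

Calling `clean_rebuild(".")` from the project root would run the two commands for real. The sudo make install step from the post is left out here on purpose, since it needs elevated privileges.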
I encountered a similar problem:
OpenCV Error: Gpu API call (unknown error) in mallocPitch, file /data1/temporal-segment-networks-master/3rd-party/opencv-2.4.13/modules/dynamicuda/include/opencv2/dynamicuda/dynamicuda.hpp, line 1134
terminate called after throwing an instance of 'cv::Exception'
what(): /data1/temporal-segment-networks-master/3rd-party/opencv-2.4.13/modules/dynamicuda/include/opencv2/dynamicuda/dynamicuda.hpp:1134: error: (-217) unknown error in function mallocPitch
Although I deleted the build folder and added the line
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_37 -code sm_37)
to the CMakeLists.txt file, the problem is still there.
Has anyone solved this without reinstalling CUDA?
@KnightOfTheMoonlight Which .deb file did you use? Could you describe it in more detail? Thank you very much.
I used the tool to extract test images from my test.avi, following the usage:
./denseFlow_gpu -f test.avi -x tmp/flow_x -y tmp/flow_x -i tmp/image -b 20 -t 1 -d 0 -s 1
and got this error:
OpenCV Error: Gpu API call (invalid device function) in call, file /home/uuz/Downloads/opencv/Install-OpenCV-master/Ubuntu/2.4/OpenCV/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp, line 361
terminate called after throwing an instance of 'cv::Exception'
what(): /home/uuz/Downloads/opencv/Install-OpenCV-master/Ubuntu/2.4/OpenCV/opencv-2.4.10/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp:361: error: (-217) invalid device function in function call
[1] 16872 abort (core dumped) ./denseFlow_gpu -f test.avi -x tmp/flow_x -y tmp/flow_x -i tmp/image -b 20 -t
After that I added
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_35 -code sm_35)
to CMakeLists.txt and ran make again, but the error is the same as above. I do not know how to solve it. Thank you in advance!
I have hit the same problem. How can I solve it? My device is a GeForce GTX 1080 Ti/PCIe/SSE2 with CUDA 9.0.
I added "set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch compute_61 -code sm_61)" to CMakeLists.txt.
@pengxiaoxiao Did you manage to solve this? I am running into the same problem now, with CUDA 9.
Reinstall CUDA 8.
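@pengxiaoxiao But can two CUDA versions be installed on one Ubuntu system? A lot of things were installed on this machine earlier, and CUDA 9 cannot be uninstalled.
Probably not!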
@pengxiaoxiao OK, thank you.
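This is a compute capability mismatch. Before anything is set, OpenCV's cmake output looks like this, with compute capabilities 30 35 37:
NVIDIA CUDA
Use CUFFT: YES
Use CUBLAS: YES
USE NVCUVID:NO
NVIDIA GPU arch: 30 35 37
NVIDIA PTX archs:
Use fast math:NO
If the setting is applied successfully, the cmake output shows the following (my graphics card is a 1080 Ti):
NVIDIA CUDA
Use CUFFT: YES
Use CUBLAS: YES
USE NVCUVID:NO
NVIDIA GPU arch: 61
NVIDIA PTX archs:61
Use fast math:NO
Both GPU arch and PTX archs are now set to 6.1. If you are unlucky, however, adding the compile option does not solve the problem. In that case you need to modify the part of OpenCV's configuration that deals with CUDA compute capability, in ./cmake/OpenCVDetectCUDA.cmake. Before the lines
set(CUDA_ARCH_BIN ${cuda_arch_bin} CACHE STRING "Specify 'real' GPU architectures to build binaries for, BIN(PTX) format is supported")
set(CUDA_ARCH_PTX ${cuda_arch_ptx} CACHE STRING "Specify 'virtual' PTX architectures to build PTX intermediate code for")
add
set(cuda_arch_bin "6.1")
set(cuda_arch_ptx "6.1")
After saving, run cmake, make, and make install for OpenCV again; the cmake summary shown above should now report the correct compute capability, 61.
Finally, recompile dense_flow.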
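To see why a build with GPU arch 30 35 37 and no PTX archs fails on a 6.1 card, recall CUDA's compatibility rules: binary code built for capability X.y runs only on devices of capability X.z with z >= y, while PTX built for a given capability can be JIT-compiled on any device of equal or higher capability. The sketch below checks a device against the two summary lines under those rules. The function names `parse_archs` and `device_can_run` are hypothetical helpers for illustration, and the parser assumes the single-digit minor versions that appear in this thread.

```python
def parse_archs(summary_value):
    """Turn a summary value such as '30 35 37' into [(3, 0), (3, 5), (3, 7)].

    Assumes the last digit of each token is the minor version.
    """
    return [(int(tok[:-1]), int(tok[-1])) for tok in summary_value.split()]


def device_can_run(device, bin_archs, ptx_archs):
    """Return True if a device (major, minor) can run code built for these archs."""
    major, minor = device
    # Binary code: same major version, built minor not newer than the device's.
    has_binary = any(b_major == major and b_minor <= minor
                     for b_major, b_minor in bin_archs)
    # PTX: can be JIT-compiled for any device of equal or higher capability.
    has_ptx = any(ptx <= (major, minor) for ptx in ptx_archs)
    return has_binary or has_ptx


# Values taken from the two cmake summaries above, checked for a 6.1 device.
print(device_can_run((6, 1), parse_archs("30 35 37"), parse_archs("")))  # False
print(device_can_run((6, 1), parse_archs("61"), parse_archs("61")))      # True
```

The first case matches the failing configuration: no binary for major version 6 and no PTX to fall back on, so a kernel launch has nothing to run.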