Closed ZonePG closed 3 years ago
ptxas error : Entry function '_ZN3cub30DeviceRadixSortDownsweepKernelINS_21DeviceRadixSortPolicyIiN6thrust5tupleI6float3S4_4int25Mat334Cov36float2fNS2_9null_typeES9_S9_EEiE9Policy700ELb0ELb0EiSA_iEEvPKT2_PSD_PKT3_PSH_PT4_SL_iiNS_13GridEvenShareISL_EE' uses too much shared data (0xc300 bytes, 0xc000 max)
CMake Error at sfusion_generated_supersurfel_fusion.cu.o.Release.cmake:279 (message):
I have the same problem
What GPU are you using? And are you both on CUDA 11? It's weird because it should not happen with a recent GPU. I will try to install CUDA 11 and reproduce the error. I will also edit the tracking part of the system, because I think it uses some old CUDA instructions that may be outdated now.
@BruceCanovas Hi, I am using an RTX 2070, and I tried with both CUDA 11 and CUDA 10.2. Is this problem caused by the GPU architecture?
I think so, yes. Until now the system has been tested on an Nvidia GTX 950M, a Jetson TX2, a Jetson Xavier and a Quadro P600. I am using a CMake script to detect the compute capability of the GPU automatically and set the right NVCC compilation flags. It may be obsolete for RTX GPUs, but I don't see any reason why it would be. Right now I don't have much of an idea about this error. I will investigate and let you know if I find something.
@BruceCanovas I am not very familiar with CUDA and have only just started working with it. How do I set the NVCC compile flags, and in which file do I set them?
You can pass flags to the CUDA compiler (NVCC) in the CMakeLists.txt file.
@hanxiumeng I have been able to build and run the code on an Nvidia GTX 1660 Ti with CUDA 10.2 as well. Maybe you can try to build the code by specifying the correct architecture for your GPU in the CMakeLists.txt.
I am not sure, but I think the problem is at line 485 of supersurfel_fusion.cu, where I make a tuple of all the Thrust vectors of the model in order to sort them all at once. The tuple seems to be too big (although not for my GTX 950M, which is well below your GPU, so that's weird). One workaround may be to sort each vector of the model separately, or to group them into smaller tuples. It might slow the code down a bit, though.
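To make the "sort each vector separately" workaround more concrete, here is a minimal host-only C++ sketch of the idea: sort a small index array by the keys once, then reorder each model vector on its own with that index array. This is not the code from supersurfel_fusion.cu. The names `sorted_permutation` and `apply_permutation` are hypothetical, and the Thrust calls named in the comments (`thrust::sequence`, `thrust::sort_by_key`, `thrust::gather`) are only my assumption of how it would map to the GPU version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the permutation that sorts `keys` in
// ascending order. On the GPU this would presumably be thrust::sequence
// on an index vector followed by thrust::sort_by_key(keys, indices), so
// the sort only moves one small index per element instead of a big tuple.
inline std::vector<std::size_t> sorted_permutation(const std::vector<int>& keys)
{
    std::vector<std::size_t> perm(keys.size());
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    // stable_sort keeps the original order of elements with equal keys.
    std::stable_sort(perm.begin(), perm.end(),
                     [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return perm;
}

// Hypothetical helper: reorders one vector with a precomputed permutation.
// On the GPU this would presumably be thrust::gather into a temporary
// vector followed by a swap, called once per model vector.
template <typename T>
void apply_permutation(const std::vector<std::size_t>& perm, std::vector<T>& data)
{
    assert(perm.size() == data.size());
    std::vector<T> tmp(data.size());
    for (std::size_t i = 0; i < perm.size(); ++i)
        tmp[i] = data[perm[i]];
    data.swap(tmp);
}
```

Usage would be one call to `sorted_permutation(keys)` followed by one `apply_permutation(perm, v)` per model vector (and for `keys` itself if the sorted keys are needed). The cost is one extra pass over each vector plus a temporary buffer, which matches the expectation that it may be a bit slower than a single tuple sort.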
@BruceCanovas Thanks for your work. I'll give it a try and give you feedback later.
@BruceCanovas Hi, I managed to compile with CUDA 10.2 by setting the NVCC parameter, but I ran into a new problem. I'll post it in a separate issue.
I came across the same problem and wondered if you had solved it. I'm using an RTX 3060 graphics card with CUDA 11.2 and OpenCV 4.4.0 installed.
ptxas error : Entry function uses too much shared data
This error appeared when I ran catkin_make -DOpenCV_DIR=<---------my opencv dir---->
with CUDA 11.0 and cuDNN 8.0.