Open MauroPfister opened 4 years ago
I was able to solve the issue by replacing CUDA_NUM_THREADS = 1024 with CUDA_NUM_THREADS = 512 and recompiling:
https://github.com/open-mmlab/mmdetection/blob/2b6f6616f804beaca3dbf071fa398c586243db13/mmdet/ops/dcn/src/cuda/deform_conv_cuda_kernel.cu#L76
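For readers wondering whether halving the constant changes what the kernel computes: a minimal sketch, assuming the kernels launch ceil(N / CUDA_NUM_THREADS) blocks, which is the usual pattern but is not confirmed here for every kernel in the file. The helper name `get_blocks` and the element count are hypothetical and only for illustration.

```python
def get_blocks(n: int, num_threads: int) -> int:
    """Number of blocks needed so that blocks * num_threads >= n (ceiling division)."""
    return (n + num_threads - 1) // num_threads


n = 100_000  # hypothetical number of elements the kernel has to process

for num_threads in (1024, 512):
    blocks = get_blocks(n, num_threads)
    # Every element is still covered; only the split into blocks changes.
    print(f"threads/block={num_threads} blocks={blocks} covered={blocks * num_threads >= n}")
```

Under that assumption, going from 1024 to 512 threads per block roughly doubles the number of blocks while each block asks for fewer resources, so the result should be the same and only the launch configuration differs.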
The regular convolutions of PyTorch do not seem to have this problem. Maybe the CUDA_NUM_THREADS constant could be set depending on the architecture the DCNs are built for?
Thanks for the report! It is a known issue that setting CUDA_NUM_THREADS to 1024 causes failures on some older or lightweight devices. We have not found a good way to set it according to the GPU architecture. PRs are welcome if you have any ideas.
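One possible direction, sketched in Python purely to show the arithmetic: "too many resources requested for launch" typically means that threads per block times the per-thread resource use (often registers) exceeds what the device allows per block. The function below picks the largest power-of-two block size that fits a register budget. The function name and all numbers are hypothetical; real per-kernel register counts come from the compiler, real limits come from the device, and this ignores register allocation granularity and shared memory.

```python
def max_threads_for_kernel(regs_per_thread: int,
                           regs_per_block_limit: int,
                           hw_max_threads: int = 1024,
                           warp_size: int = 32) -> int:
    """Largest power-of-two block size (at least one warp) within the register budget.

    Returns 0 if even a single warp does not fit.
    """
    threads = hw_max_threads
    while threads >= warp_size:
        if threads * regs_per_thread <= regs_per_block_limit:
            return threads
        threads //= 2  # try the next smaller power of two
    return 0


# Hypothetical kernel using 40 registers per thread.
# Assumed per-block register limits: 32768 on a small device, 65536 on a larger one.
print(max_threads_for_kernel(40, 32768))
print(max_threads_for_kernel(40, 65536))
```

With these assumed numbers the smaller device ends up at 512 threads and the larger one at 1024, which would match the behaviour reported in this thread. In an actual extension, the CUDA runtime can report the per-kernel limit directly through `cudaFuncGetAttributes` (field `maxThreadsPerBlock`), so a fix could query that at launch time instead of hard-coding a constant.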
I don't have any experience with PyTorch CUDA extensions, so I can't help with a PR unfortunately. But maybe just mention it in a README somewhere? That way people could easily fix the issue themselves.
Thanks for the report! I will add it to the FAQ to help people locate the problem faster.
Hi
I am trying to use the deformable convolutions from this repo on a Jetson TX2. Compilation was successful and I can also run them from Python. However, for every call of the DCN I get the following error:
error in deformable_im2col: too many resources requested for launch
I was wondering if there are any settings in the .cu files that I can change to fix this error?
Minimal reproducible example
Environment
Since I only wanted to install the DCNs instead of the whole repo, I used a reduced setup.py (copied from this repo):
Bug fix
After a quick search on Google I found this PyTorch issue, which seems related. Unfortunately I have no experience with CUDA at all, so I am not sure if this helps.