Eromera / erfnet_pytorch

Pytorch code for semantic segmentation using ERFNet

get stuck running erfnet model on Jetson TX2 #19

Closed EthanCalvin closed 5 years ago

EthanCalvin commented 6 years ago

Hi, I want to run the ERFNet model on a Jetson TX2.

But when I try to run the ERFNet code, I get stuck on this error:

"RuntimeError : cuda runtime error(7) : too many resources requested for launch at /home/nvidia/pytorch/aten/src/THCUNN/im2col.h"

Please help me!

ShreyasSkandan commented 6 years ago

Did you find a solution to this problem?

I've got the same problem with the exact same version of CUDA and CuDNN.

ShreyasSkandan commented 6 years ago

This problem has been fixed if you're still interested.

This has to do with CUDA 9.0 attempting to allocate more registers to each thread. It can be fixed by setting a launch bound on the CUDA kernels in im2col.h. You should be able to just pull the latest PyTorch version and reinstall it, and it should work.

For more details look at this thread: https://github.com/pytorch/pytorch/issues/7680
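
A rough way to see why a launch bound helps: the threads of a block share a fixed register budget, so the more registers the compiler gives each thread, the fewer threads fit in one block. The sketch below works through that arithmetic. The 32768-register per-block limit is an assumption (it is the figure commonly quoted for the TX2's compute capability 6.2), the 1024-thread block size is taken from the `__launch_bounds__(1024)` value mentioned later in this thread, and the 40-register case is a made-up number for illustration. The model also ignores the hardware's register allocation granularity, so treat the results as approximate.

```python
# Assumed hardware limit: 32-bit registers available to one thread block.
# 32768 is the figure commonly quoted for compute capability 6.2 (Jetson TX2);
# confirm it for your own device before relying on it.
REGISTERS_PER_BLOCK = 32768


def max_registers_per_thread(threads_per_block,
                             registers_per_block=REGISTERS_PER_BLOCK):
    """Largest per-thread register count that still lets the block launch."""
    return registers_per_block // threads_per_block


def launch_fits(registers_per_thread, threads_per_block,
                registers_per_block=REGISTERS_PER_BLOCK):
    """True if a block of this size stays within the register budget."""
    return registers_per_thread * threads_per_block <= registers_per_block


# Per-thread budget for a 1024-thread block under the assumed limit.
print(max_registers_per_thread(1024))

# A kernel compiled to 32 registers per thread fits; a hypothetical kernel
# compiled to 40 registers per thread does not.
print(launch_fits(32, 1024))
print(launch_fits(40, 1024))
```

Under these assumptions, a kernel that is launched with 1024 threads per block has to stay at or below 32 registers per thread. Declaring a launch bound on the kernel tells the compiler the intended block size so it can cap register use accordingly.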

Eromera commented 6 years ago

Hi! Sorry for the late reply. I recently tried on a Jetson TX2 with the latest CUDA, compiling PyTorch from scratch, and it worked perfectly. So I'll close the issue; if you are still having trouble, please reopen!

MrLinNing commented 5 years ago

@ShreyasSkandan Hi, I have compiled PyTorch from scratch, but I still hit the problem. Which file should I add the __launch_bounds__(1024) to?

MrLinNing commented 5 years ago

@Eromera Can you give me your PyTorch wheel for the TX2? Or more details about installing PyTorch on the TX2? I have tried many times but the problem still exists.

Eromera commented 5 years ago

Hi, I dunno about the launch_bounds issue but I followed this procedure. Had some issues, I think, but it ended up compiling successfully.

MrLinNing commented 5 years ago

What are your versions of PyTorch, CUDA and cuDNN on the TX2? @Eromera

Eromera commented 5 years ago

Latest PyTorch from source, CUDA 9.0 and cuDNN 7
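
When comparing setups like this, it can help to print the versions from inside Python rather than recalling them. A minimal sketch, assuming PyTorch is importable on the device; the helper name `report_versions` is illustrative, and it returns `None` when `torch` is not installed.

```python
def report_versions():
    """Collect PyTorch, CUDA and cuDNN versions, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None

    info = {
        "pytorch": torch.__version__,
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "cuda": torch.version.cuda,
        # cuDNN version as reported by PyTorch, or None if unavailable.
        "cudnn": torch.backends.cudnn.version(),
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        # (major, minor) compute capability of the first CUDA device.
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


print(report_versions())
```

Posting this output alongside the error makes it easier to tell whether two people are really running the same PyTorch build.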