CVMI-Lab / PAConv

(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds
Apache License 2.0

issue with classification #22

Closed by rabbiahassan 3 years ago

rabbiahassan commented 3 years ago

Hello! Your work is very interesting. When I start training the classification model, it doesn't show any error, but it gets stuck here and doesn't proceed. Could you please tell me what the issue is? Thanks for your time. [image]

mutianxu commented 3 years ago

Hi,

Since our cuda_kernel uses a multi-threaded parallel strategy, only 2 GPUs may not be able to supply the parallel threads it needs.

To solve this issue, you can try running on more GPUs, or try a smaller batch_size (this needs fewer threads, but will probably cause a performance drop because the batch_size is no longer the best setting).

rabbiahassan commented 3 years ago

Thanks for the response. I reduced the batch size to the minimum, but it still doesn't work. I don't think this issue is related to the batch size or heavy computation. I am attaching the GPU memory status as well. I think it gets stuck somewhere but doesn't show any error. [image]
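When a run hangs silently like this, one way to find out where it is waiting is Python's standard `faulthandler` module, which can print every thread's Python stack after a timeout. The following is only a minimal sketch: the helper names `watch_for_hang` and `stop_watching` are hypothetical, and where to call them in the training script is an assumption.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s=120, stream=None):
    """Dump all Python thread stacks to `stream` every `timeout_s` seconds.

    `stream` must be a real file (it needs a file descriptor); defaults to stderr.
    """
    if stream is None:
        stream = sys.stderr
    # repeat=True keeps dumping at every interval until the watchdog is cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def stop_watching():
    """Cancel the watchdog, e.g. once training is clearly making progress."""
    faulthandler.cancel_dump_traceback_later()


# Usage (hypothetical placement, near the top of the training script):
#   watch_for_hang(timeout_s=120)
#   ... build the model and run the training loop ...
#   stop_watching()
```

The innermost frame in the dump is the Python line the process is sitting on, which should help tell apart waiting on the CUDA op build, on data loading, or inside `loss.backward`.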

mutianxu commented 3 years ago

OK, if this is the first time you are running the classification code, please wait for some time (about 1-2 minutes, depending on the hardware) for the CUDA op to compile.

Also, once compilation has finished, if it gets stuck again at loss.backward because of the limited threads, please solve it by reducing the batch_size or using more GPUs.
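If the wait for compilation never seems to end, one thing that may be worth ruling out is a leftover build lock. This assumes the CUDA op is JIT-built through `torch.utils.cpp_extension.load`, which places a file named `lock` in the build directory while compiling; if an earlier build was killed, that file can remain and later runs may wait on it indefinitely. The sketch below only lists such files; the helper name `find_lock_files` is hypothetical, and the default cache path is an assumption (Linux default, overridable with `TORCH_EXTENSIONS_DIR`).

```python
import os
import time
from pathlib import Path


def find_lock_files(root=None):
    """Return sorted (path, age_in_seconds) pairs for files named 'lock' under root."""
    if root is None:
        # Assumption: default JIT build cache on Linux unless TORCH_EXTENSIONS_DIR is set.
        root = os.environ.get(
            "TORCH_EXTENSIONS_DIR",
            str(Path.home() / ".cache" / "torch_extensions"),
        )
    root = Path(root)
    if not root.is_dir():
        return []
    now = time.time()
    return sorted(
        (str(p), now - p.stat().st_mtime)
        for p in root.rglob("lock")
        if p.is_file()
    )


# Usage: print any lock files and how old they are.
#   for path, age in find_lock_files():
#       print(f"{path} (last modified {age:.0f} s ago)")
```

A lock that is hours old while no build is running is a candidate for manual removal, but this is only a diagnostic and not a confirmed cause of the hang reported here.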

rabbiahassan commented 3 years ago

Thanks for your response again. I have reduced the batch size to as low as 4, but it still doesn't work. I am attaching a screenshot of the GPU usage as well. I think it gets stuck even earlier (because it's not even using the GPU to full capacity). [image]

mutianxu commented 3 years ago

Does it stay stuck? Have you waited for more than 2 minutes?

rabbiahassan commented 3 years ago

Yes, it does. I have waited for five hours. It just doesn't move an inch.

mutianxu commented 3 years ago

OK, I have just run the code, and the program runs normally at normal speed when I use 4 3090Ti GPUs or 4 2080Ti GPUs with the original batch_size.

As shown in your picture, I can confirm that the code is OK and that you have compiled the CUDA lib.

So, as I mentioned before, this is caused by the very limited threads provided by your GPUs (this depends not only on the number of GPUs but also on their type).

What you can do now is run on more GPUs, or better GPUs, to support our cuda_kernel.
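To compare a machine against the setups mentioned above, the number and type of visible GPUs can be listed with PyTorch's device-properties API. This is a minimal sketch: the helper name `describe_gpus` is hypothetical, and which of these figures actually matters for the cuda_kernel is not stated in this thread.

```python
def describe_gpus():
    """Return one dict per visible CUDA device, or [] if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    info = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        info.append({
            "index": i,
            "name": props.name,
            "compute_capability": f"{props.major}.{props.minor}",
            "multiprocessors": props.multi_processor_count,
            "total_memory_gib": round(props.total_memory / 2**30, 1),
        })
    return info


# Usage:
#   for gpu in describe_gpus():
#       print(gpu)
```

Posting this output along with the batch_size in use makes it easier for others to judge whether a hang matches the hardware limitation described here.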

meiqing0417 commented 2 years ago

Excuse me, I also encountered this problem. Is it solved now?

brunotecgraf commented 2 years ago

Excuse me, I also encountered this problem. Is it solved now?

For me, using the pointnet option worked!