lsongx / nerfplayer-nerfstudio


Why does my command line stay at "nerfstudio field components: CUDA set up, loading (should be quick)"? #5

Closed xiazhi1 closed 1 year ago

xiazhi1 commented 1 year ago

I am training nerfplayer with the example command ns-train nerfplayer-ngp --data dycheck/teddy/, and it has been stuck at this status for nearly an hour. Is that normal? My environment is as follows:

nerfplayer 0.0.1 /root/autodl-tmp/nerfplayer-nerfstudio
nerfstudio 0.3.2 /root/autodl-tmp/nerfstudio
torch 2.0.1+cu117
torch-fidelity 0.3.0
torchaudio 2.0.2
torchmetrics 1.0.1
torchtyping 0.1.4
torchvision 0.15.2+cu117
tornado 6.3.2
tinycudann 1.7
Ubuntu 20.04
python 3.8

Looking forward to your reply @lsongx

lsongx commented 1 year ago

Hi @xiazhi1, do you mind posting what is happening while it waits (perhaps using htop)? Also, does it work after waiting for a long time? Compiling the CUDA source can take a long time depending on your hardware, but an hour does sound long.

xiazhi1 commented 1 year ago

These screenshots show my CPU and GPU usage. It seems the training is not using the GPU. Why is this? (two screenshots attached)

xiazhi1 commented 1 year ago

@lsongx

lsongx commented 1 year ago

That's weird. Both GPU and CPU are not being used. 😕

Friedrich-M commented 1 year ago

I ran into the same problem. Neither the GPU nor the CPU is being used.

Friedrich-M commented 1 year ago

@xiazhi1 I solved the problem by removing the cached build directory, because the PyTorch JIT build might get stuck if the build directory already exists.

You can use the following command:

rm -rf ~/.cache/torch_extensions/py38_cu118/nerfstudio_field_components_cuda
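The py38_cu118 part of that path depends on the Python and CUDA versions in use, so the environment in the original post (torch 2.0.1+cu117) would presumably have a py38_cu117 folder instead. The following is a minimal Python sketch, not part of nerfstudio or PyTorch, that looks for the cached build directory under any such version folder. The helper names find_cached_builds and remove_cached_builds are hypothetical, and the script assumes the cache root is either the TORCH_EXTENSIONS_DIR environment variable or ~/.cache/torch_extensions.

```python
import os
import shutil
from pathlib import Path

# Extension name taken from the rm command above.
EXTENSION_NAME = "nerfstudio_field_components_cuda"


def find_cached_builds(root=None, name=EXTENSION_NAME):
    """Return cached build directories for `name` under the extension cache root.

    Assumption: the root is TORCH_EXTENSIONS_DIR if set, otherwise
    ~/.cache/torch_extensions. Both root/<name> and root/<version>/<name>
    (e.g. py38_cu117/<name>) are checked.
    """
    if root is None:
        root = os.environ.get("TORCH_EXTENSIONS_DIR") or (
            Path.home() / ".cache" / "torch_extensions"
        )
    root = Path(root)
    if not root.is_dir():
        return []
    candidates = {root / name}
    candidates.update(root.glob(f"*/{name}"))
    return sorted(p for p in candidates if p.is_dir())


def remove_cached_builds(root=None, name=EXTENSION_NAME, dry_run=True):
    """Delete the directories found above; with dry_run=True only report them."""
    builds = find_cached_builds(root, name)
    for build_dir in builds:
        if not dry_run:
            shutil.rmtree(build_dir)
    return builds


if __name__ == "__main__":
    # Dry run by default: prints what would be deleted without deleting it.
    for path in remove_cached_builds(dry_run=True):
        print(f"would remove: {path}")
```

Calling remove_cached_builds(dry_run=False) performs the actual deletion. After that, the next ns-train run should recompile the extension from scratch.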