Closed: linjing7 closed this issue 1 year ago.
I guess you cloned the repo without the flag --recursive, hence the submodule xgutils was not cloned. You can fix this by running git submodule update --init --recursive at the repo root.
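To see whether this is the problem, one can check if the submodule directory was actually checked out. A minimal sketch (the helper name is mine, not from the repo):

```python
# Hedged sketch: an empty xgutils directory means the clone skipped submodules.
# Running `git submodule update --init --recursive` from the repo root fixes it.
import os

def submodule_populated(repo_root, name="xgutils"):
    """Return True if the submodule directory exists and is non-empty."""
    path = os.path.join(repo_root, name)
    return os.path.isdir(path) and bool(os.listdir(path))

print(submodule_populated("."))
```

If this prints False from the repo root, the submodule has not been fetched yet.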
Hi, I cloned the repo with the flag --recursive, but the error still occurs. Training with a single GPU works fine, so I don't think the cause is a missing xgutils submodule. Have you tried multi-GPU training with this code?
Besides, do we need to train for the default 300 epochs?
Did the code work on multiple GPUs? How did the error ModuleNotFoundError: No module named 'xgutils' go away? Even after cloning with --recursive, it still persists for me. If anyone has solved this, could you please help?
First, make sure xgutils is not empty. Then make sure you run the command from the root of the ShapeFormer folder. Adding the argument '--gpu 0 1 2 4' will enable multi-GPU training.
Yes, the command is run from the root of the ShapeFormer repository, and xgutils is not empty. It still gives the error that xgutils is not found.
Also, my GPU indices are 0 1 2 3. Should I still add '--gpu 0 1 2 4', or '--gpu 0 1 2 3'?
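One possible explanation (hedged, not confirmed by the repo authors): when multi-GPU training spawns worker processes, each child re-imports the entry module and may not inherit a working-directory-based import path, so a repo-local package like xgutils can fail to import even though a single-GPU run launched from the repo root works. A common workaround is to pin the repo root onto sys.path at the very top of the training entry script, before any import of xgutils (this assumes the entry script sits at the repo root):

```python
# Hedged sketch: make the repo-local xgutils package importable in spawned
# worker processes by pinning the repo root onto sys.path. This goes at the
# very top of the training entry script, before `import xgutils`.
import os
import sys

_here = globals().get("__file__")  # defined when run as a script
REPO_ROOT = os.path.dirname(os.path.abspath(_here)) if _here else os.getcwd()
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)
```

Alternatively, exporting PYTHONPATH to include the repo root (e.g. PYTHONPATH=$PWD) before launching should have the same effect without editing code.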
Hi, thanks for your excellent work. I successfully trained VQDIF-16 with 1 GPU. However, the training is quite slow (30 epochs/day), which means we would need ten days to train 300 epochs. So I tried multi-GPU training, but then the error No module named 'xgutils' occurs. Do you have any idea about this issue? BTW, I notice that the pretrained vqdif checkpoint you provide is at epoch 31, but the default max_epoch is 300. So do we need to train for 300 epochs?