Open yangdongchao opened 2 years ago
Well, I don't know why it is happening on your side, I am afraid. Are you using Windows?
If I were you, I would check whether you can train a model (not SpecVQGAN) in a distributed setting using pure PyTorch.
If you can train one, I would look into PyTorch Lightning. It seems to miss one of your GPUs.
Also, could you please share the output of `nvidia-smi` and `torch.cuda.device_count()`?
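One common reason for a mismatch between `nvidia-smi` and `torch.cuda.device_count()` is the `CUDA_VISIBLE_DEVICES` environment variable: if it is set to a single index, PyTorch will only see one GPU no matter how many the machine has. A minimal check (plain Python, no GPU required to run):

```python
import os

# CUDA_VISIBLE_DEVICES restricts which GPUs CUDA programs can see.
# None means no restriction; "0" would hide every GPU except index 0.
visible = os.environ.get("CUDA_VISIBLE_DEVICES")
print("CUDA_VISIBLE_DEVICES =", visible)
```

If this prints a single index (or an empty string) inside the shell where you launch training, that would explain Lightning reporting fewer GPUs than `nvidia-smi` shows.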
Thanks for your reply, I have solved this problem.
How did you solve it?
In fact, I didn't do anything. When I run the codebook training, it still only uses one GPU, but when I train the transformer, it can use multiple GPUs. So I gave up on using multiple GPUs for codebook training.
python3 train.py --base vas_codebook.yaml -t True --gpus 0,1,
When I try to run the code with two GPUs, it reports the error: pytorch_lightning.utilities.exceptions.MisconfigurationException: You requested GPUs: [0, 1] But your machine only has: []
But if I only use GPU 0, no error happens. So I want to ask how to use multiple GPUs to train this code?
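For context, the error comes from Lightning validating the requested GPU indices against what PyTorch can actually see. A hypothetical sketch of that check (not Lightning's actual code; `parse_gpu_ids` is an illustrative name) shows why `--gpus 0,1,` fails when the machine reports zero visible devices:

```python
def parse_gpu_ids(gpus, available_count):
    """Parse a spec like "0,1," (trailing comma allowed) into GPU indices
    and validate them against the number of devices actually visible."""
    requested = [int(g) for g in str(gpus).split(",") if g.strip()]
    available = list(range(available_count))
    if any(g not in available for g in requested):
        # Mirrors the shape of Lightning's MisconfigurationException message
        raise ValueError(
            f"You requested GPUs: {requested} "
            f"But your machine only has: {available}"
        )
    return requested
```

With two visible GPUs, `parse_gpu_ids("0,1,", 2)` returns `[0, 1]`; with zero visible GPUs it raises the same "only has: []" complaint seen above, which points at the environment (driver, `CUDA_VISIBLE_DEVICES`, or the CUDA build of PyTorch) rather than the training script itself.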