Closed: jreus closed this issue 2 years ago
It is not about the trainer. It is in 🐸TTS. Can you try the latest TTS version?
@WeberJulian can you take a look at this?
Using your recipe, I couldn't reproduce the error you show here; training runs fine (log below). I'm using TTS v0.7.1, by the way; it seems like v0.8.0 breaks something else, which I'm investigating. If checking out v0.7.1 doesn't solve the issue for you, try installing TTS from scratch in a new env (see the pinned-install commands after the log).
--> STEP: 0/18 -- GLOBAL_STEP: 0
| > decoder_loss: 5.80364 (5.80364)
| > postnet_loss: 7.56972 (7.56972)
| > capaciton_reconstruction_loss: 26195.39648 (26195.39648)
| > capacitron_vae_loss: -0.01047 (-0.01047)
| > capacitron_vae_beta_loss: 145.79236 (145.79236)
| > capacitron_vae_kl_term: 4.20764 (4.20764)
| > capacitron_beta: 1.00000 (1.00000)
| > stopnet_loss: 1.00547 (1.00547)
| > ga_loss: 0.00614 (0.00614)
| > decoder_diff_spec_loss: 0.44172 (0.44172)
| > postnet_diff_spec_loss: 3.98933 (3.98933)
| > decoder_ssim_loss: 0.83536 (0.83536)
| > postnet_ssim_loss: 0.83291 (0.83291)
| > loss: 20.49835 (20.49835)
| > align_error: 0.93405 (0.93405)
| > grad_norm: 0.00000 (0.00000)
| > current_lr: 0.00100
| > step_time: 1.20270 (1.20274)
| > loader_time: 0.92890 (0.92894)
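For reference, a clean pinned install could look roughly like this (illustrative commands; assumes a fresh virtualenv or conda env):

pip install TTS==0.7.1
# or, working from a source checkout of the repo:
git checkout v0.7.1
pip install -e .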
Describe the bug
Hey all. I'm trying to train a Capacitron model at the moment and keep running into a device error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
(stack trace below). I'm basically following the structure of the training recipe verbatim, using a custom dataset formatter function. I'm posting this issue here because I was under the impression that Trainer should make sure all the tensors get put on the GPU. Does this issue look familiar to anyone?
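In case it helps anyone debugging the same thing, a quick way to find the offending tensor is to print the device of everything in the batch right before the failing forward pass. This is only a generic sketch; the batch keys are illustrative, not the exact ones 🐸TTS uses:

import torch

def report_devices(batch):
    # Print the device of every tensor in the batch dict so a CPU straggler stands out.
    for name, value in batch.items():
        if torch.is_tensor(value):
            print(f"{name}: {value.device}")

# Once the offender is found, the usual fix is an explicit move, e.g.:
# batch["reference_mel"] = batch["reference_mel"].to("cuda:0")  # hypothetical key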
To Reproduce
TrainCapacitron.py
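TrainCapacitron.py itself isn't attached above; based on the description, the custom part is presumably a formatter function wired into load_tts_samples, roughly like this (a minimal sketch following the public Capacitron recipe; my_formatter, the metadata layout, and the paths are placeholders):

import os

from TTS.config.shared_configs import BaseDatasetConfig
from TTS.tts.datasets import load_tts_samples

def my_formatter(root_path, meta_file, **kwargs):
    # Placeholder formatter: one dict per utterance, with the keys 🐸TTS expects.
    items = []
    with open(os.path.join(root_path, meta_file), encoding="utf-8") as f:
        for line in f:
            wav_name, text = line.strip().split("|")
            items.append({
                "text": text,
                "audio_file": os.path.join(root_path, "wavs", wav_name),
                "speaker_name": "speaker0",
                "root_path": root_path,
            })
    return items

dataset_config = BaseDatasetConfig(meta_file_train="metadata.csv", path="/path/to/dataset/")
train_samples, eval_samples = load_tts_samples(
    dataset_config, eval_split=True, formatter=my_formatter
)

The rest of the script (Tacotron2Config with capacitron_vae, the model, and the Trainer setup) would follow the recipe verbatim, as described.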
Expected behavior
All tensors on the GPU
Logs
Additional context
No response