I just started training with the latest torch version (2.0.1), and I am seeing entries like the ones below printed repeatedly. Is the setup OK, or am I missing something?
Also, I am training on my own custom phonemized dataset. Roughly how many steps does it take before intelligible output can be heard on the TensorBoard eval audio page?
min value is tensor(-1.3583, device='cuda:0', grad_fn=<MinBackward1>)
max value is tensor(1.4576, device='cuda:0', grad_fn=<MaxBackward1>)
min value is tensor(-1.3833, device='cuda:0', grad_fn=<MinBackward1>)
max value is tensor(1.2478, device='cuda:0', grad_fn=<MaxBackward1>)
min value is tensor(-1.3127, device='cuda:0', grad_fn=<MinBackward1>)
max value is tensor(1.2403, device='cuda:0', grad_fn=<MaxBackward1>)
min value is tensor(-1.2534, device='cuda:0', grad_fn=<MinBackward1>)
max value is tensor(1.1441, device='cuda:0', grad_fn=<MaxBackward1>)
min value is tensor(-1.3840, device='cuda:0', grad_fn=<MinBackward1>)
max value is tensor(1.5967, device='cuda:0', grad_fn=<MaxBackward1>)