Closed dillfrescott closed 2 years ago
Ok, after setting CUDA_LAUNCH_BLOCKING it says:
Validation sanity check: 0% 0/1 [00:00<?, ?batch/s]
Traceback (most recent call last):
  File "run.py", line 15, in <module>
    run_task()
  File "run.py", line 11, in run_task
    task_cls.start()
  File "/content/diff-svc/training/task/base_task.py", line 234, in start
    trainer.fit(task)
  File "/content/diff-svc/utils/pl_utils.py", line 495, in fit
    self.run_pretrain_routine(model)
  File "/content/diff-svc/utils/pl_utils.py", line 571, in run_pretrain_routine
    self.evaluate(model, self.get_val_dataloaders(), self.num_sanity_val_steps, self.testing)
  File "/content/diff-svc/utils/pl_utils.py", line 1196, in evaluate
    test)
  File "/content/diff-svc/utils/pl_utils.py", line 1316, in evaluation_forward
    output = model.validation_step(*args)
  File "/content/diff-svc/training/task/SVC_task.py", line 139, in validation_step
    outputs['losses'], model_out = self.run_model(self.model, sample, return_output=True, infer=False)
  File "/content/diff-svc/training/task/SVC_task.py", line 94, in run_model
    ref_mels=target, f0=f0, uv=uv, energy=energy, infer=infer)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/diff-svc/network/diff/diffusion.py", line 233, in forward
    skip_decoder=True, infer=infer, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/diff-svc/modules/fastspeech/fs2.py", line 132, in forward
    decoder_inp_origin = decoder_inp = torch.gather(decoder_inp, 1, mel2ph_) # [B, T, H]
RuntimeError: index 6001 is out of bounds for dimension 1 with size 6001
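For anyone reading the error above: a dimension of size 6001 only accepts indices 0 through 6000, so an index of 6001 points one slot past the end. A rough way to find the offending entries before the gather call is a plain bounds check. This is only a sketch in pure Python with nested lists standing in for the index tensor; `find_out_of_range` and the toy values are made up for illustration and are not part of diff-svc.

```python
def find_out_of_range(index_rows, dim_size):
    """Return (row, col, value) for every index a gather along dim 1 would reject.

    index_rows: nested lists standing in for a [B, T] index tensor (hypothetical data).
    dim_size:   size of the dimension being gathered from.
    """
    bad = []
    for r, row in enumerate(index_rows):
        for c, idx in enumerate(row):
            # Valid indices are 0 .. dim_size - 1.
            if not 0 <= idx < dim_size:
                bad.append((r, c, idx))
    return bad


# Toy alignment: the last entry of the second row points one past the end.
mel2ph = [[1, 1, 2, 3, 3], [1, 2, 2, 4, 5]]
dim_size = 5
bad = find_out_of_range(mel2ph, dim_size)
print(bad)
```

Running the same kind of check on the real batch right before the gather line should show which sample carries the too-large index.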
Ok, umm.
Setting max_input_tokens to 60000 seems to have fixed it, I think? It's no longer getting stuck there, at least...
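One possible reason the larger limit helps, sketched below under assumptions that are not confirmed anywhere in this thread: the input sequence is cut to max_input_tokens, one padding slot is added in front, and the frame-to-token alignment is left untouched, so it can still point at a token that was cut off. The function names and the toy durations are hypothetical.

```python
def build_alignment(durations):
    """Frame-to-token alignment: token k (1-based) repeated durations[k-1] times."""
    frames = []
    for token_id, n_frames in enumerate(durations, start=1):
        frames.extend([token_id] * n_frames)
    return frames


def gather_is_safe(durations, max_tokens):
    """True if every alignment index fits after truncating to max_tokens.

    Assumption: the gathered dimension holds the kept tokens plus one
    padding slot at index 0, so its size is kept + 1.
    """
    alignment = build_alignment(durations)
    kept = min(len(durations), max_tokens)
    dim_size = kept + 1
    return max(alignment) < dim_size


durations = [2, 3, 1, 2]  # four tokens, eight frames
print(gather_is_safe(durations, 3))   # cap below the token count
print(gather_is_safe(durations, 10))  # cap above the token count
```

With the cap at 3 the alignment still refers to token 4 while the gathered dimension has size 4, which is the same pattern as index 6001 against size 6001. If that is what is happening here, splitting the audio into shorter clips would be another way around it.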
Thank you for helping so far. Any idea what this next error means?