lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
MIT License

The usage code throws an exception #4

yzmyyff closed 1 year ago

yzmyyff commented 1 year ago

Running the usage code raises the exception below. The error log is as follows:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 778, in _test_impl
    results = self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 973, in _run
    results = self._run_stage()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1009, in _run_stage
    return self._evaluation_loop.run()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/utilities.py", line 177, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/evaluation_loop.py", line 115, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/evaluation_loop.py", line 375, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 291, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/strategies/strategy.py", line 388, in test_step
    return self.model.test_step(*args, **kwargs)
  File "/data/vjuicefs_speech_tts_v3/11146693/remote_map/voicebox/voicebox/lightning_module.py", line 295, in test_step
    sampled = self._cfm_wrapper.sample(phoneme_ids=phonemes, cond=x,
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/voicebox_pytorch/voicebox_pytorch.py", line 623, in sample
    trajectory = odeint(fn, y0, t, **self.odeint_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torchdiffeq/_impl/odeint.py", line 77, in odeint
    solution = solver.integrate(t)
  File "/usr/local/lib/python3.10/dist-packages/torchdiffeq/_impl/solvers.py", line 105, in integrate
    dy, f0 = self._step_func(self.func, t0, dt, t1, y0)
  File "/usr/local/lib/python3.10/dist-packages/torchdiffeq/_impl/fixed_grid.py", line 20, in _step_func
    y_mid = y0 + f0 * half_dt
RuntimeError: The size of tensor a (512) must match the size of tensor b (524288) at non-singleton dimension 2
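
For readers unfamiliar with this class of error: the final RuntimeError is an element-wise broadcasting failure in `y0 + f0 * half_dt`, where one operand has size 512 at dimension 2 and the other has size 524288 (which equals 512 × 1024). The sketch below re-implements the usual right-aligned broadcasting rule in plain Python to show why those two sizes cannot be combined. Only the two dimension-2 sizes come from the log; the function name `broadcast_shape` and the leading dimensions (1 and 1024) are hypothetical and chosen purely for illustration. This is not code from voicebox-pytorch or torchdiffeq.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Two sizes are compatible when they are equal or when either is 1.
    Shapes are aligned from the right, as in PyTorch and NumPy.
    """
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for dim, (x, y) in enumerate(zip(a, b)):
        if x != y and x != 1 and y != 1:
            raise ValueError(
                f"size {x} does not match size {y} at dimension {dim}"
            )
        # A size of 1 stretches to match the other operand.
        out.append(y if x == 1 else x)
    return tuple(out)


# Hypothetical shapes: only the last sizes (512 and 524288) are from the log.
y0_shape = (1, 1024, 512)
f0_shape = (1, 1024, 524288)

try:
    broadcast_shape(y0_shape, f0_shape)
except ValueError as err:
    print("incompatible:", err)
```

Since neither 512 nor 524288 is 1, no stretching is possible at dimension 2, so the addition fails regardless of the other dimensions.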

yzmyyff commented 1 year ago

It seems this is fixed in the latest commit.