fishaudio / fish-speech

Brand new TTS solution
https://speech.fish.audio

Error occurred while finetuning fish-speech-1.4: index 5 is out of bounds for dimension 1 with size 5 #521

Closed: wjddd closed this issue 4 weeks ago

wjddd commented 4 weeks ago

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Docker image: lengyue233/fish-speech:v1.2.1

I followed the LoRA fine-tuning instructions at: https://speech.fish.audio/zh/finetune/#4-lora

python fish_speech/train.py --config-name text2semantic_finetune \
    project=$project \
    +lora@model.model.lora_config=r_8_alpha_16

and the following error occurred:

Traceback (most recent call last):
  File "/home/tts/fishspeech/../../ref/fishspeech/fish_speech/utils/utils.py", line 66, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "/home/tts/fishspeech/fish_speech/train.py", line 113, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 538, in fit
    call._call_and_handle_interrupt(
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 46, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/launchers/subprocess_script.py", line 105, in launch
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 574, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 981, in _run
    results = self._run_stage()
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1023, in _run_stage
    self._run_sanity_check()
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1052, in _run_sanity_check
    val_loop.run()
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py", line 178, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 135, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 396, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_args)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 319, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 410, in validation_step
    return self._forward_redirection(self.model, self.lightning_module, "validation_step", *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 640, in __call__
    wrapper_output = wrapper_module(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1636, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1454, in _run_ddp_forward
    return self.module(*inputs, **kwargs)  # type: ignore[index]
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 633, in wrapped_forward
    out = method(*_args, **_kwargs)
  File "/home/tts/fishspeech/fish_speech/models/text2semantic/lit_module.py", line 202, in validation_step
    return self._step(batch, batch_idx, "val")
  File "/home/tts/fishspeech/fish_speech/models/text2semantic/lit_module.py", line 118, in _step
    outputs = self.model(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/tts/fishspeech/fish_speech/models/text2semantic/llama.py", line 513, in forward
    parent_result = super().forward(inp, key_padding_mask)
  File "/home/tts/fishspeech/fish_speech/models/text2semantic/llama.py", line 237, in forward
    x = self.embed(inp)
  File "/home/tts/fishspeech/fish_speech/models/text2semantic/llama.py", line 220, in embed
    emb = self.codebook_embeddings(x[:, i + 1] + i * self.config.codebook_size)
IndexError: index 5 is out of bounds for dimension 1 with size 5
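
Reading the last frame: `embed` indexes `x[:, i + 1]`, so the input needs at least one more row along dimension 1 than the number of codebooks the model iterates over. The error is what one would see if the batch has 5 rows (for example, one token row plus 4 codebook rows) while the model config asks for more than 4 codebooks. A minimal sketch of that arithmetic, assuming the loop runs over `range(num_codebooks)`; the helper name and the value 8 are illustrative and not taken from the repository:

```python
def first_bad_row(input_rows, num_codebooks):
    """Return the first row index the embed loop would fail on, or None.

    Assumes row 0 holds text tokens and rows 1..num_codebooks hold
    codebook indices, mirroring the x[:, i + 1] access in the traceback.
    """
    for i in range(num_codebooks):
        row = i + 1
        if row >= input_rows:
            return row
    return None


# A batch with 5 rows fed to a model that expects 8 codebooks (assumed value)
print(first_bad_row(input_rows=5, num_codebooks=8))
# A batch with 9 rows satisfies the same model
print(first_bad_row(input_rows=9, num_codebooks=8))
```

With 5 rows the first failing index is 5, which matches the message "index 5 is out of bounds for dimension 1 with size 5".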

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

Stardust-minus commented 4 weeks ago

It looks like a codebook index is out of bounds. Could it be that the VQ features are left over from the previous data and were not regenerated?

wjddd commented 4 weeks ago

> It looks like a codebook index is out of bounds. Could it be that the VQ features are left over from the previous data and were not regenerated?

Hi, it may be because the dataset was too small. I switched to a new dataset with more than 100 audio files, and now it works.
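
For anyone hitting the same error, one way to test the stale-VQ-features hypothesis is to compare the shape of every extracted feature file against the codebook count in the model config. This is only a sketch under assumptions: it supposes the features are stored as `.npy` arrays of shape `(num_codebooks, frames)`, and both function names, the `data` path, and the value 8 are placeholders rather than anything confirmed in this thread.

```python
from pathlib import Path


def collect_shapes(root):
    """Map each .npy file under root to its array shape."""
    import numpy as np  # imported lazily; only this function needs NumPy

    return {
        str(p): np.load(p, mmap_mode="r").shape
        for p in sorted(Path(root).rglob("*.npy"))
    }


def mismatched_files(shapes, expected_codebooks):
    """Return paths whose array is not 2-D with expected_codebooks rows."""
    return sorted(
        path
        for path, shape in shapes.items()
        if len(shape) != 2 or shape[0] != expected_codebooks
    )


# Example usage (placeholder path and codebook count):
# shapes = collect_shapes("data")
# print(mismatched_files(shapes, expected_codebooks=8))
```

If any files are listed, regenerating the VQ features with the same model version used for fine-tuning would be the next thing to try.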