facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

FineTune Error TEXT_TO_SPEECH Same lang #424

Closed: developeranalyser closed this issue 5 months ago

developeranalyser commented 5 months ago

Why does fine-tuning not work when srcLang and tgtLang are the same?

2024-04-16 13:24:40,044 INFO -- seamless_communication.cli.m4t.finetune.trainer.11840: Start finetuning
2024-04-16 13:24:40,044 INFO -- seamless_communication.cli.m4t.finetune.trainer.11840: Run evaluation
  0% 0/78 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "/content/my_seamless_communication/src/finetune.py", line 200, in <module>
    main()
  File "/content/my_seamless_communication/src/finetune.py", line 196, in main
    finetune.run()
  File "/content/my_seamless_communication/src/seamless_communication/cli/m4t/finetune/trainer.py", line 386, in run
    self._eval_model()
  File "/content/my_seamless_communication/src/seamless_communication/cli/m4t/finetune/trainer.py", line 331, in _eval_model
    loss = self.calc_loss(batch, self.model(batch))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/parallel/distributed.py", line 1523, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/parallel/distributed.py", line 1359, in _run_ddp_forward
    return self.module(*inputs, **kwargs)  # type: ignore[index]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/my_seamless_communication/src/seamless_communication/cli/m4t/finetune/trainer.py", line 123, in forward
    assert batch.text_to_units.prev_output_tokens is not None
AssertionError
[2024-04-16 13:24:53,644] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 11840) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
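For reference, the assertion at trainer.py line 123 fires because the evaluation batch carries no target units (prev_output_tokens is None). A quick sanity check is to look at the prepared fine-tuning manifest and count how many samples are missing units; the sketch below assumes a JSON-lines manifest where each sample's "target" entry carries a "units" list, so the path and field names are assumptions to adjust for your own data.

import json

# Hypothetical manifest path; point this at the JSON-lines file produced by dataset preparation.
MANIFEST = "train_manifest.json"

total = 0
missing = 0
with open(MANIFEST, encoding="utf-8") as fp:
    for line in fp:
        line = line.strip()
        if not line:
            continue
        sample = json.loads(line)
        total += 1
        # Assumed layout: each sample's "target" entry carries a "units" list.
        units = sample.get("target", {}).get("units")
        if not units:
            missing += 1

print(f"{missing}/{total} samples have no target units")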

mhlakhani commented 4 months ago

I'm also hitting this - how did you solve this?

developeranalyser commented 3 months ago

easy

RRThivyan commented 1 week ago

Hi, can anyone tell me how to overcome this error? It seems to be caused by the units in the FLEURS dataset being null, i.e. the units were not extracted from the audio. Do we have to extract them manually, or is there another way?
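If the units really are missing from the dataset, one option is to extract them with the unit extraction model shipped in this repo. Below is a rough sketch based on the UnitExtractor example in the repository's documentation; the checkpoint name, k-means URL, and output layer index are taken from that example, while the audio path is a placeholder for your own 16 kHz mono file.

import torch
from seamless_communication.models.unit_extractor import UnitExtractor

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Checkpoint and k-means centroids as used in the repo's unit extraction example.
unit_extractor = UnitExtractor(
    "xlsr2_1b_v2",
    "https://dl.fbaipublicfiles.com/seamlessM4T/models/unit_extraction/kmeans_10k.npy",
    device=device,
)

# Placeholder audio path; the second argument (35 - 1 = 34) is the output layer
# index used in the repo's example.
units = unit_extractor.predict("/path/to/audio.wav", 35 - 1)
print(units)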