SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
https://arxiv.org/abs/2410.06885
MIT License

Train error #245

Open oztrkoguz opened 1 week ago

oztrkoguz commented 1 week ago

I prepared 300 hours of Turkish data and started training on an A100 80 GB GPU. Training begins, then fails with the following error:

The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Using tokenizer path: data/your_dataset_name_tokenizer/vocab.txt
Using logger: None
Loading dataset ...
Sorting with sampler... if slow, check whether dataset is provided with duration: 100%|█| 32/32 [00:00<00:00, 717741.86i
Creating dynamic batches with 38400 audio frames per gpu: 100%|████████████████████| 32/32 [00:00<00:00, 1032444.06it/s]
Epoch 1/11:   0%|                                                                               | 0/1 [06:41<?, ?step/s]
Traceback (most recent call last):
  File "/home/akilliceviribilisim/Oguz/F5-TTS/train.py", line 100, in <module>
    main()
  File "/home/akilliceviribilisim/Oguz/F5-TTS/train.py", line 94, in main
    trainer.train(train_dataset,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Oguz/F5-TTS/model/trainer.py", line 259, in train
    for batch in progress_bar:
                 ^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
               ^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/accelerate/data_loader.py", line 550, in __iter__
    current_batch = next(dataloader_iter)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
RecursionError: Caught RecursionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
            ~~~~~~~~~~~~^^^^^
  File "/home/akilliceviribilisim/Oguz/F5-TTS/model/dataset.py", line 110, in __getitem__
    return self.__getitem__((index + 1) % len(self.data))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Oguz/F5-TTS/model/dataset.py", line 110, in __getitem__
    return self.__getitem__((index + 1) % len(self.data))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Oguz/F5-TTS/model/dataset.py", line 110, in __getitem__
    return self.__getitem__((index + 1) % len(self.data))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 971 more times]
  File "/home/akilliceviribilisim/Oguz/F5-TTS/model/dataset.py", line 98, in __getitem__
    row = self.data[index]
          ~~~~~~~~~^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 2742, in __getitem__
    return self._getitem(key)
           ^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 2727, in _getitem
    formatted_output = format_table(
                       ^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/formatting/formatting.py", line 639, in format_table
    return formatter(pa_table, query_type=query_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/formatting/formatting.py", line 403, in __call__
    return self.format_row(pa_table)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/formatting/formatting.py", line 444, in format_row
    row = self.python_features_decoder.decode_row(row)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/formatting/formatting.py", line 222, in decode_row
    return self.features.decode_example(row) if self.features else row
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/features/features.py", line 2045, in decode_example
    for column_name, (feature, value) in zip_dict(
                                         ^^^^^^^^^
  File "/home/akilliceviribilisim/Env/tts/lib/python3.12/site-packages/datasets/utils/py_utils.py", line 324, in zip_dict
    for key in unique_values(itertools.chain(*dicts)):  # set merge all keys
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded

@ZhikangNiu
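
The repeated frames show `model/dataset.py` line 110 calling `__getitem__` recursively to retry the next index; after ~971 retries this exhausts Python's default recursion limit of 1000, which matches the `RecursionError`. A minimal sketch of an iterative retry loop that avoids unbounded recursion (the `_is_valid` check, `min_frames` parameter, and class name are hypothetical illustrations, not the actual F5-TTS code):

```python
class SafeRetryDataset:
    """Wraps a row store and skips invalid samples iteratively."""

    def __init__(self, data, min_frames=10):
        self.data = data            # underlying rows (e.g. a list or HF Dataset)
        self.min_frames = min_frames

    def _is_valid(self, row):
        # hypothetical validity check; the real one would filter by duration/frames
        return row.get("duration", 0) * 100 >= self.min_frames

    def __getitem__(self, index):
        n = len(self.data)
        # Scan forward with a loop instead of recursing, so a long run of
        # invalid samples can never blow the recursion limit.
        for offset in range(n):
            row = self.data[(index + offset) % n]
            if self._is_valid(row):
                return row
        # If we get here, every row failed validation: fail loudly instead
        # of looping forever, which also points at a broken dataset.
        raise ValueError("no valid sample found in the entire dataset")

    def __len__(self):
        return len(self.data)
```

The same symptom can also mean every row is being rejected (e.g. missing or zero `duration` metadata), so it may be worth checking the prepared dataset's duration field before changing the loader.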

ZhikangNiu commented 3 days ago

Can you provide a minimal reproducible example to help me solve this problem?

emircanerkul commented 2 days ago

It would be much appreciated if you could share the models @oztrkoguz :)

Also, can I ask where you got that data? If I want to use it commercially, is that a problem? When someone says "merhaba" and I type "merhaba", it definitely gives me a result close to that speaker's voice: not exactly the same, but not completely different either. I don't know how the legal side works for this kind of transformative tech.

emircanerkul commented 2 days ago

BTW, even if the source is public and free, does that mean the voice in the source is also free to use? I have no idea, and I guess this kind of legal/ethical question differs from country to country.

oztrkoguz commented 2 days ago

I created the dataset myself; unfortunately I cannot share it. I am still having problems with model training, and my work continues.
@emircanerkul

zgongc commented 1 day ago

Common Voice has enough Turkish audio files for TTS training. I think the dev split alone would be enough for this.

https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0/tree/main/audio/tr