Some Issues with instructions and provided notebook

First of all, thanks for your work!

However, I have some issues following the instructions in the repo, as they are either misleading or contradictory.

The first thing that throws me is that the notebook to create the dataset creates a new metadata.csv, but the final instruction is to copy the train_list.txt and val_list.txt plus the segmented wavs to the Data folder for training.

What was the point of following the whole notebook in that case? Or do you mean the files created after the notebook has finished?

Also, the first cell in the notebook throws an error that seems to be caused by an expired API token.

Also, it is not clear to me whether I need to manually remove the files in the badAudio folder from the val_list and train_list files. I always get this error when running the final command:

soundfile.LibsndfileError: <exception str() failed>

Some digging suggested that there could be missing files - which would make sense if I copied the original val_list and train_list.

Can you offer some assistance?

OK, so if you start over and follow the instructions in the README, you will notice that I mention what the curate notebook is for. It is not mandatory for training, nor does it involve the creation of your dataset.

The curate notebook is for a more advanced level of dataset curation, where we look at the standard distribution of a few different metrics, get a visual, then cull the dataset. To use it, you would have to modify the StyleTTS2 train list and val list to fit the LJSpeech format the notebook expects.

As for the badAudio folder: it contains segments that are too long or too short. You can discard them or resegment them as you wish.

For the libsndfile error, I would need to see the entire traceback, please.

Thanks for the swift reply!

Here is the stacktrace:

Traceback (most recent call last):
  File "/data/StyleTTS2/train_finetune_accelerate.py", line 714, in <module>
    main()
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/data/StyleTTS2/train_finetune_accelerate.py", line 265, in main
    for i, batch in enumerate(train_dataloader):
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/user/.local/lib/python3.10/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
soundfile.LibsndfileError: <exception str() failed>
Traceback (most recent call last):
  File "/home/user/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_finetune_accelerate.py', '--config_path', './Configs/config_ft.yml']' returned non-zero exit status 1.

But I'll also try to start again and see if that changes anything.

What is the sample rate of your audio files? They should be 24 kHz, mono, 16-bit.
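For anyone wanting to check this quickly, here is a minimal sketch that scans a folder of wavs and reports any file that is not 24 kHz, mono, 16-bit. It uses Python's standard `wave` module, which only reads plain PCM WAV files, so other encodings are simply reported as unreadable. The function names `check_wav` and `check_folder` are made up for this example and are not part of the repo.

```python
import wave
from pathlib import Path

# Target format mentioned in this thread: 24 kHz, mono, 16-bit.
TARGET_RATE = 24000
TARGET_CHANNELS = 1
TARGET_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit


def check_wav(path):
    """Return a list of human-readable mismatches for one WAV file."""
    problems = []
    try:
        with wave.open(str(path), "rb") as wav:
            if wav.getframerate() != TARGET_RATE:
                problems.append(f"sample rate {wav.getframerate()} Hz")
            if wav.getnchannels() != TARGET_CHANNELS:
                problems.append(f"{wav.getnchannels()} channels")
            if wav.getsampwidth() != TARGET_SAMPLE_WIDTH:
                problems.append(f"{8 * wav.getsampwidth()}-bit samples")
    except (wave.Error, EOFError, OSError) as exc:
        # Missing, truncated, or non-PCM files end up here.
        problems.append(f"unreadable: {exc}")
    return problems


def check_folder(wav_dir):
    """Map each non-conforming file name in wav_dir to its list of problems."""
    report = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[path.name] = problems
    return report
```

Calling `check_folder()` on the folder that holds your segmented wavs returns an empty dict when every file already matches the expected format.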

Thank you for the tip on the sample rate and format.

That might be the cause. I followed this YouTube tutorial https://youtu.be/5-Dk3ooxn2Q?si=SgqggJIyGza1lhhJ&t=1806 to download a video of mine to use for training. The video mentions the sample rate and bit rate, but I didn't notice a step where the audio files are converted to the correct format, though I might have overlooked it. I'll try to get the files into the correct format and try again; if that does not fix it, I will follow the steps again from the start.

Edit: Another point of confusion:

The repo tells you to keep the "OOD_list.txt" file, but in the data folder there is an "_OODtexts.txt" file. Is that just a typo, or is it a different file from the LibriTTS dataset?

Edit #2: After installing an older version of soundfile, in which the error reporting isn't broken, I get this new and improved (tm) stack trace. The file path points to a file in the LJSpeech folder, but I am unsure where these files should come from:

Traceback (most recent call last):
  File "/data/StyleTTS2/train_finetune_accelerate.py", line 714, in <module>
    main()
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/data/StyleTTS2/train_finetune_accelerate.py", line 265, in main
    for i, batch in enumerate(train_dataloader):
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/home/user/.local/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/StyleTTS2/meldataset.py", line 110, in __getitem__
    wave, text_tensor, speaker_id = self._load_tensor(data)
  File "/data/StyleTTS2/meldataset.py", line 141, in _load_tensor
    wave, sr = sf.read(osp.join(self.root_path, wave_path))
  File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 256, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 1183, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '/local/LJSpeech-1.1/wavs/out_111.wav': System error.

Traceback (most recent call last):
  File "/home/user/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_finetune_accelerate.py', '--config_path', './Configs/config_ft.yml']' returned non-zero exit status 1.

OK, I found the error: I also needed to change the root path in config_ft.yml. Sorry for the oversight.
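In case it helps someone who hits the same error, here is a rough sketch of a pre-flight check that lists every wav named in a train or val list that cannot be found under the configured root path. The traceback above shows the loader joining the root path with the wav path from the list. The sketch assumes each line of the list is pipe-delimited with the wav path as the first field, and `find_missing` is a name invented for this example.

```python
from pathlib import Path


def find_missing(list_path, root_path):
    """Return the wav paths from a list file that do not exist under root_path."""
    missing = []
    with open(list_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumption: the wav path is the first pipe-delimited field.
            wav_name = line.split("|", 1)[0]
            if not (Path(root_path) / wav_name).is_file():
                missing.append(wav_name)
    return missing
```

Run it once for train_list.txt and once for val_list.txt with the same root path as in the config. If nearly every entry comes back as missing, the root path is probably wrong; if only a few do, those files were likely moved or deleted.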

IIEleven11 / StyleTTS2FineTune

Some Issues with instructions and provided notebook #14