Closed LordSyd closed 4 months ago
First of all, thanks for your work!
However, I have some issues following the instructions in the repo, as they are either misleading or contradictory.
The first thing that throws me is that the notebook to create the dataset creates a new metadata.csv, but the final instruction is to copy the train_list.txt and val_list.txt plus the segmented wavs to the Data folder for training.
What was the point of following the whole notebook in that case? Or are you talking about the files created after the notebook is finished?
Also, the first cell in the notebook throws an error, which seems to be an API token that expired.
Also, it is not clear to me whether I need to manually remove the files in the badAudio folder from the val_list and train_list files. I always get this error when running the final command:
soundfile.LibsndfileError: <exception str() failed>
Some digging suggested that there could be missing files - which would make sense if I copied the original val_list and train_list.
Can you offer some assistance?
OK, so if you start over and follow the instructions in the readme, you will notice that I mention what the curate notebook is for. It is not mandatory for training, nor is it involved in creating your dataset.
The curate notebook is for a more advanced level of dataset curation, where we look at the standard distribution of a few different metrics, get a visual, then cull the dataset. To use it, you would have to modify the stts2 train list and val list to fit the LJSpeech format the notebook expects.
As for the bad audio folder: it contains segments that are too long or too short. You can discard them or resegment them as you wish.
For the libsndfile error, I would need to see the entire traceback, please.
Thanks for the swift reply!
Here is the stacktrace:
Traceback (most recent call last):
File "/data/StyleTTS2/train_finetune_accelerate.py", line 714, in <module>
main()
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/data/StyleTTS2/train_finetune_accelerate.py", line 265, in main
for i, batch in enumerate(train_dataloader):
File "/home/user/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
current_batch = next(dataloader_iter)
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/home/user/.local/lib/python3.10/site-packages/torch/_utils.py", line 705, in reraise
raise exception
soundfile.LibsndfileError: <exception str() failed>
Traceback (most recent call last):
File "/home/user/.local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
simple_launcher(args)
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_finetune_accelerate.py', '--config_path', './Configs/config_ft.yml']' returned non-zero exit status 1.
But I'll also try to start again and see if that changes anything.
What is the sample rate of your audio files? They should be 24 kHz, mono, 16-bit.
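A quick way to check files against that target is sketched below. It uses only Python's standard `wave` module, so it assumes the segments are plain PCM WAV files; the function name `check_wav_format` and its defaults are illustrative and not part of the repo.

```python
import wave


def check_wav_format(path, rate=24000, channels=1, sample_width=2):
    """Return a list of format mismatches for a PCM WAV file (empty if it matches).

    Defaults correspond to 24 kHz, mono, 16-bit (2 bytes per sample).
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channel(s), expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(
                f"{8 * wav.getsampwidth()}-bit samples, expected {8 * sample_width}-bit"
            )
    return problems
```

Running it over every segment and printing the non-empty results should show which files, if any, still need converting before training.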
Thank you for the tip on the sample rate and format.
That might be the cause. I followed this YouTube tutorial https://youtu.be/5-Dk3ooxn2Q?si=SgqggJIyGza1lhhJ&t=1806 to download a video of mine to use for training. The video mentions the sample rate and bitrate, but I didn't notice a step where the audio files are converted to the correct format - though I might have overlooked it. I'll try to get the files into the correct format and try again; if that does not fix it, I will follow the steps again from the start.
Edit: Another point of confusion:
The repo tells you to keep the "OOD_list.txt" file, but in the Data folder there is an "_OODtexts.txt" file. Is that just a typo, or is it a different file from the LibriTTS dataset?
Edit #2: After installing an old version of soundfile, where the error reporting isn't broken, I now get this new and improved (tm) stacktrace. The file path points to a file in the LJSpeech folder, but I am unsure where these files should come from.
Traceback (most recent call last):
File "/data/StyleTTS2/train_finetune_accelerate.py", line 714, in <module>
main()
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/data/StyleTTS2/train_finetune_accelerate.py", line 265, in main
for i, batch in enumerate(train_dataloader):
File "/home/user/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
current_batch = next(dataloader_iter)
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
return self._process_data(data)
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
data.reraise()
File "/home/user/.local/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/user/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/StyleTTS2/meldataset.py", line 110, in __getitem__
wave, text_tensor, speaker_id = self._load_tensor(data)
File "/data/StyleTTS2/meldataset.py", line 141, in _load_tensor
wave, sr = sf.read(osp.join(self.root_path, wave_path))
File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 256, in read
with SoundFile(file, 'r', samplerate, channels,
File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 629, in __init__
self._file = self._open(file, mode_int, closefd)
File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 1183, in _open
_error_check(_snd.sf_error(file_ptr),
File "/home/user/.local/lib/python3.10/site-packages/soundfile.py", line 1357, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '/local/LJSpeech-1.1/wavs/out_111.wav': System error.
Traceback (most recent call last):
File "/home/user/.local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
simple_launcher(args)
File "/home/user/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_finetune_accelerate.py', '--config_path', './Configs/config_ft.yml']' returned non-zero exit status 1.
OK, I found the error: I also needed to change the root path in config_ft.yml. Sorry for the oversight.
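For anyone hitting the same error, here is a minimal sketch of a pre-flight check that would have caught it. It assumes each line of the list files starts with the wav path followed by `|` (the pipe-separated layout is an assumption here), and it joins that path to the root path the same way `meldataset.py` does in the traceback above. The function name `find_missing_wavs` is hypothetical.

```python
import os


def find_missing_wavs(list_path, root_path):
    """Return the full paths of wav files named in a list file but absent on disk.

    Assumes the wav path is the first '|'-separated field on each line.
    """
    missing = []
    with open(list_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            wav_rel = line.split("|")[0]
            full_path = os.path.join(root_path, wav_rel)
            if not os.path.isfile(full_path):
                missing.append(full_path)
    return missing
```

Calling it once for the train list and once for the val list, with the root path copied from the config, lists every entry that would fail to open. If every entry is reported missing, the root path itself is the likely culprit, as it was here.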