Hey. For the first issue, please check that you have downloaded the "priors" data from nexusmods. It is currently not shipped with the Steam build.
Thanks for the clarification; this gave me a few headaches. Consider the first issue to be a wish for a better error message now ;)
Update: sorry, but downloading the data files and extracting them didn't solve the issue. I downloaded both data files and extracted them using 7zip, but it's showing the same error with the same stacktrace. This is the directory layout after extracting:
Update: Windows and 7zip struggled so much that (for some reason) they showed files that were not actually there.
Is it working ok now?
If not, the other issue might be the fine-tuning dataset formatting, if the audio files can't be found by the app. To clarify, there should be wav files inside the "wavs" folder, and next to the "wavs" folder there should be a metadata.csv file with the corresponding transcripts.
So in the app, in the training config, the dataset path for "ar_priors_x" (if hypothetically that were your custom dataset) should be ...../resources/app/xvapitch/PRIORS/ar_priors_x
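As an illustration only (not taken from the app's code), here is a minimal Python sketch of that expected layout: a "wavs" folder with the audio, and a metadata.csv next to it. The pipe-delimited "file|text" line format and the check_dataset helper are assumptions for the example, not confirmed details of the app.

```python
from pathlib import Path

def check_dataset(root):
    """Check that a fine-tuning dataset folder has the layout described above."""
    root = Path(root)
    wavs = root / "wavs"
    meta = root / "metadata.csv"
    assert wavs.is_dir(), f"missing folder: {wavs}"
    assert meta.is_file(), f"missing file: {meta}"
    for line in meta.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Assumed format: "<file>|<transcript>" per line.
        name = line.split("|", 1)[0].strip()
        wav = wavs / (name if name.endswith(".wav") else name + ".wav")
        if not wav.is_file():
            print(f"metadata.csv references a missing wav: {wav}")

# e.g. check_dataset("resources/app/xvapitch/PRIORS/ar_priors_x")
```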
The dataset itself isn't the problem; it's about adding training configurations. I have already added copies of the dataset with cleaned-up WAVs from different folders, but when there are already training tasks in the list and I try to add another one, it just doesn't appear in the list, no matter whether I re-use a dataset from another training task or select an unused one.
On a side note, the training stopped after a few hours with an OOM (system RAM, not VRAM), which was a bit surprising given that I have 64GB.
There is currently a fair bit of data caching in the dataloader. I've removed it for the next update, but meanwhile you can reduce the number of workers in the training config, which should use up less RAM.
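For context on why fewer workers helps, here is a minimal, hypothetical sketch (not the app's actual dataloader): when a dataset caches decoded audio per process, every DataLoader worker holds its own copy of that cache, so system RAM grows roughly with num_workers.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CachingAudioDataset(Dataset):
    def __init__(self, paths):
        self.paths = paths
        self.cache = {}  # per-process cache; duplicated in every worker process

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        if idx not in self.cache:
            # Placeholder for loading/decoding a wav file.
            self.cache[idx] = torch.zeros(16000)
        return self.cache[idx]

# Fewer workers -> fewer duplicated caches -> lower system RAM usage.
loader = DataLoader(CachingAudioDataset(["a.wav", "b.wav"]),
                    batch_size=2, num_workers=2)
```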
For the training queue, would you please be able to share the app.log file located next to the xVATrainer.exe? It might have an error stack to indicate what's wrong.
Hm. I can't reproduce it anymore, yet I'm certain it happened multiple times. The log file is fairly uninteresting, except for this single line, but that's probably just an invalid training configuration: [line: 702] onerror: Uncaught TypeError: trainingAddConfigCkptPathInput.replaceAll is not a function
About the data caching: I have moved every reference dataset except the speaker's language to a different folder. Training still eats ~30 GB of RAM, but it doesn't go OOM anymore. Are there consequences to excluding most of the "priors" datasets?
Just fixed that error. As for not including the priors: every priors folder contains synthetic data for a different language. This data is used during training to ensure that models you fine-tune on mono-language, mono-speaker-style data will not lose any knowledge of the other languages, nor vocal range (useful for voice conversion and pitch/emotion/style manipulation). I recommend not messing with the priors datasets, unless you choose to add MORE data to them (e.g. your own, higher-quality non-synthetic data).
I'm pushing an update through today, which makes the training consume less system RAM.
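Purely as an illustration of the priors idea above (this is not the app's implementation), one common way to keep a multilingual model from forgetting other languages during single-language fine-tuning is to mix a fraction of priors samples back into each training set. The function and parameter names below are made up for the sketch.

```python
import random
from torch.utils.data import ConcatDataset, Subset

def build_training_set(finetune_ds, priors_ds, priors_fraction=0.2):
    """Combine the fine-tuning data with a random slice of the priors data."""
    n_priors = int(len(finetune_ds) * priors_fraction)
    idx = random.sample(range(len(priors_ds)), min(n_priors, len(priors_ds)))
    return ConcatDataset([finetune_ds, Subset(priors_ds, idx)])
```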
Oh, okay, that may explain why the voice synthesis I did is so emotionless. Thanks for that explanation. I'm wondering whether re-training from scratch or just adding the additional priors later on makes a huge difference, though.
As a side note, I created a new base dataset with essentially complete sentences of the reference voice, and the UI hint about the VRAM-to-batch-size ratio doesn't align with it: it went OOM with a batch size of 12 (I have 12 GB of VRAM). Using my voice samples with a batch size of 8, it uses about 10 GB of VRAM and around 30 GB of system RAM (the latter because I removed every other language from the priors). I'm just guessing here, but it seems that at certain points it needs an additional 1-2 GB just to save the checkpoints, which in turn leads to an OOM if the batch size is too large. In other words, the UI hint about the batch size seems a bit misleading.
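One hedged guess at that checkpoint-time bump, sketched below with placeholder names (model, optimizer, path): copying the state dicts to CPU before serialising keeps checkpoint saving from needing any extra GPU headroom. Whether xVATrainer actually saves this way, or whether checkpointing is even the real cause of the spike, is not confirmed here.

```python
import torch

def save_checkpoint(model, optimizer, path):
    # Move model weights to CPU first so torch.save never touches GPU memory.
    state = {
        "model": {k: v.detach().cpu() for k, v in model.state_dict().items()},
        "optimizer": optimizer.state_dict(),
    }
    torch.save(state, path)
```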
Multiple issues. Checking data files through Steam didn't show any error. Cleaning up the dataset didn't help.
The line numbers there don't match up with xva_train.py, and changes to that file to debug this are completely ignored, whereas changes to e.g. dataset.py work fine. Throwing an exception in read_datasets shows that, at least at one point, it's returning the correct dataset.
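A small, generic debugging sketch for the "my edits are ignored / line numbers don't match" symptom: print which copy of the module Python actually resolves. This assumes the module name xva_train is importable from the same environment the app uses; if the app runs a bundled or frozen copy, the path it loads will differ from the file being edited.

```python
import importlib.util

# If the printed path is not the file you are editing, your edits will
# appear to be ignored and reported line numbers will not match.
spec = importlib.util.find_spec("xva_train")
print(spec.origin if spec else "xva_train not found on sys.path")
```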