Closed skol101 closed 1 year ago
I'm sorry for the inconvenience. This is a bug in the uploaded version of the code. To use SR, just change that line to:
    i = random.randint(68, 92)
    c_filename = filename.replace(".wav", f"_{i}.npy")
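For anyone following along, here is a minimal sketch of what that two-line change does, written as a standalone helper. The function name `sr_feature_path` and its parameters are hypothetical and are not part of the repository; only the index range 68 to 92 and the `_{i}.npy` naming come from the fix above. It swaps only the trailing `.wav` rather than calling `str.replace`, which is a small deviation from the quoted lines.

```python
import random


def sr_feature_path(wav_path, low=68, high=92, ext=".npy", rng=random):
    """Pick one SR-augmented content-feature file for a given wav path.

    Hypothetical helper: assumes one feature file per index in the
    inclusive range [low, high], named "<stem>_<i><ext>".
    """
    if not wav_path.endswith(".wav"):
        raise ValueError(f"expected a .wav path, got {wav_path!r}")
    # random.randint includes both endpoints.
    i = rng.randint(low, high)
    # Replace only the trailing ".wav" so a ".wav" inside a directory
    # name is left untouched.
    return wav_path[: -len(".wav")] + f"_{i}{ext}"


if __name__ == "__main__":
    print(sr_feature_path("dataset/p225/p225_001.wav"))
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which can help when debugging a data loader.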
I just changed the readme, thank you for your reminder.
https://github.com/OlaWod/FreeVC/blob/main/preprocess_sr.py creates .pt files (not .npy).
This is because of the difference between Hubert-soft and WavLM: the Hubert-soft output is saved as .npy, while the WavLM output is saved as .pt.
I see, so that means preprocess_sr must also be updated to use Hubert-soft and re-run for the dataset.
Yes
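After re-running the preprocessing, a quick check like the one below can confirm that every wav has its augmented features before training starts. This is only a sketch under assumptions: the helper name `missing_sr_features`, the directory layout (features mirroring the wav tree), and one `.npy` file per index from 68 to 92 inclusive are all guesses, not something the repository guarantees.

```python
from pathlib import Path


def missing_sr_features(wav_dir, feature_dir, low=68, high=92, ext=".npy"):
    """Return the expected augmented feature files that do not exist.

    Hypothetical check: assumes feature_dir mirrors wav_dir and holds
    "<stem>_<i><ext>" for every i in the inclusive range [low, high].
    """
    wav_dir = Path(wav_dir)
    feature_dir = Path(feature_dir)
    missing = []
    for wav in sorted(wav_dir.rglob("*.wav")):
        rel = wav.relative_to(wav_dir)
        for i in range(low, high + 1):
            candidate = feature_dir / rel.with_name(f"{rel.stem}_{i}{ext}")
            if not candidate.exists():
                missing.append(candidate)
    return missing


if __name__ == "__main__":
    # Placeholder paths; point these at the real dataset folders.
    for path in missing_sr_features("dataset/wav", "dataset/features"):
        print("missing:", path)
```

An empty result means the data loader should never hit a missing-file error for any index it may draw.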
Training time goes significantly up with SR augmentation...which is probably expected.
Actually, the worst part of SR augmentation is the storage space it needs. Also, QuickVC is already four-month-old research. I have since found a better content feature than Hubert-soft: it doesn't need augmentation and can achieve cross-lingual voice conversion. If you are interested, you are welcome to take a look at ConsistencyVC.
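To put a rough number on the storage cost, here is a back-of-the-envelope sketch. It assumes one stored feature file per index from 68 to 92 inclusive (25 copies per utterance); the function name and the example sizes are made up for illustration and are not measurements from this project.

```python
def sr_storage_bytes(num_utterances, bytes_per_feature, low=68, high=92):
    """Estimate disk space for SR-augmented features.

    Assumes one feature file per index in the inclusive range
    [low, high], each roughly bytes_per_feature in size.
    """
    copies = high - low + 1  # 25 copies for the range 68..92
    return num_utterances * copies * bytes_per_feature


if __name__ == "__main__":
    # Illustrative numbers only: 1,000 utterances at about 100 kB each.
    total = sr_storage_bytes(1_000, 100_000)
    print(f"{total / 1e9:.1f} GB")
```

Under these assumptions the augmented features take 25 times the space of a single unaugmented set.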
It looks like, unlike FreeVC https://github.com/OlaWod/FreeVC/blob/81c169cdbfc97ff07ee2f501e9b88d543fc46126/data_utils.py#L72C9-L72C9, your code doesn't explicitly use this param. In fact, it doesn't use SR at all?
https://github.com/quickvc/QuickVC-VoiceConversion/blob/277118de9c81d1689e16be8a43408eda4223553d/data_utils_new_new.py#L70