Open Ha0Tang opened 2 years ago
How to generate the metadata.pkl file?

You can follow the code in conversion.ipynb
I'm also interested in generating a metadata.pkl from my own data for inference. @auspicious3000 do you happen to have the script that produces the metadata.pkl that's used for inference, not for training?
Simply adding the mel spectrograms produced by make_spect.py to the lists of speakers in make_metadata.py, instead of the file paths, does not produce sensible results with conversion.ipynb; the output is just noise.
Each metadata entry is a list of [filename, speaker embedding, spectrogram].
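Based on that description, a minimal sketch of building a `metadata.pkl` for inference might look like the following. The shapes are assumptions from the AutoVC defaults (80-bin mels from make_spect.py, 256-dim speaker embeddings from the pretrained encoder), and the random arrays are placeholders for your actual mel and embedding:

```python
import pickle
import numpy as np

# Placeholders: in practice, load the mel produced by make_spect.py
# and the speaker embedding from the pretrained speaker encoder.
mel_p225 = np.random.rand(128, 80).astype(np.float32)  # (n_frames, n_mels)
emb_p225 = np.random.rand(256).astype(np.float32)      # speaker embedding

# Each entry: [utterance id, speaker embedding, mel spectrogram]
metadata = [
    ["p225_001", emb_p225, mel_p225],
]

with open("metadata.pkl", "wb") as f:
    pickle.dump(metadata, f)
```

conversion.ipynb then loads this pickle and iterates over the entries, so one entry per utterance you want to convert (plus one per target voice) should be enough.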
@ljc222 @Ha0Tang if you go through past issues, you'll stumble across this repo/notebook, which pieces it all together to make it work end-to-end: https://github.com/KnurpsBram/AutoVC_WavenetVocoder_GriffinLim_experiments/blob/master/AutoVC_WavenetVocoder_GriffinLim_experiments_17jun2020.ipynb
Hope this helps!
> You can follow the code in conversion.ipynb
I looked at https://github.com/auspicious3000/autovc/blob/master/conversion.ipynb and I have zero idea how that helps with this.
I just need some way to, on the command line, provide the original voice as a wav/mp3, provide the file to change the voice of as wav/mp3, and get the output file with the voice changed written to disk.
How do I do that? How does anyone ever use this if something this basic isn't documented? Am I missing something obvious?
Thanks a lot to anyone with any information.
> @ljc222 @Ha0Tang if you go through past issues, you'll stumble across this repo/notebook which pieces it all together to make it work end-to-end: https://github.com/KnurpsBram/AutoVC_WavenetVocoder_GriffinLim_experiments/blob/master/AutoVC_WavenetVocoder_GriffinLim_experiments_17jun2020.ipynb
@lisabecker hi! How do you generate the metadata for inference? make_metadata.py generates metadata for training, or is this script meant to generate metadata for both training and inference, just with a different path for the mel spectrograms?
How do I test on my own data? I have a source speaker/speech and a target speaker/speech, and I want to generate the conversion, as shown on the demo page https://auspicious3000.github.io/autovc-demo/. Can anyone provide some instructions?
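For what it's worth, conversion.ipynb performs a cross conversion over the metadata entries: for every utterance it takes that utterance's mel plus every other entry's speaker embedding, so a source/target pair just means two entries in the list. A rough sketch of that pairing logic, with `fake_autovc` as a stand-in for the real torch generator `G(uttr, emb_org, emb_trg)` and zero arrays as placeholder mels/embeddings:

```python
import numpy as np

def fake_autovc(mel_source, emb_source, emb_target):
    # Stand-in for the AutoVC generator loaded from autovc.ckpt;
    # it would return the converted mel spectrogram. Here we just
    # pass the source mel through unchanged.
    return mel_source

# Two metadata entries: the source utterance and the target voice.
metadata = [
    ["source_spk", np.zeros(256, np.float32), np.zeros((128, 80), np.float32)],
    ["target_spk", np.ones(256, np.float32), np.zeros((96, 80), np.float32)],
]

conversions = []
for name_i, emb_i, mel_i in metadata:      # content comes from entry i
    for name_j, emb_j, _ in metadata:      # voice comes from entry j
        out = fake_autovc(mel_i, emb_i, emb_j)
        conversions.append((f"{name_i}_to_{name_j}", out))

print([name for name, _ in conversions])
```

The converted mels are then passed to a vocoder (the WaveNet checkpoint in the notebook, or Griffin-Lim in the linked KnurpsBram notebook) to get the output wavs.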