Obtain Tone Color Embedding
The source_se is the tone color embedding of the base speaker. It is an average over multiple sentences generated by the base speaker. We provide the result directly here, but readers are free to extract source_se themselves.
source_se = torch.load(f'{ckpt_base}/en_default_se.pth').to(device)
Dear Developers!
The passage above is taken from demo_part1.ipynb.
Given that I have both the base speaker and the audio files, how can I generate the "default_se.pth" file myself?
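Since the notebook describes source_se as an average over several sentences, one possible approach is sketched below. It is not confirmed by the developers. The names build_source_se and extract_embedding are hypothetical: extract_embedding stands for whatever function turns one audio file into a tone color embedding tensor, and it is passed in so the sketch does not depend on any particular OpenVoice call.

```python
import os
import tempfile
from typing import Callable, Sequence

import torch


def build_source_se(
    wav_paths: Sequence[str],
    extract_embedding: Callable[[str], torch.Tensor],
    save_path: str,
) -> torch.Tensor:
    """Average per-sentence embeddings and save the result as a .pth file.

    `extract_embedding` is a hypothetical callable (an assumption of this
    sketch) that maps one audio path to one embedding tensor.
    """
    if not wav_paths:
        raise ValueError("need at least one audio file")

    # One embedding per sentence; detach so no autograd graph is kept.
    embeddings = [extract_embedding(p).detach().cpu() for p in wav_paths]

    # The mean over the sentence axis gives one speaker-level embedding.
    source_se = torch.stack(embeddings).mean(dim=0)

    # Create the target directory if the path has one, then save.
    os.makedirs(os.path.dirname(save_path) or ".", exist_ok=True)
    torch.save(source_se, save_path)
    return source_se
```

The saved file could then be loaded the same way as in the notebook, with torch.load(...).to(device). The repository's ToneColorConverter appears to offer an extract_se method that takes a list of wav files and an optional save path and does this averaging in one call, but I have not checked that against the current code, so please verify it in the installed version.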