yuanlorna opened this issue 3 years ago
As explained in the readme, "This project only shows how to generate speaker embeddings using pre-trained model for uis-rnn training in later." I suppose the procedure for training the speaker-embedding model can be found in this repo or this one (official VGG code). It was trained on the OpenSLR, VCTK, and VoxCeleb datasets.
Concerning the pretrained UIS-RNN weights, I assume they were trained on the 4-speaker dataset available in this repo, but a clear confirmation would be great.
May I ask how "./ghostvlad/training_data.npz" is generated? In addition, which dataset was the "pretrained/saved_model.uisrnn_benchmark" model trained on?
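For context, here is a hedged sketch of one plausible way a file like training_data.npz could be packaged for uis-rnn training. The key names (`train_sequence`, `train_cluster_id`) follow the layout used in uis-rnn's demo data and are an assumption about this repo's file; random vectors stand in for the GhostVLAD embedding extraction step, and the 512-d embedding size is also assumed.

```python
import numpy as np

# Hypothetical sketch: package per-utterance speaker embeddings into an
# .npz with the keys uis-rnn's demo expects. The real pipeline would run
# each audio segment through the pretrained GhostVLAD network; here random
# vectors stand in for those embeddings.

rng = np.random.default_rng(0)
EMB_DIM = 512  # assumed embedding dimensionality

# Pretend each utterance yields (num_segments, EMB_DIM) embeddings
# labeled with a speaker id.
utterances = [("spk0", 5), ("spk1", 3), ("spk0", 4)]

sequences = []
cluster_ids = []
for speaker, num_segments in utterances:
    emb = rng.standard_normal((num_segments, EMB_DIM)).astype(np.float32)
    sequences.append(emb)
    cluster_ids.extend([speaker] * num_segments)

train_sequence = np.concatenate(sequences, axis=0)
train_cluster_id = np.array(cluster_ids)

np.savez("training_data.npz",
         train_sequence=train_sequence,
         train_cluster_id=train_cluster_id)

# Reload to confirm the layout.
data = np.load("training_data.npz")
print(data["train_sequence"].shape)  # (12, 512)
```

If the maintainers confirm the actual key names and embedding size, the sketch above should be adjusted accordingly.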