Open zhao1025 opened 11 months ago
Hello,
Sorry for the late reply. The latest GAN (v2.0) does not generate ECAPA-TDNN vectors but a custom speaker style embedding based on Global Style Tokens. There is a previous version (v1.2) that was trained on concatenated ECAPA-TDNN and x-vector embeddings. Make sure to use the code in the gan_embeddings branch for this model. These concatenated training embeddings have 704 dimensions, of which the first 192 are the ECAPA-TDNN embedding and the last 512 are the x-vector of the training speaker. The outputs of gan.pt are also 704-dimensional embeddings. However, I don't know whether the GAN learned that the first 192 dimensions should resemble an ECAPA-TDNN embedding. You could try it: generate a speaker embedding, extract the first 192 dimensions, and treat this vector as an ECAPA-TDNN embedding. Let me know if you do this and whether it works.
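To make the suggested split concrete, here is a minimal sketch of the slicing step only. It assumes the 704-dimensional layout described above (192 ECAPA-TDNN dimensions followed by 512 x-vector dimensions). The helper name `split_concatenated_embedding` is hypothetical and not part of the repository, and the random vector is only a stand-in for a real output of gan.pt.

```python
import random

ECAPA_DIM = 192    # first block, per the layout described above
XVECTOR_DIM = 512  # last block, per the layout described above
TOTAL_DIM = ECAPA_DIM + XVECTOR_DIM  # 704


def split_concatenated_embedding(embedding):
    """Split one 704-dim vector into (ecapa_part, xvector_part).

    Hypothetical helper: uses plain slicing, so it works on any 1-D
    sequence that supports len() and slices.
    """
    if len(embedding) != TOTAL_DIM:
        raise ValueError(f"expected {TOTAL_DIM} dimensions, got {len(embedding)}")
    return embedding[:ECAPA_DIM], embedding[ECAPA_DIM:]


# Placeholder for a generated embedding; these values are random,
# NOT an actual GAN output.
rng = random.Random(0)
generated = [rng.gauss(0.0, 1.0) for _ in range(TOTAL_DIM)]

ecapa_part, xvector_part = split_concatenated_embedding(generated)
print(len(ecapa_part), len(xvector_part))  # 192 512
```

Whether `ecapa_part` actually behaves like an ECAPA-TDNN embedding is the open question from the reply above. The sketch only shows where the boundary lies.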
The repository currently does not contain the code for training the GAN. I plan to add it soon and will inform you about it once this is done.
Okay, if I make this attempt, I will inform you of the results. Thank you for your reply!
Hello, I would like to use ECAPA vectors for anonymization. Can the gan.pt file you provided be used directly for anonymization training with ECAPA vectors, or does a new gan.pt file need to be trained separately? If retraining is required, could you please describe the training method? Thank you.