Open pongthang opened 2 weeks ago
Hi @pongthang, thank you for taking a look at our work! We were using a slightly modified version of the wespeaker pipeline; we'll discuss in our team when we could publish it. For now, please refer to the wespeaker pipeline. The only main difference is that we removed feature calculation from the data loading pipeline and moved it into the model. I believe the wespeaker pipeline has added the ReDimNet architecture and recipes for it.
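To illustrate the design described in the comment above (feature calculation moved out of the data loading pipeline and into the model), here is a minimal sketch in plain Python. It is not the authors' code: `frame_log_energy`, `WaveformModel`, and `mean_backbone` are hypothetical names, and the per-frame log energy is only a toy stand-in for whatever acoustic features the real model computes. The point is the interface: the data loader yields raw waveforms, and the model owns its featurizer.

```python
import math
from typing import Callable, List, Sequence


def frame_log_energy(waveform: Sequence[float],
                     frame_len: int = 400,
                     hop: int = 160) -> List[float]:
    """Toy featurizer (assumption): log energy of each full frame."""
    feats = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats


class WaveformModel:
    """Wraps a feature-based backbone so that callers pass raw waveforms.

    Feature extraction happens inside the model call rather than in the
    data loader, so training and inference share the same front end.
    """

    def __init__(self,
                 backbone: Callable[[List[float]], List[float]],
                 featurizer: Callable[[Sequence[float]], List[float]] = frame_log_energy):
        self.backbone = backbone
        self.featurizer = featurizer

    def __call__(self, waveform: Sequence[float]) -> List[float]:
        features = self.featurizer(waveform)
        return self.backbone(features)


def mean_backbone(features: List[float]) -> List[float]:
    """Hypothetical backbone: a one-dimensional 'embedding' (the feature mean)."""
    return [sum(features) / len(features)]


# One second of a 440 Hz tone at 16 kHz, as the data loader would now yield it.
waveform = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
model = WaveformModel(mean_backbone)
embedding = model(waveform)
```

In a real PyTorch setup the same idea would typically be a feature-extraction layer placed at the start of the network's forward pass, which keeps the front-end parameters bundled with the checkpoint.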
Thank you. It would be great if you could share the training process. Yes, the wespeaker pipeline has added the ReDimNet architecture and recipes for it. I will follow this. Thank you again for your support.
We'll soon be releasing a few more models trained on a much larger dataset. They should have better quality across other domains. Stay tuned.
This is great. How can I know when new models are released?
@vanIvan Will it support Chinese speaker verification? Any estimate on this? Really hoping for this.
Yes, the new models will be pretrained on voxblink2 and finetuned on voxblink2+vox2+cnceleb. They will perform better on Chinese.
@vanIvan, hi, are you planning to train smaller models, similar in size to b0 or b1, on the new dataset?
@vanIvan sounds extremely good. Is there any estimate on why will new models release?
@MonolithFoundation we released two new models yesterday: S and M models trained on the voxblink2 dataset. Please check the readme and evaluation pages for more information.
@pongthang No, we were not planning to train smaller models on the voxblink2 dataset.
@vanIvan Sorry, I made a typo. I meant: when will you release the models that combine voxblink2+vox2+cnceleb?
@MonolithFoundation we already released them yesterday. Both models, S and M, have two sets of weights: a version pretrained on voxblink2 and a version finetuned on voxblink2+vox2+cnceleb12
Thank you so much. Is there any inference script that can be used to infer on these models?
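On the inference question above: the thread does not point to a script, but the scoring step of speaker verification is commonly cosine similarity between two speaker embeddings. A minimal sketch in plain Python follows, assuming the embeddings have already been extracted by some model. The function names and the `threshold` value are hypothetical; a real threshold would need to be tuned on a development set.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a: Sequence[float],
                 emb_b: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Accept the trial if the score reaches the (assumed) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold
```

For example, `same_speaker(enroll_embedding, test_embedding)` would compare an enrollment utterance against a test utterance, where both arguments are placeholder names for embeddings produced by the same model.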
Hi, as ReDimNet was not trained on kids' data, accuracy drops (60-70%) when I tried it on a kids' dataset. So I want to fine-tune the model on kids' voices. Could you help with this? Could you provide the training script, or maybe a training script example, and a wiki on how you train it? By the way, it is a good project. Thanks.