Guide to fine-tuning ReDimNet

IDRnD / ReDimNet

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

MIT License

Guide to fine-tuning ReDimNet #14

Open pongthang opened 2 weeks ago

pongthang commented 2 weeks ago

Hi, since ReDimNet was not trained on kids' data, accuracy drops (60-70%) when I try it on a kids' dataset. So I want to fine-tune the model on kids' voices. Could you help with this? Could you provide the training script, or maybe an example training script, and some wiki on how you trained it? By the way, it is a good project. Thanks.

vanIvan commented 1 week ago

Hi @pongthang, thank you for taking a look at our work! We used a slightly modified version of the wespeaker pipeline; we'll discuss in our team when we can publish it. For now, please refer to the wespeaker pipeline. The only main difference is that we removed feature calculation from the data loading pipeline and moved it into the model. I believe the wespeaker pipeline has added the ReDimNet architecture and recipes for it.
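
To illustrate the difference described above, here is a minimal PyTorch sketch. It is not the authors' training code. It assumes a data loader that yields raw waveforms, and every class name in it (`LogSpectrogram`, `MeanPoolBackbone`, `WaveformSpeakerModel`) is hypothetical. The log-magnitude spectrogram front end and the 192-dimensional output are arbitrary choices made for the example and are not taken from ReDimNet.

```python
import torch
import torch.nn as nn


class LogSpectrogram(nn.Module):
    """Hypothetical front end: raw waveform -> log-magnitude spectrogram."""

    def __init__(self, n_fft: int = 512, hop_length: int = 160):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        # A buffer follows the model across devices but is not trained.
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) -> (batch, n_fft // 2 + 1, frames)
        spec = torch.stft(
            wav,
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        )
        return torch.log(spec.abs() + 1e-6)


class MeanPoolBackbone(nn.Module):
    """Toy stand-in for a real speaker backbone (assumption, not ReDimNet)."""

    def __init__(self, n_bins: int = 257, emb_dim: int = 192):
        super().__init__()
        self.proj = nn.Linear(n_bins, emb_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Average over time, then project to a fixed-size embedding.
        return self.proj(feats.mean(dim=-1))


class WaveformSpeakerModel(nn.Module):
    """Feature calculation lives inside the model, not in the data loader."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.frontend = LogSpectrogram()
        self.backbone = backbone

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        return self.backbone(self.frontend(wav))


model = WaveformSpeakerModel(MeanPoolBackbone())
emb = model(torch.randn(2, 16000))  # two 1-second clips at 16 kHz
print(emb.shape)  # torch.Size([2, 192])
```

With this layout the data loading pipeline only has to return waveform tensors, and the features are computed on whatever device the model runs on.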

pongthang commented 1 week ago

Thank you. It would be great if you could share the training process. Yes, the wespeaker pipeline has added the ReDimNet architecture and recipes for it. I will follow that; thank you again for your support.

vanIvan commented 1 week ago

We'll soon be releasing a few more models trained on a much larger dataset. They should have better quality across other domains. Stay tuned.

pongthang commented 1 week ago

> We'll soon be releasing a few more models trained on a much larger dataset. They should have better quality across other domains. Stay tuned.

This is great. How can I know when new models are released?

MonolithFoundation commented 2 days ago

@vanIvan Will it support Chinese speaker verification? Any estimate on this? Really hoping for this.

vanIvan commented 2 days ago

Yes, the new models will be pretrained on voxblink2 and finetuned on voxblink2+vox2+cnceleb. They will perform better on Chinese.

pongthang commented 1 day ago

@vanIvan, hi, are you planning to train smaller models, similar in size to b0 or b1, on the new dataset?

MonolithFoundation commented 1 day ago

@vanIvan sounds extremely good. Is there any estimate on why will new models release?

vanIvan commented 1 day ago

@MonolithFoundation we released two new models yesterday, S and M, trained on the voxblink2 dataset. Please check the README and evaluation pages for more information.

vanIvan commented 1 day ago

@pongthang No, we are not planning to train smaller models on the voxblink2 dataset.

MonolithFoundation commented 1 day ago

@vanIvan Sorry, that was a typo. I meant: when will you release the models that combine voxblink2+vox2+cnceleb?

vanIvan commented 1 day ago

@MonolithFoundation we already released them yesterday. Both the S and M models have two sets of weights: a version pretrained on voxblink2 and a version finetuned on voxblink2+vox2+cnceleb12

MonolithFoundation commented 1 day ago

Thank you so much. Is there any inference script that can be used to run inference with these models?