clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

Res2net model support #91

Closed stevenhillis closed 2 years ago

stevenhillis commented 3 years ago

Add support for Res2Net models, as used in Microsoft's 2020 diarization system: https://arxiv.org/pdf/2010.11458.pdf

Shane-pe commented 3 years ago

Cheers, thank you for your contribution.

009deep commented 3 years ago

Good attempt. A few suggestions:

  1. The model you used is Res2Net50, not 34, so a rename would be good.
  2. It'd be good to parameterize the use of the SE layer and the basewidth and scale values.
  3. Run the code to make sure there are no bugs. Also note that Microsoft used Res2Net blocks, but their overall backbone architecture was different from the one presented in the Res2Net paper. Details of their architecture can be found in the references of the paper.
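
For suggestion 2, here is a minimal sketch of what the parameterization could look like. It is not code from this PR: `Res2NetBlockConfig` and its field names are hypothetical, and the width formula and default values (base width 26, scale 4, expansion 4) are assumed from the reference Res2Net bottleneck as I understand it, so please check them against the implementation you are porting.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Res2NetBlockConfig:
    """Hypothetical settings for one Res2Net bottleneck block."""

    planes: int            # nominal channel count of the block
    base_width: int = 26   # per-group width when planes == 64 (assumed default)
    scale: int = 4         # number of feature groups the block splits into
    use_se: bool = False   # whether a squeeze-and-excitation layer is appended
    expansion: int = 4     # output channel multiplier of the bottleneck

    def __post_init__(self):
        # Reject values that would produce an empty or ill-defined block.
        if self.planes < 1 or self.base_width < 1 or self.scale < 1:
            raise ValueError("planes, base_width and scale must be >= 1")

    @property
    def width(self) -> int:
        # Channels per feature group, scaled relative to a 64-channel stage.
        return int(math.floor(self.planes * (self.base_width / 64.0)))

    @property
    def split_channels(self) -> int:
        # Output channels of the first 1x1 conv, later split into `scale` groups.
        return self.width * self.scale

    @property
    def num_branch_convs(self) -> int:
        # One group is passed through unchanged unless scale == 1.
        return 1 if self.scale == 1 else self.scale - 1

    @property
    def out_channels(self) -> int:
        # Output channels of the final 1x1 conv.
        return self.planes * self.expansion


if __name__ == "__main__":
    cfg = Res2NetBlockConfig(planes=64, use_se=True)
    print(cfg.width, cfg.split_channels, cfg.num_branch_convs, cfg.out_channels)
```

With something like this, the SE layer, base width and scale become arguments that can be set from the training config instead of being hard-coded in the model file.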