VITA-Group / AutoSpeech

[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang
https://arxiv.org/abs/2005.03215
MIT License

Trained Model? #2

Closed: Joepetey closed this issue 4 years ago

Joepetey commented 4 years ago

Any chance you guys could upload the model weights for this?

shaojinding commented 4 years ago

We will release the pre-trained model soon. Sorry for the wait.

Joepetey commented 4 years ago

Awesome, thank you! Any chance you could give an estimated release date? I just want to know whether I should train the model myself or wait for the weights.

shaojinding commented 4 years ago

I'm currently doing an internship outside the university, so I will only be able to access the models at the start of the fall semester. In the meantime, I will check with my colleagues to see whether they can access the models and release them.

czy97 commented 4 years ago

@Joepetey Hey, buddy. Are you reproducing the results now? I ran the code, but my results are much worse than those in the paper.

shaojinding commented 4 years ago

@czy97 Please do a git pull before data processing. Previously, the code extracted the log spectrogram, which differs from how we obtained the results in the paper (we used the magnitude spectrogram instead). We found that using the log spectrogram leads to much worse results. Also, please train the baselines before the proposed method to make sure the data is processed correctly.
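For readers unfamiliar with the distinction, here is a minimal sketch of the two feature types mentioned above. It is not the repository's feature extraction code: the function names, the Hamming window, and the `n_fft`, `hop`, and `eps` values are all assumptions chosen for illustration.

```python
import numpy as np

def magnitude_spectrogram(signal, n_fft=512, hop=160):
    """Return |STFT| with shape (n_frames, n_fft // 2 + 1).

    Hypothetical helper: window type and frame sizes are illustrative
    assumptions, not the settings used in the AutoSpeech repository.
    """
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Keep only the magnitude of the one-sided FFT
    return np.abs(np.fft.rfft(frames, axis=1))

def log_spectrogram(signal, n_fft=512, hop=160, eps=1e-6):
    """Log-compressed version of the magnitude spectrogram (eps avoids log(0))."""
    return np.log(magnitude_spectrogram(signal, n_fft, hop) + eps)

# One second of a synthetic 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

mag = magnitude_spectrogram(tone)
log_mag = log_spectrogram(tone)
print(mag.shape)  # (97, 257)
```

The two arrays have the same shape but very different value ranges, so a model trained on one representation should not be expected to behave the same on the other.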

czy97 commented 4 years ago

Ok, thanks.

ChokJohn commented 4 years ago

Although I followed the steps to reproduce, I still cannot get a good result. For speaker verification, my current result is about 11%.

czy97 commented 4 years ago

Me too.

ChokJohn commented 4 years ago

@shaojinding I am very interested in your work. As far as I know, you are the first to achieve such good results solely by improving the model architecture. Could you provide more details so that I can reproduce the 1.4% EER?
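For context on the metric discussed here, the following is a rough sketch of how an equal error rate (EER) can be estimated from verification trial scores. It is a simple threshold sweep, not the evaluation script used in the repository; `equal_error_rate` is a hypothetical helper and the scores are made-up toy values.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Estimate the EER by sweeping a threshold over the observed scores.

    scores: similarity score per trial (higher means "same speaker")
    labels: 1 for target (same-speaker) trials, 0 for impostor trials
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)  # sorted unique candidate thresholds
    # False acceptance rate: impostor trials scoring at or above the threshold
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    # False rejection rate: target trials scoring below the threshold
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2

# Toy trials: two target scores and two impostor scores
eer_separable = equal_error_rate([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0])
eer_overlap = equal_error_rate([0.9, 0.3, 0.1, 0.6], [1, 1, 0, 0])
print(eer_separable, eer_overlap)
```

A lower EER means the target and impostor score distributions overlap less, which is why the gap between 11% and 1.4% matters.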

shaojinding commented 4 years ago

@ChokJohn @czy97 Sorry for the late reply. We were following the wrong convention in the previous experiments, as discussed here. We have updated the results using the correct setting and released the model. Please see the updated README.MD page for the new results and the trained model.

shaojinding commented 4 years ago

Just FYI, we have updated the results using the correct data split and released the trained model.