HW-AARC-CLUB / ICASSP_SER


When will the code be open? :) #1

Closed SteveTanggithub closed 3 years ago

hemarathore commented 3 years ago

Which paper?

SteveTanggithub commented 3 years ago

The 2021 ICASSP paper "A NOVEL END-TO-END SPEECH EMOTION RECOGNITION NETWORK WITH STACKED TRANSFORMER LAYERS".

hemarathore commented 3 years ago

@SteveTanggithub Thanks. I don't think the author will reply in the near future. Such a huge improvement in accuracy is not easy to achieve. Accuracy on this dataset for audio only is about 66% on the full 5531 utterances with 4 emotion classes. Anyone claiming more should provide the code; otherwise it is difficult to verify.
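(For context on the numbers being discussed here and below: SER papers usually report two accuracies, WA, the overall accuracy over all utterances, and UA, the recall averaged over the emotion classes. A minimal sketch of how they are commonly computed, with purely illustrative labels:)

```python
# WA (weighted accuracy) = overall accuracy over all utterances;
# UA (unweighted accuracy) = recall averaged over the 4 emotion classes.
from sklearn.metrics import accuracy_score, recall_score

y_true = [0, 0, 1, 2, 3, 3]   # illustrative ground-truth labels (4 classes)
y_pred = [0, 1, 1, 2, 3, 0]   # illustrative predictions

wa = accuracy_score(y_true, y_pred)                 # weighted accuracy
ua = recall_score(y_true, y_pred, average="macro")  # unweighted accuracy
print(f"WA = {wa:.3f}, UA = {ua:.3f}")
```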

wsstriving commented 3 years ago

> @SteveTanggithub Thanks. I don't think the author will reply in the near future. Such a huge improvement in accuracy is not easy to achieve. Accuracy on this dataset for audio only is about 66% on the full 5531 utterances with 4 emotion classes. Anyone claiming more should provide the code; otherwise it is difficult to verify.

To be honest, I don't believe the claim in this paper; the authors should either substantiate this unusual improvement or withdraw the paper.

zeroRoman commented 3 years ago

Please read our README.

tidess commented 3 years ago

> Please read our README.

What is multi_branch? Where is the code?

zxpoqas123 commented 3 years ago

Looking forward to your open-source code... To be honest, I don't believe such high performance can be achieved simply by introducing the STLs.
I noticed that you have provided the data-split code, but you should probably also provide the EXACT csv files containing the train, validation, and test data, respectively, to guarantee reproducibility. It is normal for speaker-inclusive experiments to yield higher performance, but obtaining over 90% UA and WA using only acoustic information seems quite exaggerated.
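(One simple way to make the split verifiable is to fix the random seed and publish the resulting CSVs. A minimal sketch of that idea; the metadata file and column names below are assumptions for illustration, not the authors' actual setup:)

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical metadata file: one row per utterance, with an "utt_id" and a "label" column.
meta = pd.read_csv("iemocap_meta.csv")

# Fix the seed and stratify by label so the split is exactly reproducible.
train, rest = train_test_split(meta, test_size=0.2, random_state=42, stratify=meta["label"])
valid, test = train_test_split(rest, test_size=0.5, random_state=42, stratify=rest["label"])

# Publishing these exact files removes any ambiguity about which utterances were used.
train.to_csv("train.csv", index=False)
valid.to_csv("valid.csv", index=False)
test.to_csv("test.csv", index=False)
```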

zeroRoman commented 3 years ago

> Looking forward to your open-source code... To be honest, I don't believe such high performance can be achieved simply by introducing the STLs. I noticed that you have provided the data-split code, but you should probably also provide the EXACT csv files containing the train, validation, and test data, respectively, to guarantee reproducibility. It is normal for speaker-inclusive experiments to yield higher performance, but obtaining over 90% UA and WA using only acoustic information seems quite exaggerated.

Thank you for your query; we will update the rest of the code as soon as possible.
