XuecWu / eMotions

Official repository for "eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos"
Apache License 2.0
26 stars 0 forks

Questions about comparison experiment data and dataset access #3

Closed ChiChivas closed 6 months ago

ChiChivas commented 6 months ago

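Hello, I am very interested in your work! I noticed that your paper cites the CVPR 2023 paper "Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network" (reference [65]), but the comparison experiments do not report its results. How does your work compare with theirs? Second, the accuracy that the VAANet paper (reference [66]) reports on the ve8 and ek6 datasets is more than 5 points higher than the results listed in your tables. Is this gap because your experimental setup differs from the one used in the VAANet paper? If so, where do they differ? Finally, I am very interested in your eMotions dataset. What permissions are needed to access it? I am willing to provide whatever materials are required. Thank you for your patience in reading this!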

XuecWu commented 6 months ago

Thank you for your interest in our work.

1. We cite [65] in the paper to indicate that AV-CPNet differs from the visual backbone they deploy. In addition, AV-CPNet is intended to provide benchmark results for eMotions, not to carry out absolute performance comparisons with other SOTA methods. We will provide comparison results with [65] in the future.
2. The implementation details are given in the appendix. Following Video Swin-T [1], we optimize with AdamW and weight decay = 0.2, which differs from the optimization strategy described in VAANet [2]. In the appendix, we also apply the optimization strategy from VAANet [2] to compare the performance of the proposed AV-CPNet with that of VAANet [2].
3. eMotions will be released once the final review is complete and the relevant acquisition rules have been formulated.

References:
[1] Liu Z, Ning J, Cao Y, et al. Video Swin Transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 3202-3211.
[2] Zhao S, Ma Y, Gu Y, et al. An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34(01): 303-311.
[65] Zhang Z, Wang L, Yang J. Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 18888-18897.
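
For readers who want to see what the AdamW setting mentioned in point 2 does numerically, here is a minimal sketch of one AdamW update on a single scalar parameter, written in plain Python. Only the optimizer name and weight decay = 0.2 come from the reply above. The function name `adamw_step`, the learning rate, the betas, and the epsilon are illustrative assumptions, and this is not the authors' training code.

```python
import math


def adamw_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.2):
    """One AdamW update for a scalar parameter (illustrative sketch).

    weight_decay=0.2 is the value quoted in the thread; lr, beta1, beta2
    and eps are assumed defaults, not values taken from the paper.
    """
    # Decoupled weight decay: shrink the parameter directly instead of
    # adding an L2 term to the gradient.
    param = param * (1.0 - lr * weight_decay)

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias correction for the zero-initialised moments (t starts at 1).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # Adam step on the decayed parameter.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


if __name__ == "__main__":
    # Toy usage: minimise f(w) = w**2 from w = 1.0 (gradient is 2 * w).
    w, m, v = 1.0, 0.0, 0.0
    for step in range(1, 6):
        w, m, v = adamw_step(w, 2.0 * w, m, v, t=step, lr=0.1)
        print(step, w)
```

Because the decay is applied to the parameter itself rather than folded into the gradient, changing the weight decay value changes the effective regularization independently of the adaptive step size. That is one reason results obtained under different optimizer settings are not directly comparable.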