rikeilong / Bay-CAT

[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
Apache License 2.0

About Table 2 comparison on Music-AVQA dataset #4

Open Cece1031 opened 4 months ago

Cece1031 commented 4 months ago

I don't quite understand how you compare with the prior models. Could you share your email so I can ask about this in detail? Thank you very much.

rikeilong commented 4 months ago

We detail our experiments on Music-AVQA in the Appendix (to be released soon). Briefly, we do not use in-context reasoning; instead, we follow the baselines' recognition paradigm and treat answering as classification.
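
For readers unfamiliar with the distinction, here is a minimal sketch of what a classification-style (closed-set) evaluation could look like, as opposed to free-form generation. It is not the authors' code: the answer vocabulary, the score values, and the helper names `predict_answer` and `accuracy` are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Assumed toy answer vocabulary (hypothetical); a real setup would take the
# candidate answers from the dataset's annotations.
ANSWER_VOCAB = ["yes", "no", "one", "two", "piano", "guitar"]


def predict_answer(scores: Sequence[float]) -> str:
    """Return the candidate answer with the highest score (argmax)."""
    if len(scores) != len(ANSWER_VOCAB):
        raise ValueError("expected one score per candidate answer")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return ANSWER_VOCAB[best_index]


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Exact-match accuracy between predicted and reference answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)
```

In this setup the model only ranks a fixed set of candidate answers, so scoring reduces to exact match against the reference label; no prompt examples or free-text parsing are involved.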

dragonlzm commented 6 days ago

Which split did you use for evaluation on the Music-AVQA dataset: test or val? Thanks!