TXH-mercury / VALOR

Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
https://arxiv.org/abs/2304.08345
MIT License

Comparison between SoTA methods #2

Closed MAGAer13 closed 5 months ago

MAGAer13 commented 1 year ago

Hi, I have read your paper — nice work on a variety of video downstream tasks. However, some major competitive methods are not compared for VideoQA (such as MulTI, mPLUG-2, and UMT-L) and video captioning (such as HiTeA and mPLUG-2). These methods are also SoTA and worth comparing against.

I hope you can consider the above suggestions, thanks.

TXH-mercury commented 1 year ago


Thanks for the advice. VALOR was completed quite a while ago. The latest methods you mentioned will be compared in our upcoming work, which will be released next month. Thanks for your attention.