microsoft / SwinBERT

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
https://arxiv.org/abs/2111.13196
MIT License

Data split re MSVD and MSRVTT caption #22

Open dxli94 opened 2 years ago

dxli94 commented 2 years ago

Hi,

Congratulations on the great work!

Would you mind pointing me to where you found the dataset splits for the captioning datasets? They do not always seem to be consistent with their retrieval / QA counterparts.

@kevinlin311tw @linjieli222

Thanks, Dongxu

tiesanguaixia commented 1 year ago

Hi! I have the same question. Did you solve it?