Closed raymin0223 closed 1 year ago
For MC, the learning scenario is similar to retrieval (text-video alignment).
MSRVTT-MC shares the same train/val data as MSRVTT-Retrieval. The trained checkpoint is also identical.
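To illustrate why the same checkpoint can serve both tasks, here is a minimal sketch. It assumes the retrieval model produces a text-video alignment score for each candidate caption of a video, and that the MC prediction is the highest-scoring candidate. The function name `mc_accuracy` and the toy scores are hypothetical and not taken from the repo.

```python
def mc_accuracy(scores, answers):
    """Multiple-choice accuracy from alignment scores.

    scores[i][j]: assumed alignment score between video i and candidate caption j.
    answers[i]:   index of the correct caption for video i.
    """
    correct = 0
    for row, answer in zip(scores, answers):
        # Pick the candidate caption with the highest alignment score.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == answer)
    return correct / len(answers)


# Toy example: two videos, three candidate captions each (made-up scores).
toy_scores = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]
toy_answers = [1, 2]
accuracy = mc_accuracy(toy_scores, toy_answers)
```

In this toy case the first video is answered correctly and the second is not.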
Oh I see, thank you very much for your reply!
Hi @tsujuifu, thanks for your great work and for the tidy GitHub repo, which makes future work easier!
I found that the MSRVTT-MC train and val datasets are missing from Google Drive, and so is the checkpoint. I would appreciate it if you could upload them, thanks.