jayleicn / singularity

[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
https://arxiv.org/abs/2206.03428
MIT License

Why do you train "msrvtt_mc" with "msrvtt_ret"? #23

Closed by wonjiny 1 year ago

wonjiny commented 1 year ago

Hi.

configs/ret_msrvtt_mc.yaml

data_root: ${oc.env:SL_DATA_DIR}/videos_images
anno_root_downstream: ${oc.env:SL_DATA_DIR}/anno_downstream
train_file: ['${anno_root_downstream}/msrvtt_ret_train7k.json', '${data_root}/msrvtt_2fps_224', video]
test_types: [mc_test, ]
test_file:
  mc_test: ['${anno_root_downstream}/msrvtt_mc_test.json', '${data_root}/msrvtt_2fps_224', video]
stop_key: None  # used to choose the best ckpt. If None, save the last.


The formats of 'msrvtt_ret_train7k.json' and 'msrvtt_mc_test.json' are completely different.

An entry in 'msrvtt_ret_train7k.json' has a single caption:

{"video": "video2960.mp4", "caption": "a cartoon animals runs through an ice cave in a video game", "duration": 12.32},

In contrast, each entry in 'msrvtt_mc_test.json' contains 5 candidate captions and the index of the correct answer:

{"video": "video9770.mp4", "caption": ["the boy is trying to fix the problem", "a movie trailer shows various scenes from a movie", "asian man discusses technology in the younger generations", "two men on wave runner in ocean rescuing a surfer", "a group is dancing"], "answer": 0},
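The difference between the two record shapes comes down to the type of the "caption" field. A minimal sketch, using the two entries quoted above; the helper name record_kind is hypothetical and is not part of the repository:

```python
# The two annotation entries quoted in this issue, as Python dicts.
ret_entry = {
    "video": "video2960.mp4",
    "caption": "a cartoon animals runs through an ice cave in a video game",
    "duration": 12.32,
}
mc_entry = {
    "video": "video9770.mp4",
    "caption": [
        "the boy is trying to fix the problem",
        "a movie trailer shows various scenes from a movie",
        "asian man discusses technology in the younger generations",
        "two men on wave runner in ocean rescuing a surfer",
        "a group is dancing",
    ],
    "answer": 0,
}


def record_kind(entry):
    """Hypothetical helper: classify an annotation entry by its shape."""
    caption = entry["caption"]
    if isinstance(caption, list) and "answer" in entry:
        # Several candidate captions plus the index of the correct one.
        return "multiple_choice"
    if isinstance(caption, str):
        # One video paired with one caption.
        return "retrieval"
    raise ValueError("unrecognized annotation format")


print(record_kind(ret_entry))
print(record_kind(mc_entry))
```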

Why do you train the "msrvtt_mc" task with 'msrvtt_ret_train7k.json'?

Isn't there a training file for "msrvtt_mc"?

jayleicn commented 1 year ago

The msrvtt_mc task is essentially a simplified version of the retrieval task msrvtt_ret, with a much smaller retrieval pool (5 candidate captions per video). A checkpoint trained on msrvtt_ret can therefore be used directly for msrvtt_mc inference.
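To illustrate the point, here is a rough sketch of multiple choice treated as retrieval over a 5-caption pool: score every candidate caption against the video and pick the highest-scoring one. This is not the repository's evaluation code. The names answer_mc, mc_accuracy, toy_similarity, and TOY_VIDEO_TAGS are hypothetical, and toy_similarity is only a stand-in for the video-text matching score that a trained retrieval checkpoint would produce.

```python
from typing import Callable, Dict, List


def answer_mc(entry: Dict, similarity: Callable[[str, str], float]) -> int:
    """Return the index of the candidate caption with the highest score."""
    scores = [similarity(entry["video"], cap) for cap in entry["caption"]]
    return max(range(len(scores)), key=scores.__getitem__)


def mc_accuracy(entries: List[Dict], similarity: Callable[[str, str], float]) -> float:
    """Fraction of entries whose top-scoring caption matches the answer index."""
    correct = sum(answer_mc(e, similarity) == e["answer"] for e in entries)
    return correct / len(entries)


# Assumption: a toy stand-in for a trained model. It "knows" a few words
# describing each video and scores a caption by word overlap.
TOY_VIDEO_TAGS = {"video9770.mp4": {"boy", "fix", "problem"}}


def toy_similarity(video: str, caption: str) -> float:
    tags = TOY_VIDEO_TAGS.get(video, set())
    return float(len(tags & set(caption.split())))


mc_entry = {
    "video": "video9770.mp4",
    "caption": [
        "the boy is trying to fix the problem",
        "a movie trailer shows various scenes from a movie",
        "asian man discusses technology in the younger generations",
        "two men on wave runner in ocean rescuing a surfer",
        "a group is dancing",
    ],
    "answer": 0,
}

print(answer_mc(mc_entry, toy_similarity))
print(mc_accuracy([mc_entry], toy_similarity))
```

Since only a video-text scoring function is needed at test time, no separate multiple-choice training file is required in this sketch; any model that ranks captions for a video can be plugged in as the similarity argument.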