Hi @wenxie18, sorry for the late response. Please use the EgoVLPv2 checkpoint for fine-tuning on QFVS. We have released the fully processed QFVS dataset here, and fine-tuning takes only about 15 minutes, even on a single GPU. Do not hesitate to reach out if you have any other questions.
QFVS task: Can you please share pre-trained models for inference? Thank you.