TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
https://arxiv.org/abs/2305.18500
MIT License

"/data/IndexAnno.py", "VQA-msrvtt.json", and "descs_qa_trainval.json" #22

Closed wonzin closed 5 months ago

wonzin commented 6 months ago

Hi. I am trying to fine-tune on MSRVTT-QA.

However, it raises an error. I can modify the code to get rid of the error, but I am not sure I understand it correctly.

Line 68 of "/data/IndexAnno.py", raw_captions = anno['desc'] if 'desc' in anno else anno['caption'], raises an error for MSRVTT-QA.

This is simply because the records in 'descs_qa_trainval.json' contain neither 'desc' nor 'caption':

{"video_id": "video0", "question": "who drives down the road in an audi?", "answer": "man", "subtitle": ""},
{"video_id": "video0", "question": "what is a man doing?", "answer": "show", "subtitle": ""},
{"video_id": "video0", "question": "what is a man silently narrates his experience doing?", "answer": "drive", "subtitle": ""},
{"video_id": "video0", "question": "what is a person doing?", "answer": "drive", "subtitle": ""},
{"video_id": "video0", "question": "what is a person doing?", "answer": "tell", "subtitle": ""},
{"video_id": "video0", "question": "what is guy doing?", "answer": "drive", "subtitle": ""},
{"video_id": "video0", "question": "what is man doing?", "answer": "talk", "subtitle": ""},
{"video_id": "video0", "question": "what is the man doing?", "answer": "drive", "subtitle": ""},
{"video_id": "video0", "question": "what is a man doing?", "answer": "drive", "subtitle": ""},
{"video_id": "video0", "question": "what is shown?", "answer": "car", "subtitle": ""},
{"video_id": "video0", "question": "what is dancing?", "answer": "group", "subtitle": ""},
{"video_id": "video0", "question": "who is driving?", "answer": "man", "subtitle": ""},
{"video_id": "video0", "question": "what is a man driving?", "answer": "car", "subtitle": ""}
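
For anyone who wants to reproduce this without running the full training script, here is a minimal sketch. It applies the expression quoted from line 68 to one record copied from the sample above. The function name read_raw_captions is hypothetical and only wraps that expression; it is not a function in the repository.

```python
# One QA record copied from the sample above.
anno = {"video_id": "video0", "question": "what is shown?",
        "answer": "car", "subtitle": ""}


def read_raw_captions(anno):
    # Hypothetical wrapper around the expression quoted from line 68.
    return anno['desc'] if 'desc' in anno else anno['caption']


try:
    read_raw_captions(anno)
    error = None
except KeyError as exc:
    # 'desc' is absent, so the expression falls through to anno['caption'],
    # which is absent as well.
    error = exc

print(sorted(anno))  # keys actually present in the record
print(repr(error))   # the exception raised by the lookup
```

The record only has the keys video_id, question, answer and subtitle, so the lookup fails on 'caption'.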

Can I substitute 'subtitle' for 'desc'/'caption' on line 68 of "/data/IndexAnno.py"? I am not sure, since many of the 'subtitle' values are empty.

leminhhuan72 commented 6 months ago

Sorry, I have the same problem. Have you found a solution?

wonzin commented 5 months ago

"Sorry, I have the same problem. Have you found a solution?"

No, I have not :(

wonzin commented 5 months ago

This is the same issue as a previously resolved one: https://github.com/TXH-mercury/VAST/issues/16. For QA tasks, you can simply leave raw_captions empty.
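
A minimal sketch of what "leave raw_captions empty" could look like, assuming an empty string is an acceptable empty value for the code that uses raw_captions afterwards (that code is not shown in this thread, so an empty list might be needed instead). The helper name get_raw_captions is hypothetical and not part of the repository.

```python
def get_raw_captions(anno):
    """Return the caption text of an annotation record, or '' if it has none."""
    if 'desc' in anno:
        return anno['desc']
    # QA records have neither 'desc' nor 'caption', so fall back to an
    # empty caption instead of raising KeyError. The empty string is an
    # assumption about what the downstream code accepts.
    return anno.get('caption', '')


qa_record = {"video_id": "video0", "question": "who is driving?",
             "answer": "man", "subtitle": ""}
caption_record = {"video_id": "video0", "caption": "a man drives a car"}

print(repr(get_raw_captions(qa_record)))
print(repr(get_raw_captions(caption_record)))
```

Records that do have 'desc' or 'caption' behave as before; only records with neither key take the empty fallback.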