Closed — crqcrq001 closed this issue 10 months ago
The format is the same as in ALBEF and VindLU.
```python
image_data = [
    {'image': image_path, 'caption': caption_content},
    {'image': image_path, 'caption': caption_content},
]
video_data = [
    {'video': video_path, 'caption': caption_content},
    {'video': video_path, 'caption': caption_content},
]
```
You can find the example here.
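A minimal sketch of producing such an annotation file, assuming the lists are serialized as JSON (the file names, paths, and captions below are placeholders, not from the repo):

```python
import json

# Hypothetical placeholder entries -- substitute your own
# media paths and captions before training.
image_data = [
    {"image": "images/0001.jpg", "caption": "a dog running on grass"},
    {"image": "images/0002.jpg", "caption": "a bowl of fresh fruit"},
]
video_data = [
    {"video": "videos/0001.mp4", "caption": "a person riding a bicycle"},
    {"video": "videos/0002.mp4", "caption": "waves crashing on a beach"},
]

# Save each list as a JSON annotation file.
with open("image_annotations.json", "w") as f:
    json.dump(image_data, f, indent=2)
with open("video_annotations.json", "w") as f:
    json.dump(video_data, f, indent=2)

# Sanity check: reload and confirm every entry has the expected keys.
with open("image_annotations.json") as f:
    loaded = json.load(f)
assert all({"image", "caption"} <= set(entry) for entry in loaded)
print(len(loaded))  # → 2
```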
Sorry, I am new to this repo. If I want to reproduce stage 1, how can I prepare the training dataset? Similar question as
https://github.com/OpenGVLab/Ask-Anything/issues/46