NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0

Provide ShareGPT4V filtered annotations file #50

Closed hubenjm closed 15 hours ago

hubenjm commented 2 months ago

datasets_mixture.py references a .json file, and it is not clear from the name where the file came from: https://github.com/Efficient-Large-Model/VILA/blob/main/llava/data/datasets_mixture.py#L62

Is this file the same as https://huggingface.co/datasets/mit-han-lab/ShareGPT4V/blob/main/filter-share-captioner_coco_lcs_sam_1246k_1107.json?

If not, can you provide this file or a description of how it was generated?

Lyken17 commented 15 hours ago

That's correct. You can use https://huggingface.co/datasets/mit-han-lab/ShareGPT4V/blob/main/filter.py to process ShareGPT4V.
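
For anyone landing here later, a minimal sketch of how the two files linked in this thread could be fetched programmatically. It assumes the `huggingface_hub` package is installed. The helper names `parse_hf_dataset_url` and `fetch` are made up for illustration and are not part of VILA or ShareGPT4V. The thread does not say how filter.py is meant to be invoked, so the sketch only downloads it and does not run it.

```python
from urllib.parse import urlparse

# URLs copied verbatim from this thread.
ANNOTATIONS_URL = (
    "https://huggingface.co/datasets/mit-han-lab/ShareGPT4V/blob/main/"
    "filter-share-captioner_coco_lcs_sam_1246k_1107.json"
)
FILTER_SCRIPT_URL = (
    "https://huggingface.co/datasets/mit-han-lab/ShareGPT4V/blob/main/filter.py"
)


def parse_hf_dataset_url(url):
    """Split a Hugging Face dataset 'blob' URL into hf_hub_download arguments.

    Hypothetical helper: expects /datasets/<org>/<name>/blob/<revision>/<path>.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 6 or parts[0] != "datasets" or parts[3] != "blob":
        raise ValueError(f"not a dataset blob URL: {url}")
    return {
        "repo_id": f"{parts[1]}/{parts[2]}",
        "repo_type": "dataset",
        "revision": parts[4],
        "filename": "/".join(parts[5:]),
    }


def fetch(url):
    """Download the file behind a dataset blob URL and return its local path.

    Imported lazily so the URL parsing above works without huggingface_hub.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(**parse_hf_dataset_url(url))
```

Calling `fetch(ANNOTATIONS_URL)` and `fetch(FILTER_SCRIPT_URL)` should place both files in the local Hugging Face cache and return their paths. Check the arguments filter.py expects by reading the script before running it.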