BAAI-DCAI / Visual-Instruction-Tuning

SVIT: Scaling up Visual Instruction Tuning
MIT License

On descriptions #6

Closed by John-Ge 1 year ago

John-Ge commented 1 year ago

Thank you for your awesome work. I notice that your SVIT descriptions do not cover all images in VG. Did you filter out these images for some reason (e.g., low quality), or were they simply not included? Thank you!

BoyaWu10 commented 1 year ago

Hi @John-Ge, thanks for your interest in this repo. For the detailed descriptions, the dataset contains around 106K instances, with around 2K instances filtered out for better quality. For example, some answers generated by GPT-4 may state that the information is based on the given "captions" and "descriptions". We exclude those instances from the result.
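
For readers who want to apply a similar check to their own generated data, here is a minimal sketch of the kind of filter described above. It is not the repository's actual filtering code: the keyword pattern, the `response` field name, and the helper names are assumptions made for illustration, based only on the two quoted words "captions" and "descriptions".

```python
import re

# Assumed leak markers: the two words quoted in the reply above.
# The real filtering rule used for SVIT is not shown in this thread.
LEAK_PATTERN = re.compile(r"\b(captions?|descriptions?)\b", re.IGNORECASE)


def mentions_source_annotations(text: str) -> bool:
    """Return True if the text refers to captions or descriptions."""
    return LEAK_PATTERN.search(text) is not None


def filter_instances(instances):
    """Split instances into (kept, dropped) using the keyword check.

    Each instance is a dict with a hypothetical "response" field that
    holds the generated text.
    """
    kept, dropped = [], []
    for inst in instances:
        if mentions_source_annotations(inst["response"]):
            dropped.append(inst)
        else:
            kept.append(inst)
    return kept, dropped


# Toy data for illustration only; these are not real SVIT instances.
samples = [
    {"image_id": 1, "response": "A man rides a bicycle down a tree-lined street."},
    {"image_id": 2, "response": "Based on the given captions, the image shows a dog on a beach."},
]
kept, dropped = filter_instances(samples)
print(len(kept), len(dropped))
```

A plain keyword match like this can also drop legitimate answers (for instance, an image that really contains a captioned poster), so the dropped set is worth reviewing by hand before discarding it.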