foundation-multimodal-models / CAL

Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
Apache License 2.0

Where can I find llava_v1_6.json? #5

Open jameslahm opened 2 months ago

jameslahm commented 2 months ago

Thanks for your great work! Since llava_v1_6.json is not released in LLaVA-NeXT, could you please give me some guidance about how to obtain the llava_v1_6.json you used in the script below? https://github.com/foundation-multimodal-models/CAL/blob/1780b5869218e6913ca5464f7f0e4984c41f2714/run_scripts/llava16_7b.sh#L15 Thanks a lot!

Menoly-xin commented 2 months ago

Thank you for your interest!

Due to certain constraints, we are unable to share the data we use internally, even though it may be publicly accessible. We recommend using the LLaVA-1.5 data instead; however, please be aware that the image processing method in LLaVA-1.6 differs from that of LLaVA-1.5, so the bounding-box representation in the data needs to be modified accordingly.
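To illustrate the kind of bounding-box change mentioned above, here is a minimal sketch, not the authors' actual preprocessing. It assumes that the LLaVA-1.5 annotations store boxes as `[x1, y1, x2, y2]` normalized to the image after it has been padded to a centered square, and that the target format wants boxes normalized to the original, unpadded image. Both assumptions should be checked against the data and the training code before use. The function name `unpad_box` is hypothetical.

```python
from typing import Sequence, Tuple


def unpad_box(
    box: Sequence[float], width: int, height: int
) -> Tuple[float, float, float, float]:
    """Map a box normalized on a centered square-padded image back to
    coordinates normalized on the original (width x height) image.

    Assumption: the shorter side was padded equally on both ends,
    with an integer offset, to reach a square of side max(width, height).
    """
    side = max(width, height)
    # Padding added before the image content on each axis (in pixels).
    off_x = (side - width) // 2
    off_y = (side - height) // 2

    def clamp(value: float) -> float:
        # Keep coordinates inside the original image.
        return min(1.0, max(0.0, value))

    x1, y1, x2, y2 = box
    return (
        clamp((x1 * side - off_x) / width),
        clamp((y1 * side - off_y) / height),
        clamp((x2 * side - off_x) / width),
        clamp((y2 * side - off_y) / height),
    )


if __name__ == "__main__":
    # A 200x100 image is padded to 200x200, with 50 px above and below.
    print(unpad_box((0.25, 0.5, 0.75, 0.75), width=200, height=100))
```

For the 200x100 example, the x coordinates are unchanged because no horizontal padding was added, while the y coordinates are shifted by the 50-pixel top padding and rescaled to the original height.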