Beckschen / ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
Apache License 2.0

position of function forward_visual4llava #11

Closed by QuLiao1117 1 month ago

QuLiao1117 commented 1 month ago

Hi, thanks for sharing your great work! In the LLaVA training code, I found that the function forward_visual4llava is used to encode CLIP features. Where can I find this function? Thanks.

Beckschen commented 1 month ago

Thanks for your interest. The function is defined in the Hugging Face model.
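
For readers who want to confirm where a method such as forward_visual4llava lives once a model object is loaded, the following is a minimal sketch using only Python's standard inspect module. The helper name locate_method is hypothetical and is not part of ViTamin or transformers. The demo uses a standard-library object so that it runs without downloading anything.

```python
import inspect
import json


def locate_method(obj, name):
    """Return (source_file, first_line) for obj.<name>, or None if it cannot be found.

    Hypothetical helper for illustration; not part of ViTamin or transformers.
    """
    method = getattr(obj, name, None)
    if method is None:
        return None  # the object has no attribute with that name
    func = inspect.unwrap(method)  # look through decorators that set __wrapped__
    try:
        source_file = inspect.getsourcefile(func)
        first_line = inspect.getsourcelines(func)[1]
    except (OSError, TypeError):
        return None  # built-in or otherwise lacking retrievable Python source
    return source_file, first_line


# Demo on a standard-library object (no network access needed).
encoder = json.JSONEncoder()
print(locate_method(encoder, "encode"))
print(locate_method(encoder, "forward_visual4llava"))  # None: not defined here
```

Assuming the model is loaded through transformers with AutoModel.from_pretrained(..., trust_remote_code=True), calling locate_method(model, "forward_visual4llava") on the loaded object should, under that assumption, point to the modeling file that ships with the Hugging Face model repository rather than to a file in this GitHub repository. If the method is attached to a submodule instead of the top-level object, the helper returns None and the same call can be tried on that submodule.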