What is the structure of this vision_tower?

jshilong / GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

What is the structure of this vision_tower? #11

Open lgs00 opened 1 year ago

lgs00 commented 1 year ago

Hello, thank you for your contribution. I have a question about line 66 of the file models/spillava.py: image_forward_outs = vision_tower(images, output_hidden_states=True). What is the structure of this vision_tower?

jshilong commented 1 year ago

The vision tower is CLIP ViT-H/14.
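
For anyone wondering what that means for the call in question, here is a rough sketch of the shapes to expect. It assumes the commonly published ViT-H/14 dimensions (patch size 14, width 1280, 32 transformer layers), a 224x224 input, and a Hugging Face style vision model where output_hidden_states=True returns the embedding output plus one tensor per layer. None of this is taken from the GPT4RoI code itself, and ViTSpec and VIT_H_14 are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ViTSpec:
    """Hypothetical helper describing a ViT image encoder (not part of GPT4RoI)."""
    image_size: int   # input resolution in pixels (assumed square)
    patch_size: int   # side length of each image patch in pixels
    hidden_size: int  # transformer width
    num_layers: int   # number of transformer blocks

    @property
    def num_patches(self) -> int:
        # The image is cut into a grid of non-overlapping patches.
        side = self.image_size // self.patch_size
        return side * side

    @property
    def seq_len(self) -> int:
        # One extra token for the class (CLS) embedding.
        return self.num_patches + 1

    def hidden_state_shapes(self, batch_size: int) -> List[Tuple[int, int, int]]:
        # Assumed convention: embedding output plus one entry per layer.
        shape = (batch_size, self.seq_len, self.hidden_size)
        return [shape] * (self.num_layers + 1)


# Assumed ViT-H/14 dimensions at 224x224 input; check the actual checkpoint config.
VIT_H_14 = ViTSpec(image_size=224, patch_size=14, hidden_size=1280, num_layers=32)

if __name__ == "__main__":
    shapes = VIT_H_14.hidden_state_shapes(batch_size=2)
    print("patches per image:", VIT_H_14.num_patches)
    print("tokens per image:", VIT_H_14.seq_len)
    print("hidden states returned:", len(shapes))
    print("shape of each hidden state:", shapes[-1])
```

If the checkpoint in use has a different input resolution or layer count, only the numbers passed to ViTSpec change; the config of the loaded model is the place to confirm them.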