jshilong / GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
What is the structure of this vision_tower? #11

Open lgs00 opened 1 year ago

lgs00 commented 1 year ago

Hello, thank you for your contribution. I have a question about line 66 of the file models/spi_llava.py:

image_forward_outs = vision_tower(images, output_hidden_states=True)

What is the structure of this vision_tower?

jshilong commented 1 year ago

The vision tower is CLIP ViT-H/14.
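
For anyone wondering what that means for the call quoted in the question, here is a rough sketch of the tensor shapes to expect back. It assumes the vision tower follows the Hugging Face `CLIPVisionModel` convention, where `output_hidden_states=True` returns one tensor for the embedding output plus one per transformer layer. It also assumes the commonly published ViT-H/14 hyperparameters (patch size 14, width 1280, 32 layers) and a 224x224 input. None of these numbers come from this thread, and `VisionTowerSpec` and `hidden_state_shapes` are hypothetical helpers, not part of GPT4RoI, so check the actual checkpoint config before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionTowerSpec:
    """Hypothetical summary of a ViT-style vision tower (not a GPT4RoI class)."""
    image_size: int = 224    # assumed input resolution
    patch_size: int = 14     # the "/14" in ViT-H/14
    hidden_size: int = 1280  # assumed ViT-H width
    num_layers: int = 32     # assumed ViT-H depth

    @property
    def num_patches(self) -> int:
        # The image is cut into a square grid of non-overlapping patches.
        side = self.image_size // self.patch_size
        return side * side

    @property
    def seq_len(self) -> int:
        # One extra token for the class (CLS) embedding.
        return self.num_patches + 1


def hidden_state_shapes(spec: VisionTowerSpec, batch_size: int = 1):
    """Shapes of the hidden_states tuple: embeddings output + one per layer."""
    shape = (batch_size, spec.seq_len, spec.hidden_size)
    return [shape] * (spec.num_layers + 1)


spec = VisionTowerSpec()
shapes = hidden_state_shapes(spec)
print(len(shapes), shapes[-1])  # 33 (1, 257, 1280)
```

Under these assumptions, `image_forward_outs.hidden_states` would be a tuple of 33 tensors, each holding 257 tokens (256 patch tokens plus the class token) of width 1280 per image. A different input resolution changes only the token count.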