Open lgs00 opened 1 year ago
Hello, thank you for your contribution. I have a question about line 66 of the file models/spillava.py:
image_forward_outs = vision_tower(images,output_hidden_states=True)
What is the structure of this vision_tower?
The vision tower is CLIP ViT-H/14.
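For a rough idea of what that means for `image_forward_outs`, here is a small sketch that works out the hidden-state shapes from the standard ViT-H/14 sizes (32 transformer layers, hidden size 1280, 16 heads, 14x14 patches). It does not load the model. `ViTSpec` and `hidden_state_shapes` are hypothetical helpers, not part of this repository, and the sketch assumes a 224x224 input, one class token, and the Hugging Face-style convention that `output_hidden_states=True` returns the embedding output plus one tensor per layer. The actual input resolution used in models/spillava.py may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTSpec:
    """Hypothetical helper describing a ViT backbone (not from the repository)."""
    patch_size: int
    hidden_size: int
    num_layers: int
    num_heads: int


# Standard ViT-Huge sizes with 14x14 patches (the "H/14" in the name).
VIT_H_14 = ViTSpec(patch_size=14, hidden_size=1280, num_layers=32, num_heads=16)


def hidden_state_shapes(spec, batch_size, image_size=224):
    """Return the expected shape of each hidden state.

    Assumptions: square input of image_size pixels, one class token
    prepended to the patch tokens, and one hidden state for the
    embedding output plus one per transformer layer.
    """
    patches_per_side = image_size // spec.patch_size
    num_tokens = patches_per_side ** 2 + 1  # patch tokens + class token
    shape = (batch_size, num_tokens, spec.hidden_size)
    return [shape] * (spec.num_layers + 1)


shapes = hidden_state_shapes(VIT_H_14, batch_size=2)
print(len(shapes), shapes[-1])  # 33 (2, 257, 1280)
```

Under these assumptions, `image_forward_outs.hidden_states` would be a tuple of 33 tensors, each with 257 tokens of width 1280 per image.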