openai / CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
MIT License

How to Access the CLIP-pre-trained model KQV values? #409

Open dmrangak opened 9 months ago

dmrangak commented 9 months ago

Hi, I am working with a CLIP pre-trained model. To develop my algorithm, I want to access the KQV values. How should I access these values within this model? I use this pre-trained model:

```python
from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
```

data-node commented 4 months ago

> Hi, I am working with a CLIP pre-trained model. To develop my algorithm, I want to access the KQV values. How should I access these values within this model?

Assuming you want the K/Q/V projections for layer 0 (the vision encoder has layers 0-11), the learned weight matrices are accessible like this:

```python
# Learned K/Q/V projection weight matrices for vision encoder layer 0
model.vision_model.encoder.layers[0].self_attn.k_proj.weight.data
model.vision_model.encoder.layers[0].self_attn.q_proj.weight.data
model.vision_model.encoder.layers[0].self_attn.v_proj.weight.data
```
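Note that these are the static projection weights, not the K/Q/V values computed for a particular input. If you want the actual activations produced by those projections when an image is encoded, one option is to register forward hooks on the three `nn.Linear` layers. Below is a minimal sketch under that assumption; the example image URL and the `captured` dict are illustrative, not part of the original answer:

```python
import requests
import torch
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Store the K/Q/V activations captured during the forward pass
captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # For ViT-B/32: output shape is [batch, num_patches + 1, 768]
        captured[name] = output.detach()
    return hook

attn = model.vision_model.encoder.layers[0].self_attn
handles = [
    attn.k_proj.register_forward_hook(make_hook("k")),
    attn.q_proj.register_forward_hook(make_hook("q")),
    attn.v_proj.register_forward_hook(make_hook("v")),
]

# Placeholder image; any PIL image works here
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    model.get_image_features(**inputs)

# Remove the hooks once the activations are captured
for h in handles:
    h.remove()

print(captured["k"].shape, captured["q"].shape, captured["v"].shape)
```

The same pattern works for any other layer index or for the text encoder (`model.text_model.encoder.layers[...]`), since both encoders use the same attention module layout.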