ljwztc / CLIP-Driven-Universal-Model

[ICCV 2023] CLIP-Driven Universal Model; Rank first in MSD Competition.

Extract vision embeddings #61

Closed by hep-raidium 4 months ago

hep-raidium commented 8 months ago

Hi there. First, thank you for sharing your work!

I would like to extract one embedding for each image from the vision encoder, but I have not managed to do so. Have you ever done this? If so, do you have any hints or code to share?

Thanks!

ljwztc commented 8 months ago

After obtaining the feature map, you can upsample it to the input size. Then you can use the ground-truth mask to pick out the feature vector for each category.
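
The two steps above can be sketched roughly as follows. This is a minimal illustration in NumPy, not code from the repository: the helper names `upsample_nearest` and `class_embeddings` are hypothetical, the array shapes are assumptions, nearest-neighbour upsampling stands in for whatever interpolation you prefer (with PyTorch tensors, `torch.nn.functional.interpolate` would play the same role), and averaging over the voxels of each class is just one possible way to reduce them to a single vector. How you obtain the feature map from the model is not covered here.

```python
import numpy as np


def upsample_nearest(feat, out_shape):
    """Nearest-neighbour upsampling of a (C, d, h, w) array to (C, D, H, W)."""
    _, d, h, w = feat.shape
    D, H, W = out_shape
    # For every output index, pick the source index it falls into.
    zi = np.arange(D) * d // D
    yi = np.arange(H) * h // H
    xi = np.arange(W) * w // W
    return feat[:, zi][:, :, yi][:, :, :, xi]


def class_embeddings(feat, label, num_classes):
    """Average the upsampled features over the voxels of each ground-truth class.

    feat:  (C, d, h, w) float array, an assumed feature map for one image
    label: (D, H, W) int array, ground-truth mask with values in [0, num_classes)
    Returns a (num_classes, C) array; rows of classes absent from the mask are NaN.
    """
    up = upsample_nearest(feat, label.shape)   # (C, D, H, W)
    flat = up.reshape(up.shape[0], -1)         # (C, N) with N = D * H * W
    lab = label.reshape(-1)                    # (N,)
    out = np.full((num_classes, flat.shape[0]), np.nan, dtype=np.float64)
    for k in range(num_classes):
        sel = lab == k
        if sel.any():
            out[k] = flat[:, sel].mean(axis=1)
    return out


# Toy usage with random data (shapes are made up for illustration).
rng = np.random.default_rng(0)
feat = rng.random((8, 4, 4, 4))                # 8-channel, low-resolution feature map
label = rng.integers(0, 3, size=(16, 16, 16))  # mask at input resolution, 3 classes
emb = class_embeddings(feat, label, num_classes=3)
print(emb.shape)  # (3, 8)
```

If you need a single vector per image rather than one per category, one option under the same assumptions is to average over all foreground voxels instead of looping over classes.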