pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP
MIT License

Text similarities are not in descending order #68

Closed gunesevitan closed 1 year ago

gunesevitan commented 1 year ago

While debugging, I noticed something. I create image features like this:

features = ci.image_to_features(image)

and retrieve the top-k items:

top_labels = ci.mediums.rank(features, 10, reverse=False)

When I check the similarities using ci.similarities(features, top_labels), I get values like this:

[0.1334228515625, 0.1431884765625, 0.12371826171875, 0.1591796875, 0.267578125, 0.1026611328125, 0.140380859375, 0.1810302734375, 0.094482421875, 0.1348876953125]

Aren't those values supposed to be in descending order, since top_labels are selected with torch.topk?
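A quick way to confirm the observation is to test the reported list directly. This is a minimal sketch in plain Python; `is_descending` and `argmax` are hypothetical helpers written for illustration, not part of clip-interrogator.

```python
# Similarities reported above for the ten labels returned by rank().
similarities = [
    0.1334228515625, 0.1431884765625, 0.12371826171875, 0.1591796875,
    0.267578125, 0.1026611328125, 0.140380859375, 0.1810302734375,
    0.094482421875, 0.1348876953125,
]


def is_descending(values):
    """Return True if every value is >= the value that follows it."""
    return all(a >= b for a, b in zip(values, values[1:]))


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


print(is_descending(similarities))  # False: the list is not sorted
print(argmax(similarities))         # 4: the largest value is the fifth entry
```

If the similarities had been computed the same way the labels were ranked, the largest value would be expected at index 0 rather than index 4.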

gunesevitan commented 1 year ago

I finally understood why that happens. The _rank method in the LabelTable class normalizes text_features over all embeddings, but the similarities method in the Interrogator class normalizes text_features over only the top-k selected embeddings. The discrepancy isn't a problem here, so I'm closing this.
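The effect described above can be reproduced with a toy example. This is only a sketch under a loud assumption: it uses a made-up, set-dependent normalization (each feature column is divided by its L2 norm across the candidate set) to show how scoring over all embeddings versus only the top-k can reorder results. It is not the library's actual code, and `column_norms`, `scores`, and `top_k` are hypothetical helpers.

```python
import math


def column_norms(rows):
    """L2 norm of each feature column, computed across the given rows."""
    return [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(len(rows[0]))]


def scores(query, rows):
    """Dot product of query with each row after column-wise normalization.

    ASSUMPTION: this set-dependent normalization is illustrative only.
    """
    norms = column_norms(rows)
    return [sum(q * x / n for q, x, n in zip(query, r, norms)) for r in rows]


def top_k(values, k):
    """Indices of the k largest values, in descending order of value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


query = [1.0, 1.0]
embeddings = [[3.0, 1.0], [1.0, 2.0], [6.0, 0.0], [-6.0, 0.0]]

# Rank using normalization over ALL embeddings.
full_scores = scores(query, embeddings)
top = top_k(full_scores, 2)

# Re-score only the selected rows, so normalization sees just the top-k.
rescored = scores(query, [embeddings[i] for i in top])

print(top)                            # indices in ranked order
print([full_scores[i] for i in top])  # descending by construction
print(rescored)                       # not descending for this data
```

In this toy data the two rows with large first components inflate the first column's norm only when the full set is used, so the ranking and the re-scoring weigh the two features differently.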