arampacha / CLIP-rsicd


Measure generalization capabilities of CLIP-RSICD model #31

Open sujitpal opened 3 years ago

sujitpal commented 3 years ago

We want to measure the model's ability to generalize beyond the 30 classes it was trained on. The idea is to take aerial images of subjects not covered by those 30 classes, measure the performance of our CLIP-RSICD model on them, and compare against the baseline CLIP model. The evaluation metric can be similar to the one used in our original evaluation, i.e., the rank of the synthetic caption containing the correct class, averaged across all test images.

FMoW may be a good source of aerial images covering classes outside the RSICD training set.
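A minimal sketch of how this evaluation could look, using the HuggingFace `transformers` CLIP API. The checkpoint id, the list of unseen classes, and the caption template are assumptions for illustration, not the project's actual setup:

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Assumed checkpoint id for the fine-tuned model; swap in
# "openai/clip-vit-base-patch32" to score the baseline CLIP model.
MODEL_ID = "flax-community/clip-rsicd-v2"

# Hypothetical classes not in the RSICD training set (e.g. drawn from FMoW).
CLASSES = ["wind farm", "oil refinery", "race track"]

model = CLIPModel.from_pretrained(MODEL_ID)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

def rank_of_correct_class(image: Image.Image, true_class: str) -> int:
    """Rank (1 = best) of the synthetic caption containing the true class."""
    captions = [f"an aerial photograph of {c}" for c in CLASSES]
    inputs = processor(text=captions, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        # logits_per_image[0] holds the image-text similarity for each caption.
        logits = model(**inputs).logits_per_image[0]
    order = logits.argsort(descending=True).tolist()
    return order.index(CLASSES.index(true_class)) + 1

# Average rank_of_correct_class over all test images for both the
# fine-tuned and baseline models, then compare the two averages.
```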

INF800 commented 2 years ago

Hi, I was wondering whether recent advances in multimodal retrieval have produced any improvements over CLIP-based architectures. Are there any?