yuanpanlifly closed this issue 6 months ago
We used the standard CLIP weights from OpenAI for evaluation.
Thanks for the reply. Is the zero-shot image classification accuracy reported in the paper acc@1 or acc@5?
We reported the standard top-1 accuracy.
Can you share your code for zero-shot inference?
Could you please share the prompt template you used for image classification?
For zero-shot inference, your paper reports 68.62% and 77.96% on the AID dataset for raw CLIP and RemoteCLIP, respectively, with the ViT-B-32 backbone. Using the same template-based prompt as you (a satellite photo of {class name}), I get only 0.195% with raw CLIP. That is a large gap from your results, so I would like to ask: is the CLIP baseline in your paper a model after continued training, or the original model released by OpenAI?
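For anyone comparing numbers in this thread, here is a minimal sketch of how template-based zero-shot classification and top-k accuracy are usually computed. It is not the authors' evaluation code. The helper names (`build_prompts`, `l2_normalize`, `topk_accuracy`) and the toy 3-dimensional embeddings are hypothetical; in a real run the embeddings would come from the CLIP image and text encoders.

```python
import math

# Hypothetical helper: fill the prompt template quoted in this thread.
def build_prompts(class_names, template="a satellite photo of {}"):
    return [template.format(name) for name in class_names]

# Scale a vector to unit length so a dot product equals cosine similarity.
def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Fraction of images whose true label is among the k most similar prompts.
def topk_accuracy(image_embs, text_embs, labels, k=1):
    text_embs = [l2_normalize(t) for t in text_embs]
    hits = 0
    for emb, label in zip(image_embs, labels):
        emb = l2_normalize(emb)
        sims = [sum(a * b for a, b in zip(emb, t)) for t in text_embs]
        ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)

# Toy data (assumed, for illustration only): one text embedding per class,
# one image embedding per sample, and integer labels indexing class_names.
class_names = ["airport", "beach", "forest"]
prompts = build_prompts(class_names)
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embs = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.5, 0.4, 0.1]]
labels = [0, 1, 1]

print(prompts)
print("acc@1:", topk_accuracy(image_embs, text_embs, labels, k=1))
print("acc@2:", topk_accuracy(image_embs, text_embs, labels, k=2))
```

Note that `topk_accuracy` returns a fraction in [0, 1], not a percentage. AID has 30 classes, so random guessing gives about 3.3% top-1 accuracy. A result far below that may point to a mismatch between label indices and prompt order, or to a fraction being read as a percentage, rather than to different model weights.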