azshue / TPT

Test-time Prompt Tuning (TPT) for zero-shot generalization in vision-language models (NeurIPS 2022)
https://azshue.github.io/TPT/
MIT License

Imagenet Inference time #12

Closed dh58319 closed 9 months ago

dh58319 commented 9 months ago

First of all, thank you — I gained a lot of insight from this study.

I've been trying to reproduce the results, and I have a question about iteration time. The paper reports 0.25 seconds per iteration, but when I run the model myself it takes longer than that. Could you explain what kind of environment yields 0.25 seconds per iteration?

azshue commented 9 months ago

Hi,

Thank you for being interested in our work.

We use ImageNet-A by default, so the speeds in the table were measured on ImageNet-A, which has only 200 classes. The speed is slower on full ImageNet because it has 1000 classes. The measurements were taken on a single A5000 GPU.

(I just found that the table isn't clear about it and we omitted it in our last revision. Thank you for bringing this up!)
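For anyone comparing their own numbers against the table, a minimal sketch of how per-iteration time can be measured is below. The `tpt_step` function is a hypothetical stand-in for one test-time tuning step (it is not the repo's actual code); the point is only that cost scales with the number of class prompts scored, so ImageNet (1000 classes) is expected to be slower per iteration than ImageNet-A (200 classes):

```python
import time

def tpt_step(num_classes):
    # Hypothetical stand-in for one TPT optimization step:
    # work grows with the number of class prompts being scored.
    total = 0.0
    for c in range(num_classes):
        total += c * 1e-6
    return total

def seconds_per_iter(num_classes, iters=50):
    # One warm-up call so one-time setup costs don't skew the timing.
    tpt_step(num_classes)
    start = time.perf_counter()
    for _ in range(iters):
        tpt_step(num_classes)
    return (time.perf_counter() - start) / iters

# ImageNet-A has 200 classes; full ImageNet has 1000,
# so each iteration does roughly 5x the per-class work.
t200 = seconds_per_iter(200)
t1000 = seconds_per_iter(1000)
print(f"200 classes:  {t200:.6f} s/iter")
print(f"1000 classes: {t1000:.6f} s/iter")
```

When timing on a GPU, remember that CUDA kernels launch asynchronously, so the device should be synchronized before reading the clock; otherwise the measured time undercounts the actual iteration cost.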