betterze closed this issue 3 years ago
We'll always need a set of candidate expressions so that their features can be ranked w.r.t. the similarity with the input image.
Prompt engineering can be cumbersome, but you can expect a decent performance boost even with some very simple constructions.
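To illustrate the ranking step, here is a minimal sketch in plain Python. It assumes you already have one feature vector for the image and one per candidate expression (with CLIP these would come from `model.encode_image` and `model.encode_text`). The toy 3-d vectors, the candidate sentences, and the helper names `cosine` and `rank_candidates` are made up for illustration; the factor of 100 mirrors the scaling used in the CLIP README example.

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def rank_candidates(image_feature, text_features, scale=100.0):
    """Return (candidate, probability) pairs, most similar first.

    text_features maps each candidate expression to its feature vector.
    A softmax over the scaled similarities gives a relative score per candidate.
    """
    names = list(text_features)
    logits = [scale * cosine(image_feature, text_features[n]) for n in names]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real image/text features (assumption).
image_feature = [0.9, 0.1, 0.2]
text_features = {
    "a smiling face": [1.0, 0.0, 0.1],
    "a neutral face": [0.1, 1.0, 0.0],
    "a frowning face": [0.0, 0.2, 1.0],
}

ranking = rank_candidates(image_feature, text_features)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

For attributes such as hair color or gender, one option is to run a separate ranking per attribute group, so that each softmax only compares mutually exclusive candidates.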
We're preparing another notebook that has the template sentences that we used during prompt engineering for ImageNet! Hope it'll help
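As a rough sketch of how template sentences can be used: each candidate word is inserted into several templates, the resulting sentences are embedded, and the normalized embeddings are averaged into one vector per candidate. The templates below are illustrative only, and `toy_embed` is a hypothetical stand-in for a real text encoder (with CLIP, `clip.tokenize` followed by `model.encode_text`), so the numbers it produces carry no meaning.

```python
import math

# Illustrative templates (assumption, not the full list from the notebook).
TEMPLATES = [
    "a photo of a {}.",
    "a close-up photo of a {}.",
    "a blurry photo of a {}.",
]

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def toy_embed(text, dim=8):
    """Hypothetical text encoder: a deterministic pseudo-embedding."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[(i + ord(ch)) % dim] += 1.0
    return vec

def class_embedding(name, templates, embed):
    """Average the normalized embeddings of all templated sentences."""
    prompts = [t.format(name) for t in templates]
    embeddings = [l2_normalize(embed(p)) for p in prompts]
    mean = [sum(column) / len(embeddings) for column in zip(*embeddings)]
    return l2_normalize(mean)  # renormalize so cosine similarity stays valid

candidates = ["smiling face", "neutral face"]
classifier = {
    name: class_embedding(name, TEMPLATES, toy_embed) for name in candidates
}
```

The resulting per-candidate vectors can then be compared with the image features in the same way as single-sentence embeddings.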
@jongwook Thank you for your reply. I believe this notebook will be very helpful for the community.
For info, the notebook is available:
@woctezuma Thx a lot, the notebook is very helpful.
You are welcome. It was a nice question of yours.
I have written a small app (no GPU though) to play with the zero-shot classifier: https://dry-taiga-80279.herokuapp.com/
@woctezuma I just tried it, and it works very well. Maybe add this to the CLIP main page so other people can also use it. I believe a lot of people will like it.
Thanks!
It is just a small app, no need to add it to the repo.
I have done it out of curiosity, and because it is slightly more user-friendly than a Colab notebook.
Dear OpenAI group,
Thank you for sharing this great work with us.
Is there a way to get the most relevant words for a given image? Similar to bag of words?
For example, given a face, it might output male/female, hair color, and smiling or not. I understand it is possible to construct sentences like 'a smiling face', but there are many words and many ways to combine them, so it is not easy to create a bank of sentences like this.
Thank you very much for your help.
Best Wishes,
Alex