Open test2a opened 8 months ago
@test2a In theory it should already work reasonably well with digitally rendered text, and in my experience it actually works quite well on my device for that task. Since the CLIP model used here was trained on a large set of image/text pairs scraped from the internet, it already has some semantic OCR representation capabilities and responds to the presence of text pretty well. Obviously a separate OCR model would be even better, so I'll probably consider adding one in the future.
Hi, CLIP works well for short text snippets but fails for other languages or longer text. I think it might be a good idea to introduce multiple "sources" for similarity, such as CLIP, OCR, and potentially others. It might be worth keeping in mind for the future.
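To make the multiple-sources idea a bit more concrete, here is a rough sketch in Python. It is not the app's actual code: `Source`, `ocr_score`, and `rank` are names made up for illustration, the OCR index is assumed to be a plain dictionary of already-recognized text, and the CLIP scorer is assumed to exist elsewhere and return a value between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A scoring function maps (query, image_id) to a similarity in [0, 1].
ScoreFn = Callable[[str, str], float]


@dataclass
class Source:
    """One similarity source (hypothetical), e.g. CLIP or OCR."""
    name: str
    score: ScoreFn
    weight: float = 1.0


def ocr_score(index: Dict[str, str]) -> ScoreFn:
    """Build a scorer from a hypothetical {image_id: recognized text} index.

    The score is the fraction of query terms that occur as substrings
    of the recognized text (case-insensitive).
    """
    def score(query: str, image_id: str) -> float:
        text = index.get(image_id, "").lower()
        terms = query.lower().split()
        if not terms:
            return 0.0
        return sum(term in text for term in terms) / len(terms)
    return score


def rank(query: str, image_ids: List[str],
         sources: List[Source]) -> List[Tuple[str, float]]:
    """Rank images by the weighted average of all source scores."""
    total_weight = sum(s.weight for s in sources)
    if total_weight <= 0:
        raise ValueError("at least one source with positive weight is required")
    scored = [
        (img, sum(s.weight * s.score(query, img) for s in sources) / total_weight)
        for img in image_ids
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A weighted average is only one possible way to combine the sources. The point is that each source stays independent, so an OCR source could be added, removed, or reweighted without touching the CLIP path.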
Hello @slavabarkov
I was directed to your app when I reported that my Samsung Gallery app was no longer performing text searches on images taken and/or stored on the phone.
My primary use case is searching for text, and I had some feedback for you.
That's all my feedback for now.
Thanks for all your hard work!
Memes and photos often have text overlays. Usually the file name is not enough to find the right photos.
Would it be possible to recognize text and index that?