Closed aguschin closed 10 months ago
I just created #50 to address the first option. We will separate the responsibilities here: one test checks that the bot works with images from the reference database, and another measures reverse image search quality on augmented images. This ticket should address the second.
The script was created by @mjason98, and a Jupyter notebook for a separate measurement was created by @mohammadsanaee; both are committed to the repo. Now we need to improve search quality.
Reverse image search is a crucial part of the app. To improve it, we need a way to measure the quality of the search.
As a first iteration, I suggest creating a test set like this: take a sample of images from the reference database, augment them to resemble the "taking a photo in an art gallery" scenario, run the search, and measure top-1 accuracy (whether the top result is the correct reference image).
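To make the proposal concrete, here is a minimal sketch of the measurement loop. It is not the script from the repo: `top1_accuracy`, `toy_augment`, and `make_toy_search` are hypothetical names, the "images" are plain tuples of numbers, and the toy search is a brute-force nearest-neighbour lookup standing in for the real reverse image search. The real augmentation (perspective, lighting, compression) would replace `toy_augment`.

```python
import random


def top1_accuracy(reference, augment, search, sample_size, seed=0):
    """Fraction of sampled reference images whose augmented copy is ranked first.

    reference: dict mapping reference id -> image (any object)
    augment:   callable (image, rng) -> distorted query image
    search:    callable (query image) -> list of reference ids, best match first
    """
    rng = random.Random(seed)  # fixed seed so the benchmark is repeatable
    ids = list(reference)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    if not sample:
        raise ValueError("reference database is empty or sample_size is 0")

    hits = 0
    for ref_id in sample:
        query = augment(reference[ref_id], rng)
        ranked = search(query)
        # A hit only counts if the very first result is the source image.
        if ranked and ranked[0] == ref_id:
            hits += 1
    return hits / len(sample)


# --- Toy stand-ins (assumptions, for illustration only) ---

def toy_augment(image, rng):
    """Add small random noise to every value of a tuple 'image'."""
    return tuple(p + rng.uniform(-0.05, 0.05) for p in image)


def make_toy_search(reference):
    """Return a search function that ranks ids by squared distance to the query."""
    def search(query):
        def distance(ref_id):
            return sum((a - b) ** 2 for a, b in zip(reference[ref_id], query))
        return sorted(reference, key=distance)
    return search


if __name__ == "__main__":
    # Hypothetical three-image "database"; real entries would be image files.
    db = {
        "painting_a": (0.0, 0.0, 0.0),
        "painting_b": (1.0, 1.0, 1.0),
        "painting_c": (0.0, 1.0, 0.0),
    }
    score = top1_accuracy(db, toy_augment, make_toy_search(db), sample_size=3)
    print(f"top-1 accuracy: {score:.2f}")
```

Passing `augment` and `search` as callables keeps the metric independent of how the search is invoked, so the same loop could in principle wrap either the search script or a bot client.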
Also, please consider whether this should be done via the Telegram bot or by invoking the reverse image search script directly. The first is closer to the actual metric (we deal with Telegram-compressed images, and we can spot errors in our pipeline this way); the second is faster and easier to implement. I think the second option is better, since we're benchmarking search quality and don't need to run the Telegram bot with the updated algorithm, but I'm open to discussing the pros and cons here.