openfoodfacts / openfoodfacts-ai

This is a tracking repo for all our AI projects. 🍕 🤖🍼

Build an image classifier to detect image type (front, nutrition, ingredient, packaging) #214

Open raphael0202 opened 1 year ago

raphael0202 commented 1 year ago

Rationale

In Open Food Facts, users must select a front, nutrition, ingredients, and packaging image for each product and for each supported language. This task is time-consuming, especially for multilingual products. In addition, images are often selected only for the contributor's language and not for the other supported languages. We would like to build an image classifier that automatically selects front, nutrition, ingredients, and packaging images. An "other" label should also be supported (i.e. any image that is none of these). The classifier should be robust to non-food images (and classify them as "other"), as we regularly receive spam images.

Detected labels: front, nutrition, ingredients, packaging, other.

Steps

  1. Build a dataset of selected images (image + label). Do not assume that the dataset is clean: for a non-negligible fraction of products, the selected images are incorrect. You may use heuristics (such as the OCR results we store with each image) to find these cases and clean the dataset. Split the dataset into train/test/val sets (80/10/10 is a good default split). We can then publish the cleaned dataset in Open Food Facts AI releases for reuse by other contributors and researchers. Resized versions of the images are available (e.g. https://world-fr.openfoodfacts.org/images/products/871/132/737/4171/8.400.jpg); it may be a good idea to use them to keep the dataset size reasonable, since computer vision models typically work on downscaled input images anyway. Be gentle with Open Food Facts servers: limit the number of parallel image downloads and notify us on Slack when you start bulk downloading images.
  2. Build a classifier. If you're unsure which models to try, here is a list used for another project that you may find relevant: https://openfoodfacts.github.io/robotoff/research/logo-detection/benchmark/. You can use either PyTorch or TensorFlow to train the model. timm is a good library for computer vision R&D. Text extracted with OCR may be a useful additional feature to improve model performance.
  3. If the model works well, we can integrate it into production to classify images automatically.
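
As an illustration of the 80/10/10 split in step 1, here is a minimal sketch, not an agreed-upon implementation. It assumes that all images of a product should land in the same split (so near-duplicate photos of one product do not leak between train and test), which the issue does not state. The function name `assign_split` and the sample tuples are hypothetical.

```python
import hashlib


def assign_split(barcode: str, train: float = 0.8, val: float = 0.1) -> str:
    """Map a product barcode to 'train', 'val' or 'test' deterministically."""
    # Hash the barcode so the assignment is stable across runs and machines.
    digest = hashlib.sha256(barcode.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number between 0 and 1.
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    if fraction < train:
        return "train"
    if fraction < train + val:
        return "val"
    return "test"


# Hypothetical samples: (barcode, image id, label).
samples = [
    ("8711327374171", "8", "front"),
    ("8711327374171", "9", "nutrition"),
    ("3017620422003", "1", "ingredients"),
]

splits = {"train": [], "val": [], "test": []}
for barcode, image_id, label in samples:
    # Every image of a given product goes to the split chosen for its barcode.
    splits[assign_split(barcode)].append((barcode, image_id, label))
```

Hashing the barcode rather than shuffling the image list keeps the split reproducible when new images are added later: existing products stay where they were.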

Resources

hrabkin commented 8 months ago

@raphael0202 How can I get access to all images at once, without writing a script to generate the image paths from the code?

raphael0202 commented 8 months ago

You should download images from AWS directly: https://openfoodfacts.github.io/openfoodfacts-server/api/aws-images-dataset/
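
For reference, the path scheme visible in the resized-image URL quoted in the issue (barcode 8711327374171 becomes `871/132/737/4171`) can be reproduced with a few lines. This is a sketch under assumptions: the function names are hypothetical, the 3/3/3/rest split is inferred from that single example URL, short barcodes are not handled, and whether the AWS dataset uses the same key layout should be checked against the documentation linked above.

```python
from typing import Optional

# Base URL taken from the example in the issue; treat it as an assumption.
BASE_URL = "https://world-fr.openfoodfacts.org/images/products"


def barcode_to_path(barcode: str) -> str:
    """Split a barcode into 3/3/3/rest, as in the example URL."""
    if not barcode.isdigit() or len(barcode) < 10:
        # Shorter barcodes may follow a different rule; not covered here.
        raise ValueError(f"unsupported barcode: {barcode!r}")
    return "/".join([barcode[0:3], barcode[3:6], barcode[6:9], barcode[9:]])


def product_image_url(
    barcode: str,
    image_id: str,
    size: Optional[int] = None,
    base_url: str = BASE_URL,
) -> str:
    """Build an image URL; `size` selects a resized version such as 400."""
    suffix = f".{size}" if size is not None else ""
    return f"{base_url}/{barcode_to_path(barcode)}/{image_id}{suffix}.jpg"


url = product_image_url("8711327374171", "8", size=400)
```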