openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

Vitamin detection #3808

Open aleene opened 4 years ago

aleene commented 4 years ago

What

I found a list of products with the unknown ingredient E. It seems the regexp for this vitamin needs a bit of fine-tuning.

Steps to Reproduce

See https://be.openfoodfacts.org/ingrediënt/fr:e


AcuarioCat commented 4 years ago

It goes a bit deeper than that: the problem seems to affect vitamins in different languages in different ways. For example, the first product in the list (code 3551100752117) actually had an English ingredients list with the language set to French, and almost all of its vitamins were unknown. After changing the language to English, all except B1 and B6 were recognised.

In Spanish there is also a problem that vitamins in brackets aren't recognised: https://es.openfoodfacts.org/cgi/product.pl?type=edit&code=5053827148733 In this product (one of many examples), vitamins B1 and B2 are not parsed correctly.
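To make the bracket problem concrete, here is a minimal Python sketch of one way a bracketed Spanish vitamin list could be expanded into separate ingredients. It is not the server's actual (Perl) parser: the function name `expand_vitamin_group`, both regular expressions, and the set of accepted vitamin codes are assumptions made for illustration only.

```python
import re

# Hypothetical pattern for bare vitamin codes such as A, C, D, E, K, B1, B12, PP.
VITAMIN_CODE = re.compile(r"^(?:[ACDEK]|B\d{1,2}|PP)$", re.IGNORECASE)

# Matches "vitamina (...)" or "vitaminas (...)" and captures the bracketed list.
VITAMIN_GROUP = re.compile(r"vitaminas?\s*\(([^)]*)\)", re.IGNORECASE)


def expand_vitamin_group(text: str) -> list[str]:
    """Return the items of the first bracketed vitamin list in `text`,
    prefixing bare codes (e.g. "B1") with "vitamina"."""
    match = VITAMIN_GROUP.search(text)
    if match is None:
        return []
    # Split on commas and on the Spanish conjunction "y".
    items = re.split(r"\s*,\s*|\s+y\s+", match.group(1))
    expanded = []
    for item in items:
        item = item.strip()
        if not item:
            continue
        if VITAMIN_CODE.match(item):
            # A bare code only makes sense with its "vitamina" prefix restored.
            expanded.append(f"vitamina {item.upper()}")
        else:
            # Named vitamins such as "niacina" are kept unchanged.
            expanded.append(item)
    return expanded
```

With this approach a bare "B1" inside the brackets becomes "vitamina B1", which a taxonomy lookup has a better chance of matching than the single token "B1".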

aleene commented 4 years ago

I guess we need to gather more examples, so that we can create and test a better parser.

AcuarioCat commented 4 years ago

Here is a product with a whole load of unknown ingredients: https://es.openfoodfacts.org/cgi/test_ingredients_analysis.pl?ingredients_text=Harina+hidrolizada+de+8+cereales+%2897+%25%29%2C+%28trigo%2C+maiz%2C+arroz%2C+avena%2C+cebada%2C+centeno%2C+sorgo+y+mijo%29%2C+galleta+%282%25%29+%28contiene+harina+de+trigo%29%2C+minerales+%28calcio+y+hierro%29%2C+aroma+y+vitaminas+%28C%2C+niacina%2C+E%2C+%C3%A1c.+pantot%C3%A9nico%2C+B2%2C+B6%2C+B1%2C+A%2C+%C3%A1c.+f%C3%B3lico%2C+biotina%2C+D+y+B12%29.+&action=process

stephanegigandet commented 4 years ago

@AcuarioCat : that last one is easy to solve by putting a couple of synonyms for "ác." and "ácido". -> Harina hidrolizada de 8 cereales (97 %), (trigo, maiz, arroz, avena, cebada, centeno, sorgo y mijo), galleta (2%) (contiene harina de trigo), minerales (calcio y hierro), aroma y vitaminas (C, niacina, E, ácido pantoténico, B2, B6, B1, A, ácido fólico, biotina, D y B12).
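As a rough illustration of the effect such a synonym would have, the Python sketch below expands the abbreviation in the text before parsing. This swaps a taxonomy synonym for a plain text-normalisation step, so it is not how the server does it; the `ABBREVIATIONS` table and the `expand_abbreviations` function are hypothetical, and only the "ác." to "ácido" pair comes from this thread.

```python
import re

# Hypothetical abbreviation table; only this one pair is taken from the thread.
ABBREVIATIONS = {"ác.": "ácido"}


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation with its full form when it is
    directly followed by another word."""
    for abbreviation, full_form in ABBREVIATIONS.items():
        pattern = re.compile(
            r"(?<!\w)"                  # not in the middle of a longer word
            + re.escape(abbreviation)   # the literal abbreviation, dot included
            + r"\s*(?=\w)",             # optional spaces, then a word must follow
            re.IGNORECASE,
        )
        text = pattern.sub(full_form + " ", text)
    return text
```

Applied to the product above, "ác. pantoténico" and "ác. fólico" become "ácido pantoténico" and "ácido fólico", matching the rewritten ingredient list in this comment.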

aleene commented 4 years ago

Some PP vitamins that are missed: https://be.openfoodfacts.org/ingrediënt/fr:pp

aleene commented 4 years ago

E vitamin: https://be.openfoodfacts.org/ingrediënt/en:e

github-actions[bot] commented 8 months ago

This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projects, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts