openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

How to handle ingredients in a parenthesized list #3617

Open AcuarioCat opened 4 years ago

AcuarioCat commented 4 years ago

There are many occurrences of ingredients grouped into a parenthesized list, for example (Spanish): aceites vegetales (girasol, oliva)

These are not handled correctly: the parser returns the bare individual ingredients (girasol, oliva) rather than the qualified forms (aceite de girasol, aceite de oliva).

This also applies to implied adjectives, where one noun is shared by several adjectives, for example: bicarbonatos sodico y amonico
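To make the implied-adjective case concrete, here is a minimal sketch in Python (not the server's Perl code). The expected output, one ingredient per adjective with the noun in the singular, is my assumption; the issue does not spell it out. The table `ES_SHARED_HEADS` and the function `split_shared_head` are hypothetical names used for illustration only.

```python
import re

# Hypothetical data: plural head noun -> (singular form, adjectives it may share).
ES_SHARED_HEADS = {
    "bicarbonatos": ("bicarbonato", {"sodico", "sódico", "amonico", "amónico"}),
}


def split_shared_head(text):
    """Split 'noun adj1 y adj2' into ['noun adj1', 'noun adj2'] when the noun is known."""
    text = text.strip()
    words = text.split()
    if len(words) < 2 or words[0].lower() not in ES_SHARED_HEADS:
        return [text]
    singular, adjectives = ES_SHARED_HEADS[words[0].lower()]
    rest = " ".join(words[1:])
    # Split the remainder on commas and on the standalone word "y".
    items = [s.strip() for s in re.split(r",|\by\b", rest) if s.strip()]
    # Only rewrite when every item is a known adjective; otherwise leave the text alone.
    if not items or any(i.lower() not in adjectives for i in items):
        return [text]
    return [f"{singular} {i}" for i in items]
```

With this data, `split_shared_head("bicarbonatos sodico y amonico")` gives `["bicarbonato sodico", "bicarbonato amonico"]`, and anything with an unknown noun or adjective is returned unchanged.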

Part of

aleene commented 4 years ago

I think this was solved for other languages. @stephanegigandet is this a language-dependent issue?

stephanegigandet commented 4 years ago

There is some code to do it for French in Ingredients.pm. It needs to be made more generic so that we can handle other languages. Spanish is close enough to French that it should be reasonably easy.

E.g. for French we have lists of prefixes and suffixes like this:

    my @prefixes_suffixes_list = (
        # huiles
        [
            [
                "huile", "huile végétale", "huiles végétales", "matière grasse", "matières grasses",
                "matière grasse végétale", "matières grasses végétales", "graisse", "graisse végétale",
                "graisses végétales",
            ],
            [
                "arachide", "avocat", "chanvre", "coco", "colza", "illipe", "karité",
                # ...
            ],
        ],
        # ...
    );

And "huiles (colza, tournesol et olive)" becomes "huile de colza, huile de tournesol, huile d'olive".
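As a rough illustration of that expansion, here is a minimal Python sketch (not the actual Perl in Ingredients.pm). The trimmed-down tables `FR_PREFIXES` and `FR_SUFFIXES` and the function `expand_fr` are hypothetical, and the elision rule (d' before a vowel) is a simplification.

```python
import re

# Hypothetical, trimmed-down data: accepted prefix form -> singular form to emit.
FR_PREFIXES = {
    "huiles végétales": "huile",
    "huile végétale": "huile",
    "huiles": "huile",
    "huile": "huile",
}
# Hypothetical subset of the suffixes known to combine with these prefixes.
FR_SUFFIXES = {"arachide", "avocat", "coco", "colza", "olive", "tournesol"}
VOWELS = "aeiouyéèê"

# Longest prefixes first so "huiles végétales" wins over "huiles".
_PATTERN = re.compile(
    "(" + "|".join(re.escape(p) for p in sorted(FR_PREFIXES, key=len, reverse=True)) + ")"
    r"\s*\(([^()]*)\)",
    re.IGNORECASE,
)


def expand_fr(text):
    """Rewrite 'prefix (a, b et c)' as 'prefix de a, prefix de b, prefix de c'."""

    def repl(match):
        # Split the parenthesized content on commas and on the standalone word "et".
        items = [s.strip().lower() for s in re.split(r",|\bet\b", match.group(2)) if s.strip()]
        # Leave the text untouched unless every item is a known suffix.
        if not items or any(i not in FR_SUFFIXES for i in items):
            return match.group(0)
        head = FR_PREFIXES[match.group(1).lower()]
        return ", ".join(
            head + " " + ("d'" if i[0] in VOWELS else "de ") + i for i in items
        )

    return _PATTERN.sub(repl, text)
```

With this data, `expand_fr("huiles (colza, tournesol et olive)")` returns `"huile de colza, huile de tournesol, huile d'olive"`. A list containing an unknown item is left as-is, so the parser can still fall back to treating it as sub-ingredients.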

aleene commented 4 years ago

Sounds like a role for a taxonomy. You do not want to do that in code for every language.
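One way to picture the data-driven approach: keep the language-specific parts as plain data and share a single expansion routine. The sketch below is in Python rather than the server's Perl, and everything in it (the `RULES` layout, its keys, and the `expand` function) is a hypothetical illustration, not an existing taxonomy format.

```python
import re

# Hypothetical per-language data, standing in for entries that could live in a taxonomy.
RULES = {
    "fr": {
        "and": "et",
        "link": "de ",
        "link_before_vowel": "d'",
        "prefixes": {"huiles végétales": "huile", "huiles": "huile", "huile": "huile"},
        "suffixes": {"colza", "tournesol", "olive"},
    },
    "es": {
        "and": "y",
        "link": "de ",
        "link_before_vowel": "de ",
        "prefixes": {"aceites vegetales": "aceite", "aceites": "aceite", "aceite": "aceite"},
        "suffixes": {"girasol", "oliva", "maravilla"},
    },
}


def expand(text, lang):
    """Expand 'prefix (a, b <and> c)' using only the data stored for `lang`."""
    rules = RULES[lang]
    # Longest prefixes first so the most specific form matches.
    prefixes = sorted(rules["prefixes"], key=len, reverse=True)
    pattern = re.compile(
        "(" + "|".join(map(re.escape, prefixes)) + r")\s*\(([^()]*)\)", re.IGNORECASE
    )
    splitter = re.compile(r",|\b" + re.escape(rules["and"]) + r"\b")

    def repl(match):
        items = [s.strip().lower() for s in splitter.split(match.group(2)) if s.strip()]
        # Only rewrite when every item is a known suffix for this language.
        if not items or any(i not in rules["suffixes"] for i in items):
            return match.group(0)
        head = rules["prefixes"][match.group(1).lower()]
        parts = []
        for item in items:
            link = rules["link_before_vowel"] if item[0] in "aeiouy" else rules["link"]
            parts.append(f"{head} {link}{item}")
        return ", ".join(parts)

    return pattern.sub(repl, text)
```

Here `expand("aceites vegetales (girasol, oliva)", "es")` returns `"aceite de girasol, aceite de oliva"`, and supporting another language means adding a data entry rather than new code.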

AcuarioCat commented 4 years ago

Another instance I found is the following structure: aceite de maravilla/ girasol alto oleico

I'm adding aceite de maravilla (maravilla is a South American word for sunflower). It occurs here: 7802000008412

AcuarioCat commented 4 years ago

It seems this also occurs for Portuguese, code 8480017087263: trigo e arroz extrusado
