openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

ingredients hierarchy mismatch with ingredientsText on api/v2 (sometimes) #6327

Open monsieurtanuki opened 2 years ago

monsieurtanuki commented 2 years ago

Describe the bug

In some cases (not all), a product's ingredients and ingredients_text fields do not describe the same hierarchy.

To Reproduce

  1. Go to https://world.openfoodfacts.org/api/v2/product/5701184005007/?lc=de&fields=ingredients%2Cingredients_text
  2. As a result for ingredients_text you get something like "Buttergebäck (_Weizenmehl_, Zucker, _Butter_ 26%, Speisesalz, Backtriebmittel (Ammouniumhydrogencarbonat), Invertzuckersirup, natürliches Aroma),\r\nSchokolade Mürbegebäck (_Weizenmehl_, Pflanzenfett (Palm), Zucker, Schokoladenstückchen 10% (Zucker, Kakaomasse, Kakaobutter, Emulgator (Lecithin)), Backtriebmittel (Ammouniumhydrogencarbonat), fettarmes Kakaopulver, Speisesalz)"
  3. As a result for ingredients you get a hierarchy of ingredients
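The request in step 1 can also be made from a script. The following is a minimal sketch, not an official client: `product_url` and `fetch_product` are hypothetical helper names, and the assumption that the JSON response wraps the requested fields in a top-level `product` object should be checked against the actual response.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://world.openfoodfacts.org/api/v2/product"


def product_url(barcode, lc, fields):
    """Build the api/v2 product URL used in the reproduction steps."""
    # urlencode percent-encodes the comma between field names as %2C.
    query = urlencode({"lc": lc, "fields": ",".join(fields)})
    return f"{API_ROOT}/{barcode}/?{query}"


def fetch_product(barcode, lc, fields):
    """Fetch the product (network access required)."""
    with urllib.request.urlopen(product_url(barcode, lc, fields)) as resp:
        payload = json.load(resp)
    # Assumption: the requested fields sit under a top-level "product" key.
    return payload.get("product", {})
```

Calling `fetch_product("5701184005007", "de", ["ingredients", "ingredients_text"])` should then return both fields side by side for comparison.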

Expected behavior

The hierarchy is broken inside ingredients for "Schokolade Mürbegebäck": according to ingredients_text it should be the parent of all the remaining ingredients, but it has no sub-ingredients at all:

        {
            "id": "de:Schokolade Mürbegebäck",
            "percent_estimate": 25,
            "text": "Schokolade Mürbegebäck"
        },

whereas we would expect something more like this:

        {
            "id": "de:Schokolade Mürbegebäck",
            "percent_estimate": 25,
            "text": "Schokolade Mürbegebäck",
            "ingredients": [ ... ]
        }

Which one is correct, ingredients or ingredients_text? It is hard to tell because the picture of the ingredients list is hard to read. But ingredients_text is probably correct, as it matches what the website displays at https://world.openfoodfacts.org/product/5701184005007/butter-cookies-danish-kelsin

Either way, both fields should give the same hierarchy, and currently they do not.
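One way to spot this kind of mismatch automatically is to rebuild a tree from the parentheses in ingredients_text and compare it with the ingredients array. The sketch below is only an illustration under simplifying assumptions: it is not how the server parses ingredients, the function names are hypothetical, and it matches nodes by their `text` after stripping underscores and percentages.

```python
import re


def parse_ingredients_text(text):
    """Turn 'a (b, c (d)), e' into nested {'text', 'ingredients'} nodes."""

    def make_node(name, children):
        node = {"text": name.strip()}
        if children:
            node["ingredients"] = children
        return node

    def parse_list(pos):
        items, name, children = [], "", []
        while pos < len(text):
            ch = text[pos]
            pos += 1
            if ch == "(":
                # Everything up to the matching ")" becomes sub-ingredients.
                children, pos = parse_list(pos)
            elif ch in ",)":
                if name.strip():
                    items.append(make_node(name, children))
                name, children = "", []
                if ch == ")":
                    return items, pos
            else:
                name += ch
        if name.strip():
            items.append(make_node(name, children))
        return items, pos

    return parse_list(0)[0]


def walk(nodes):
    """Yield every node of an ingredients tree, depth first."""
    for node in nodes:
        yield node
        yield from walk(node.get("ingredients", []))


def _norm(label):
    # Drop allergen underscores and percentages such as "26%".
    label = re.sub(r"\s*\d+(?:[.,]\d+)?\s*%", "", label.replace("_", ""))
    return label.strip().lower()


def flattened_parents(ingredients_text, ingredients):
    """List ingredients that have children in the text but none in the JSON."""
    json_by_text = {_norm(n.get("text", "")): n for n in walk(ingredients)}
    flattened = []
    for node in walk(parse_ingredients_text(ingredients_text)):
        match = json_by_text.get(_norm(node["text"]))
        if node.get("ingredients") and match is not None and not match.get("ingredients"):
            flattened.append(node["text"])
    return flattened
```

Applied to the example above, a text of `Schokolade Mürbegebäck (_Weizenmehl_, Pflanzenfett (Palm))` paired with the flat JSON node from this report is flagged, because the text gives it sub-ingredients and the JSON does not.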

Type of device

REST-API

stephanegigandet commented 2 years ago

Thanks for the report. There is a limit on the number of nesting levels, and it seems this product hits it. To reproduce / analyze: https://de.openfoodfacts.org/cgi/test_ingredients_analysis.pl?ingredients_text=Schokolade+M%C3%BCrbegeb%C3%A4ck+%28_Weizenmehl_%2C+Pflanzenfett+%28Palm%29%29&type=add&action=process&submit=Envoyer
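To see how deeply a given ingredients_text is nested, one can count parenthesis levels. This is a rough sketch with a hypothetical helper name; the actual value of the server's nesting limit is not stated in this thread, so the code only measures depth and does not enforce any limit.

```python
def max_nesting_depth(ingredients_text):
    """Return the deepest level of parentheses in an ingredients list."""
    depth = deepest = 0
    for ch in ingredients_text:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            # Ignore unbalanced closing parentheses instead of going negative.
            depth = max(depth - 1, 0)
    return deepest
```

For the shortened string in the test URL above, `Schokolade Mürbegebäck (_Weizenmehl_, Pflanzenfett (Palm))`, this gives 2; the full ingredients_text of the product reaches 3 because of `Emulgator (Lecithin)` inside `Schokoladenstückchen 10% (...)`.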
