Ingredients parsing bug "Ingredient A and Ingredient B (81%)" -> "Ingredient A (81%), Ingredient B (81%)"

CharlesNepote commented 1 year ago

Describe the bug

Sometimes the fruit estimate is higher than 100%, which should not be possible. E.g. https://world.openfoodfacts.org/cgi/product.pl?type=edit&code=3038354191904#ingredients

The JSON file displays 174.5 in this example.

This sometimes leads to the data quality error "Nutrition value over 105 - Fruits vegetables nuts estimate from ingredients".

stephanegigandet commented 1 year ago

The issue comes from ingredient parsing: we turn "tomato pulp and tomato puree (72%)" into "tomato pulp (72%), tomato puree (72%)", so the same percentage is counted twice.
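To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is not the actual openfoodfacts-server code; the `naive_split` function and the `AND_WITH_PERCENT` pattern are hypothetical names used only for illustration.

```python
import re

# Hypothetical pattern for "<A> and <B> (<N>%)"; an assumption for this sketch,
# not the pattern used by the real parser.
AND_WITH_PERCENT = re.compile(
    r"^(?P<a>.+?) and (?P<b>.+?) \((?P<pct>\d+(?:\.\d+)?)%\)$"
)

def naive_split(text):
    """Split "A and B (N%)" into two ingredients, copying N% onto both."""
    m = AND_WITH_PERCENT.match(text)
    if m is None:
        return [(text, None)]
    pct = float(m.group("pct"))
    # The bug: the single percentage is attached to each part.
    return [(m.group("a"), pct), (m.group("b"), pct)]

parsed = naive_split("tomato pulp and tomato puree (72%)")
total = sum(pct for _, pct in parsed if pct is not None)

print(parsed)  # [('tomato pulp', 72.0), ('tomato puree', 72.0)]
print(total)   # 144.0, although the label only declares 72%
```

Any estimate that sums these percentages ends up above 100 as soon as the shared percentage is larger than 50.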

One solution could be to make it a composite ingredient instead: "tomato pulp and tomato puree (72%) (tomato pulp, tomato puree)".
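The composite idea could be sketched as follows, in Python for illustration. The `Ingredient` class, the `parse_as_composite` function and the regular expression are assumptions made for this example and do not reflect the server's real data structures.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Ingredient:
    # Hypothetical structure: a name, an optional percentage, sub-ingredients.
    text: str
    percent: Optional[float] = None
    ingredients: List["Ingredient"] = field(default_factory=list)

# Hypothetical pattern for "<A> and <B> (<N>%)".
AND_WITH_PERCENT = re.compile(
    r"^(?P<a>.+?) and (?P<b>.+?) \((?P<pct>\d+(?:\.\d+)?)%\)$"
)

def parse_as_composite(text: str) -> Ingredient:
    """Keep "A and B" as one parent carrying N%, with A and B as children."""
    m = AND_WITH_PERCENT.match(text)
    if m is None:
        return Ingredient(text)
    a, b = m.group("a"), m.group("b")
    # The percentage stays on the parent only; the children have none.
    return Ingredient(f"{a} and {b}", float(m.group("pct")),
                      [Ingredient(a), Ingredient(b)])

composite = parse_as_composite("tomato pulp and tomato puree (72%)")
print(composite.text, composite.percent)      # tomato pulp and tomato puree 72.0
print([i.text for i in composite.ingredients])  # ['tomato pulp', 'tomato puree']
```

Because the 72% appears once, on the parent, a sum over top-level ingredients no longer counts it twice.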

benbenben2 commented 11 months ago

> The issue is from ingredient parsing, we turn "tomato pulp and tomato puree (72%)" into "tomato pulp (72%), tomato puree (72%)". One solution could be to make it a composite ingredient instead "tomato pulp and tomato puree (72%) (tomato pulp, tomato puree)".

Yes, but if the input is "tomato pulp and tomato puree (72%) (tomato 95%, water 5%)", then the output would be something like "tomato pulp and tomato puree (72%) (tomato pulp, tomato puree) (tomato 95%, water 5%)". I am not sure that is good.

I would rather simply not split it in two when there is a percentage.
(+) we will not have a problem if it is a compound/preparation
(+) it solves the percentage analysis
(-) we have to add the whole compound to the taxonomy
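A rough Python sketch of this "do not split when a percentage is present" rule, under the assumption that percentages are written as "(N%)". The `split_on_and` function and the `PERCENT` pattern are hypothetical and not taken from the server code.

```python
import re

# Hypothetical pattern for a parenthesised percentage such as "(72%)".
PERCENT = re.compile(r"\(\s*\d+(?:\.\d+)?\s*%\s*\)")

def split_on_and(text):
    """Split on " and " only when the text carries no "(N%)" percentage."""
    if PERCENT.search(text):
        # Keep the whole text as one ingredient so the percentage is counted once.
        return [text]
    return [part.strip() for part in text.split(" and ")]

print(split_on_and("tomato pulp and tomato puree (72%)"))
# ['tomato pulp and tomato puree (72%)']
print(split_on_and("tomato pulp and tomato puree (72%) (tomato 95%, water 5%)"))
# ['tomato pulp and tomato puree (72%) (tomato 95%, water 5%)']
print(split_on_and("salt and pepper"))
# ['salt', 'pepper']
```

The unsplit name only resolves to a known ingredient if the whole compound exists in the taxonomy, which is the drawback listed above.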

openfoodfacts / openfoodfacts-server

Ingredients parsing bug "Ingredient A and Ingredient B (81%)" -> "Ingredient A (81%), Ingredient B (81%)" #7816
