openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
GNU Affero General Public License v3.0

Detection of traces and allergens in Polish ingredients lists #2226

Open stephanegigandet opened 4 years ago

stephanegigandet commented 4 years ago

We currently do not support extracting allergens and traces listed in Polish ingredients lists.

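e.g. "Informacja dia alergików! Na terenie zakładu są używane: seler, soja, gorczyca, laktoza, mleko, mąka pszenna." (In English: "Information for allergy sufferers! The following are used on the premises: celery, soy, mustard, lactose, milk, wheat flour.")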

https://pl.openfoodfacts.org/product/5900477000853/warzywa-na-patelni%C4%99-hortex

TO DO: list the most frequent ways that traces and allergens are listed in Polish ingredients lists, so that we can improve the ingredient parser for Polish.
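
To make the goal concrete, here is a minimal Python sketch of the kind of extraction this would enable. It is not the server's actual ingredient parser (which is written in Perl); the function name `extract_traces` and the single hard-coded trigger phrase are assumptions for illustration, taken from the example above. A real implementation would need the full list of frequent phrasings requested in this issue.

```python
import re

# Hypothetical trigger phrase, copied from the example product above.
# A real parser would hold many such phrases per language.
TRACES_PATTERN = re.compile(
    r"na terenie zakładu są używane\s*:\s*(?P<items>[^.]+)",
    re.IGNORECASE,
)


def extract_traces(text: str) -> list[str]:
    """Return the comma-separated items that follow the trigger phrase."""
    match = TRACES_PATTERN.search(text)
    if match is None:
        return []
    # Split the captured list on commas and normalise whitespace and case.
    return [
        item.strip().lower()
        for item in match.group("items").split(",")
        if item.strip()
    ]


example = (
    "Informacja dia alergików! Na terenie zakładu są używane: "
    "seler, soja, gorczyca, laktoza, mleko, mąka pszenna."
)
print(extract_traces(example))
```

The extracted strings would still have to be mapped to allergen taxonomy entries; that step is left out here.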

Slawek234 commented 4 years ago

I reviewed a few Polish products at random. Statements about the possible presence of allergens usually start with the following text:

rare and very rare

I also found warnings of this type aimed at people with medical conditions:

Slawek234 commented 4 years ago

Allergens that I found:

Examples of other warnings that are not related to allergies: