openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

Some "Beverages" are categorized as "Non-sugared beverages" automatically (Russian name of the "sugar" ingredient is not used) #398

Open aleksejrs opened 8 years ago

aleksejrs commented 8 years ago

What

TaciteOFF commented 8 years ago

This is because the Sugared category adds itself automatically when it finds a sugar (or sugar-related) ingredient. But I guess we haven't coded that for the Russian language yet.

aleksejrs commented 8 years ago

And so what? I am talking about "Non-sugared".

aleksejrs commented 8 years ago

The "Non-sugared beverages" section from about the time it was imported http://en.wiki.openfoodfacts.org/index.php?title=Global_categories_taxonomy&oldid=4987

<en:Beverages
en:Non-sugared beverages, beverages without added sugar
es:Bebidas no azucaradas
de:Ungezuckerte Getränke
fr:Boissons non sucrées, boissons sans sucre ajouté
nl:Ongesuikerde dranken
pnns_group_2:en:Non-sugared beverages

TaciteOFF commented 8 years ago

These are two different things.

Our problem is that the detection doesn't apply to the Russian language. We need to add "сахар" to the algorithm.
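To illustrate, here is a minimal sketch in Python (not the server's actual Perl code). It assumes the detection is a simple keyword scan over the ingredients text. The keyword list, the function name beverage_category, and the category tags are all made up for the example.

```python
# Hypothetical keyword list; the real list in the server code may differ.
SUGAR_KEYWORDS = ["sucre", "sugar"]


def beverage_category(ingredients_text, keywords=SUGAR_KEYWORDS):
    """Return the category a beverage would get from a plain keyword scan."""
    text = ingredients_text.casefold()
    if any(keyword in text for keyword in keywords):
        return "en:sugared-beverages"
    # No known sugar keyword found, so the product falls into "non-sugared".
    return "en:non-sugared-beverages"


russian_cola = "Вода, сахар, диоксид углерода"

# Without a Russian keyword the sugar goes unnoticed.
print(beverage_category(russian_cola))

# With "сахар" added to the (assumed) keyword list, it is detected.
print(beverage_category(russian_cola, SUGAR_KEYWORDS + ["сахар"]))
```

With only French and English keywords, the Russian product ends up as non-sugared even though it lists sugar. That is the behaviour reported in this issue.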

Is it more clear now?

aleksejrs commented 8 years ago

Thanks, it is. I missed the word "ingredient" in your first message and thought it was about a category.

hangy commented 8 years ago

Right now, ingredients are not taxonomized, so the software does not know that "en:sugar" equals "fr:sucre" and also equals "ru:сахар". The current sugar detection appears to be hard-coded to a handful of (mostly French, partly English) ingredients. To make this work for all or most languages, it would be best to first tackle #224, and then change Food::special_process_product to use the ingredients taxonomy.
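A rough sketch of the taxonomy idea, in Python for illustration only (the server itself is Perl, and this is not how Food::special_process_product is written). The tiny taxonomy, the index layout, and the function names are assumptions. The only thing taken from this thread is that "en:sugar", "fr:sucre" and "ru:сахар" should resolve to the same entry.

```python
# Hypothetical miniature ingredients taxonomy: canonical ID -> names per language.
INGREDIENTS_TAXONOMY = {
    "en:sugar": {"en": ["sugar"], "fr": ["sucre"], "ru": ["сахар"]},
    "en:water": {"en": ["water"], "fr": ["eau"], "ru": ["вода"]},
}


def build_index(taxonomy):
    """Map (language, lowercased name) pairs to their canonical ingredient ID."""
    index = {}
    for canonical_id, names in taxonomy.items():
        for lang, synonyms in names.items():
            for name in synonyms:
                index[(lang, name.casefold())] = canonical_id
    return index


INDEX = build_index(INGREDIENTS_TAXONOMY)


def canonical_ingredients(ingredients_text, lang):
    """Resolve a comma-separated ingredient list to known canonical IDs."""
    ids = set()
    for raw in ingredients_text.split(","):
        canonical_id = INDEX.get((lang, raw.strip().casefold()))
        if canonical_id is not None:
            ids.add(canonical_id)
    return ids


def contains_sugar(ingredients_text, lang):
    """Language-independent check: look for the canonical ID, not a keyword."""
    return "en:sugar" in canonical_ingredients(ingredients_text, lang)


print(contains_sugar("Вода, сахар", "ru"))
print(contains_sugar("eau, sucre", "fr"))
print(contains_sugar("water", "en"))
```

The point of the design is that the sugar check only ever compares against one canonical ID. Supporting another language then means adding names to the taxonomy, not touching the detection code.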