openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

Categories - Pâtés et Pates are merged together #458

Open teolemon opened 8 years ago

teolemon commented 8 years ago

What

fr:pâtés and fr:pates are merged into one category when the category is typed on world.openfoodfacts.org

hangy commented 7 years ago

Let's maybe simply remove the unconditional deaccenting of tags when they are canonicalized (ProductOpener::Store::unac_string_perl or ProductOpener::Store::get_fileid)? We had other reports where this yielded similar unintended results in German.
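
To illustrate the collision, here is a minimal Python sketch. It is not the ProductOpener code: `strip_accents` and `naive_tag_id` are hypothetical helpers, and the normalization (lowercase, NFD decomposition, drop combining marks) is only an assumption about what unconditional deaccenting roughly amounts to.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_tag_id(tag: str) -> str:
    """Hypothetical canonicalization: lowercase, deaccent, spaces to dashes."""
    return strip_accents(tag.lower()).replace(" ", "-")


# Three distinct spellings end up with the same identifier.
ids = {tag: naive_tag_id(tag) for tag in ["pâtés", "pâtes", "pates"]}
collisions = set(ids.values())
```

With this kind of normalization, `collisions` contains a single value, so nothing downstream can tell the spellings apart any more.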

hangy commented 6 years ago

We should probably also use Unicode::Casing to support different languages properly (e.g. the Turkish dotted/dotless I problem)?
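
For anyone not familiar with the Turkish I problem, a small Python sketch of it follows. `lower_for_language` is a hypothetical helper written for this comment, not part of Unicode::Casing or ProductOpener, and the set of affected language codes is an assumption.

```python
# Languages assumed to use the dotted/dotless I distinction.
DOTLESS_I_LANGUAGES = {"tr", "az"}


def lower_for_language(text: str, lang: str) -> str:
    """Lowercase text, applying the Turkic I rules when lang asks for them."""
    if lang in DOTLESS_I_LANGUAGES:
        # In Turkish, capital I lowercases to dotless ı and capital İ to i.
        text = text.replace("İ", "i").replace("I", "ı")
    return text.lower()


default_result = "ISIRGAN".lower()                    # language-agnostic rule
turkish_result = lower_for_language("ISIRGAN", "tr")  # Turkic rule
```

The language-agnostic rule turns every `I` into `i`, which gives a different word in Turkish, so any canonicalization that lowercases tags would need to know the language first.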

teolemon commented 6 years ago

@hangy probably. @maddingue, are you knowledgeable about this?

stephanegigandet commented 5 years ago

For French we should keep the unaccenting, it helps in many cases, a lot of people type "boeuf" (I have no idea how to type the oe char in fact ;-) ). There are a few conflicts where 2 words that deaccent to the same string mean 2 different things, but they are very rare. One example is pâte and pâté.

hangy commented 5 years ago

One problem is that get_fileid does not have a language/country for context. äöü shouldn't be replaced for a de locale, for sure. There's just too much potential for conflict, and no one with a German keyboard layout writes "Doener" instead of "Döner".

Unconditional unaccenting of é to e for languages other than French might still cause conflicts. I honestly don't know enough about all languages to know how, e.g., a native Hungarian speaker would handle that.
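
A rough sketch of what passing the language as context could look like, in Python for brevity. Everything here is hypothetical: `UNACCENT_LANGUAGES` and `tag_id` are made-up names, and enabling only `fr` is just an assumption taken from this discussion.

```python
import unicodedata

# Assumption: unaccenting is opted into per language, French only here.
UNACCENT_LANGUAGES = {"fr"}


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tag_id(tag: str, lang: str) -> str:
    """Canonicalize a tag, deaccenting only for opted-in languages."""
    normalized = unicodedata.normalize("NFC", tag.lower())
    if lang in UNACCENT_LANGUAGES:
        normalized = strip_accents(normalized)
    return normalized.replace(" ", "-")


french_id = tag_id("Pâtés", "fr")  # accents dropped for French
german_id = tag_id("Döner", "de")  # umlaut kept for German
```

This keeps the French typing convenience while leaving äöü alone, but it does not resolve conflicts inside French itself, such as pâte and pâté.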

teolemon commented 4 years ago

We can close this one, right?

hangy commented 4 years ago

> We can close this one, right?

Depends. https://world.openfoodfacts.org/category/fr:p%C3%A2t%C3%A9s and https://world.openfoodfacts.org/category/fr:pates both redirect to https://world.openfoodfacts.org/category/pastas, as unaccenting is intentionally enabled for French: https://github.com/openfoodfacts/openfoodfacts-server/blob/e73668733e0dbb353f4b37fd29f6ded2afc8c55e/lib/ProductOpener/Config_off.pm#L124-L127