enonic / xp

Enonic XP
https://enonic.com
GNU General Public License v3.0

Missing mappings for stemmed languages #7052

Closed sigdestad closed 5 years ago

sigdestad commented 5 years ago

During documentation, the following languages seem to be missing from the code:

Add them to the standard repo mappings before the release of XP7.

GlennRicaud commented 5 years ago

After review, here are the remarks:

Actions:

Final mapping from ISO 639-1 code to analyzer (analyzers as listed at https://www.elastic.co/guide/en/elasticsearch/reference/2.4/analysis-lang-analyzer.html):

ar -> arabic
hy -> armenian
eu -> basque
bn -> bengali
pt-BR -> brazilian   // Not working: Brazilian words are treated as Portuguese ("pt"). Left in the mappings and settings in case we decide to refactor in the future.
bg -> bulgarian
ca -> catalan
zh -> cjk //3 missing CJK languages
ja -> cjk
ko -> cjk
cs -> czech
da -> danish
nl -> dutch
en -> english
fi -> finnish
fr -> french
gl -> galician
de -> german
el -> greek
hi -> hindi
hu -> hungarian
id -> indonesian
ga -> irish
it -> italian
lv -> latvian
lt -> lithuanian
no -> norwegian
  nb -> norwegian
  nn -> custom analyzer   // based on language "light_nynorsk"
fa -> persian
pt -> portuguese
ro -> romanian
ru -> russian
ku -> sorani   // Sorani is not exactly Kurdish; it is more of a subset
es -> spanish
sv -> swedish
tr -> turkish
th -> thai
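
For illustration only, here is a minimal Python sketch of how a lookup over this table could behave. It is not the XP implementation: the names `ANALYZERS` and `resolve_analyzer` are hypothetical, the analyzer name used for `nn` is a placeholder, and the sketch assumes the lookup uses only the primary language subtag. That assumption is one possible reading of the pt-BR remark above, since under it "pt-BR" always resolves to "portuguese" and the "brazilian" entry is never reached.

```python
from typing import Optional

# Language code -> analyzer name, copied from the list in this issue.
# "nn_custom" is a placeholder name (assumption) for the custom Nynorsk analyzer.
ANALYZERS = {
    "ar": "arabic", "hy": "armenian", "eu": "basque", "bn": "bengali",
    "pt-br": "brazilian",  # kept as in the issue; unreachable with the lookup below
    "bg": "bulgarian", "ca": "catalan",
    "zh": "cjk", "ja": "cjk", "ko": "cjk",
    "cs": "czech", "da": "danish", "nl": "dutch", "en": "english",
    "fi": "finnish", "fr": "french", "gl": "galician", "de": "german",
    "el": "greek", "hi": "hindi", "hu": "hungarian", "id": "indonesian",
    "ga": "irish", "it": "italian", "lv": "latvian", "lt": "lithuanian",
    "no": "norwegian", "nb": "norwegian", "nn": "nn_custom",
    "fa": "persian", "pt": "portuguese", "ro": "romanian", "ru": "russian",
    "ku": "sorani", "es": "spanish", "sv": "swedish", "tr": "turkish",
    "th": "thai",
}


def resolve_analyzer(language_tag: str) -> Optional[str]:
    """Return the analyzer for a language tag, or None if it is not mapped.

    Assumption: only the primary subtag is used, so region variants such as
    "pt-BR" or "en_US" fall back to the base language.
    """
    primary = language_tag.replace("_", "-").split("-")[0].lower()
    return ANALYZERS.get(primary)


if __name__ == "__main__":
    for tag in ("pt-BR", "nb", "ja", "xx"):
        print(tag, "->", resolve_analyzer(tag))
```

With this lookup, supporting Brazilian Portuguese separately would require matching the full tag before falling back to the primary subtag.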