Closed mephinet closed 4 years ago
Inside the schema create or update request, the $.elasticsearch and $.fields.{{fieldName}}.elasticsearch properties will allow the addition of the _meshLanguageOverride field. This field must be an object. Each key of this object must be a language used by nodes in Mesh or a comma-separated list of such languages. Each value must be the settings (for index settings) or the mapping of the field (for field mappings).
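As a rough illustration of the key format described above, the following Python sketch expands an override object into one entry per language. The function name `parse_language_override` is hypothetical, and rejecting a language that appears in more than one key is an assumption of this sketch, not documented Mesh behaviour.

```python
def parse_language_override(override):
    """Expand a _meshLanguageOverride object into {language: value}."""
    if not isinstance(override, dict):
        raise ValueError("_meshLanguageOverride must be an object")
    per_language = {}
    for key, value in override.items():
        # A key is either a single language or a comma-separated list.
        for lang in key.split(","):
            lang = lang.strip()
            if not lang:
                raise ValueError(f"empty language in key {key!r}")
            if lang in per_language:
                # Assumption: a language may be configured only once.
                raise ValueError(f"language {lang!r} configured twice")
            per_language[lang] = value
    return per_language


overrides = parse_language_override({
    "de": {"analyzer": {"my_stop_analyzer": {"type": "stop", "stopwords": "_german_"}}},
    "jp,zh,ko": {"analyzer": {"my_stop_analyzer": {"type": "stop", "stopwords": "_cjk_"}}},
})
print(sorted(overrides))  # ['de', 'jp', 'ko', 'zh']
```

Each language from a comma-separated key ends up pointing at the same value object, so a lookup by node language needs no further string handling.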
When creating or updating a valid schema with the _meshLanguageOverride field set, Mesh will create an additional index for each language found in these objects. During the schema migration, nodes of such a language will be stored in the corresponding new index; nodes whose language is not configured in the _meshLanguageOverride field go to the default index. The default index uses the settings and mappings found directly in the $.elasticsearch and $.fields.{{fieldName}}.elasticsearch properties of the schema.
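The routing rule for the migration could look roughly like this. The function name and the "<base>-<language>" index naming are hypothetical; the proposal only says that configured languages get their own index and everything else goes to the default index.

```python
def index_for_node(base_index, node_language, configured_languages):
    """Pick the index a node is written to during the schema migration.

    The "<base>-<language>" naming is an assumption made for this sketch.
    """
    if node_language in configured_languages:
        return f"{base_index}-{node_language}"
    # Languages without an override fall back to the default index.
    return base_index


configured = {"de", "jp", "zh", "ko"}
for lang in ("de", "en", "ko"):
    print(lang, "->", index_for_node("node-page", lang, configured))
```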
When searching, Mesh will query all node indices, just like before. The query will be analysed according to the index mappings, which means that the correct settings/mappings will automatically be chosen. If the user wishes to query only nodes of a specific language, the query itself must contain that constraint by querying the $.language field.
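A language constraint of this kind might be added by wrapping the user's query in an Elasticsearch bool query with a term filter on the language field. This is only a sketch: the helper name `restrict_to_language` and the field path `fields.content.basicsearch` are assumptions, and the actual path depends on how Mesh lays out the indexed document.

```python
import json


def restrict_to_language(query, language):
    """Wrap an Elasticsearch query so it only matches nodes of one language."""
    return {
        "bool": {
            "must": [query],
            # Filter context: restricts results without affecting scoring.
            "filter": [{"term": {"language": language}}],
        }
    }


# Hypothetical field path, chosen for illustration only.
user_query = {"match": {"fields.content.basicsearch": "Haus"}}
body = {"query": restrict_to_language(user_query, "de")}
print(json.dumps(body, indent=2))
```

Putting the constraint in the filter clause keeps the relevance score driven by the text match alone.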
{
  "name": "page",
  "elasticsearch": {
    "_meshLanguageOverride": {
      "de": {
        "analyzer": {
          "my_stop_analyzer": {
            "type": "stop",
            "stopwords": "_german_"
          }
        }
      },
      "jp,zh,ko": {
        "analyzer": {
          "my_stop_analyzer": {
            "type": "stop",
            "stopwords": "_cjk_"
          }
        }
      }
    },
    "analyzer": {
      "my_stop_analyzer": {
        "type": "stop",
        "stopwords": "_english_"
      }
    }
  },
  "fields": [
    {
      "name": "title",
      "type": "string",
      "elasticsearch": {
        "basicsearch": {
          "type": "text",
          "analyzer": "my_stop_analyzer"
        }
      }
    },
    {
      "name": "content",
      "type": "string",
      "elasticsearch": {
        "_meshLanguageOverride": {
          "fr": {
            "basicsearch": {
              "type": "text",
              "analyzer": "standard"
            }
          }
        },
        "basicsearch": {
          "type": "text",
          "analyzer": "my_stop_analyzer"
        }
      }
    }
  ]
}
This schema defines the my_stop_analyzer. By default, the English stop word list will be used to filter out certain words. Nodes with the language de will use a different list, and nodes with the language zh, jp or ko will use another list.
The title field uses this analyzer, which will be different for some languages as described above.
The content field uses the same analyzer. However, an exception has been made for nodes with the language fr. Here, the standard analyzer (which has no stop words) will be used instead.
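To make the walkthrough concrete, here is a small Python sketch that resolves which stop word list and which content analyzer apply to a given node language in the example schema. The helper names are hypothetical, and the sketch assumes an override replaces the default object wholesale rather than being merged into it.

```python
def effective_config(elasticsearch, language):
    """Return the settings or mapping that apply to a node language."""
    for key, override in elasticsearch.get("_meshLanguageOverride", {}).items():
        if language in [part.strip() for part in key.split(",")]:
            # Assumption: the override replaces the default, no merging.
            return override
    # No override: use everything defined directly on the property.
    return {k: v for k, v in elasticsearch.items() if k != "_meshLanguageOverride"}


def stop_analyzer(stopwords):
    return {"analyzer": {"my_stop_analyzer": {"type": "stop", "stopwords": stopwords}}}


# Schema-level settings and the content field mapping from the example.
schema_settings = {
    "_meshLanguageOverride": {
        "de": stop_analyzer("_german_"),
        "jp,zh,ko": stop_analyzer("_cjk_"),
    },
    **stop_analyzer("_english_"),
}
content_mapping = {
    "_meshLanguageOverride": {
        "fr": {"basicsearch": {"type": "text", "analyzer": "standard"}},
    },
    "basicsearch": {"type": "text", "analyzer": "my_stop_analyzer"},
}


def describe(language):
    """Return (stop word list, analyzer of the content field) for a language."""
    settings = effective_config(schema_settings, language)
    mapping = effective_config(content_mapping, language)
    return (settings["analyzer"]["my_stop_analyzer"]["stopwords"],
            mapping["basicsearch"]["analyzer"])


for lang in ("en", "de", "ko", "fr"):
    print(lang, describe(lang))
```

Under these assumptions, fr nodes keep the English stop word definition in their index settings, but the content field no longer references it because its mapping points to the standard analyzer.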
Currently, when creating/updating a schema in Gentics Mesh, a schema-wide Elasticsearch configuration can be provided, containing (among other things) filters and analyzers. This configuration can then be used, and extended, for each field. While this concept works fine for single-language projects, in multi-language projects the Elasticsearch analyzer configuration is language-dependent, cf. https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html . Therefore, the field configuration needs to allow specifying one analyzer per language, plus one fallback...