Closed missinglink closed 3 years ago
note: we may want to add some non-standard synonyms such as UK,GB
I ran a quick WOF-only build with this PR, and can confirm that it's enough to make the city autocomplete test in https://github.com/pelias/acceptance-tests/pull/537 flip to passing
Just repeating and expanding on my comment in https://github.com/pelias/schema/pull/471#issuecomment-772656955 over here:
It seems like we want to keep the basic structure of this PR, where we use synonyms to handle either the 2 or 3 letter variants of country codes, with two changes:
`parent.*_a` fields on the standard admin mapping.

If we need to add any tests to verify the scoring aspects of keeping field lengths and token positions enabled, let's do so.

Otherwise, looks like this will fix a nice case for us and shouldn't really break anything else 👍
superseded by https://github.com/pelias/schema/pull/473
Branched from https://github.com/pelias/schema/pull/471; please merge that PR first (see diff).
This PR:
`peliasAdmin` and `peliasIndexOneEdgeGram` analyzers.

This work solves the issue outlined in https://github.com/pelias/schema/pull/469. However, it could come with some unwanted side-effects, so we should discuss them before merging.
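For illustration, the 2/3-letter country code synonyms could be expressed as an Elasticsearch `synonym` token filter roughly like this. It is a minimal sketch, not the contents of this PR: the filter name `country_code_synonyms`, the helper function and the list of pairs are assumptions made for the example, and the lowercasing assumes the filter runs after a lowercase filter.

```javascript
// Hypothetical alpha-2/alpha-3 pairs; the synonyms file added by the PR is not reproduced here.
const countryCodePairs = [
  ['GB', 'GBR'],
  ['US', 'USA'],
  ['ST', 'STP'],
];

// Build an Elasticsearch `synonym` token filter definition.
// Each entry uses the comma-separated equivalence syntax, e.g. "gb,gbr".
function buildCountryCodeSynonymFilter(pairs) {
  return {
    type: 'synonym',
    synonyms: pairs.map(([alpha2, alpha3]) => `${alpha2},${alpha3}`.toLowerCase()),
  };
}

const synonymSettings = {
  analysis: {
    filter: {
      // hypothetical filter name, chosen for this example only
      country_code_synonyms: buildCountryCodeSynonymFilter(countryCodePairs),
    },
  },
};

console.log(JSON.stringify(synonymSettings, null, 2));
```

Any analyzer that lists this filter would treat both forms of a code as equivalent tokens, which is why it matters which fields share that analyzer.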
Ideally these synonyms would only be applied to the `parent.country_a` field and not other `peliasAdmin` fields such as `parent.region` (for example).

In order to accomplish that we would need to do a bit of refactoring. This may be preferable, to avoid synonyms like `ST,STP` from this new file conflicting with regions prefixed with 'Saint', for instance.

Each of the
`address_parts.*`, `name.*` etc. fields currently have their own `analyser`, but the `parent.*` fields all share a common analyser. It may be time to give each `parent.*` field its own analyser so they can be configured independently.

One other issue I noticed when developing this is that the admin partial uses
`"search_analyzer": "peliasAdmin"` when it should really use `"search_analyzer": "peliasQuery"`. This means that synonyms are being applied at both index-time and query-time.

For the
`parent.*.ngram` fields we index with `"analyzer": "peliasIndexOneEdgeGram"`, which means that the synonyms will also be added to the `name.*` fields. This could be avoided if we also had individual ngram analyzers for each of these sub-fields which were different from the main `peliasIndexOneEdgeGram` analyzer.

Let's discuss on a call.
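To make the refactoring idea concrete, here is a rough sketch of what per-field analyzers on `parent.*` might look like. The names `peliasAdmin`, `peliasIndexOneEdgeGram` and `peliasQuery` come from this thread; `peliasAdminCountryCode`, `peliasAdminCountryCodeEdgeGram`, the `adminField` helper and the exact shape of the mapping are hypothetical and only meant to illustrate the discussion.

```javascript
// Build the mapping for one admin field with its own index-time analyzers.
function adminField(indexAnalyzer, ngramAnalyzer) {
  return {
    type: 'text',
    analyzer: indexAnalyzer,         // synonyms (if any) are applied at index-time only
    search_analyzer: 'peliasQuery',  // avoids applying the admin synonyms again at query-time
    fields: {
      ngram: {
        type: 'text',
        analyzer: ngramAnalyzer,
        search_analyzer: 'peliasQuery', // assumption: same query analyzer for the sub-field
      },
    },
  };
}

const parentMapping = {
  properties: {
    // hypothetical dedicated analyzers that would carry the country code synonyms
    country_a: adminField('peliasAdminCountryCode', 'peliasAdminCountryCodeEdgeGram'),
    // other admin fields keep the shared analyzers, so `ST,STP` never touches 'Saint' regions
    region: adminField('peliasAdmin', 'peliasIndexOneEdgeGram'),
  },
};

console.log(JSON.stringify(parentMapping, null, 2));
```

With a layout like this, the country code synonyms would stay out of both the other `parent.*` fields and the `name.*` fields, at the cost of maintaining more analyzers.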