Closed missinglink closed 4 years ago
I would like to change the search_analyzer for the address_parts.* and parent.* fields, but we can do that in another PR; this one is meant to be as close to a no-op as possible.
I believe the existing search_analyzer for address_parts.street is doing more work at query time than required and is likely returning more hits than we need.
Ugh, OK, I force-pushed an update to fix the failing integration tests. The keyword datatype uses a 'normalizer' in place of an 'analyzer', so I've added one of those. As a result, the ID fields and the source/layer fields are now case-insensitive, where they were previously case-sensitive.
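For anyone following along, here is a minimal sketch of what a keyword field with a lowercasing normalizer can look like. It is written as a Python dict purely for illustration and is not the config from this PR: the normalizer name example_keyword_normalizer is hypothetical, and the mapping type wrapper that older Elasticsearch versions require is left out.

```python
import json

# Hypothetical normalizer name; the real schema may use a different name and filters.
settings = {
    "analysis": {
        "normalizer": {
            "example_keyword_normalizer": {
                "type": "custom",
                # lowercasing at index and query time is what makes matching case-insensitive
                "filter": ["lowercase"],
            }
        }
    }
}

# keyword fields accept a 'normalizer' rather than an 'analyzer'
mappings = {
    "properties": {
        "source": {"type": "keyword", "normalizer": "example_keyword_normalizer"},
        "layer": {"type": "keyword", "normalizer": "example_keyword_normalizer"},
    }
}

index_body = {"settings": settings, "mappings": mappings}
print(json.dumps(index_body, indent=2))
```

With a normalizer like this, a term query for either 'OpenStreetMap' or 'openstreetmap' would be reduced to the same token before matching.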
This PR is partially failing on ES5 because the trim filter is not supported in a normalizer until Lucene 7.3 (Elastic v6.4).
This PR ended up growing quite large and complex with many different changes all in one go. I'm going to split this up into smaller PRs so that things can be reviewed and merged individually.
This is something I've been wanting to do for a while: it explicitly sets the analyzer and search_analyzer properties for all stringy fields.
I've included a tool with this PR which can be used to list all the fields and their corresponding analyzers.
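As a rough illustration of the idea behind such a listing tool (this is a sketch, not the tool included in the PR), the snippet below walks a mapping and reports the analyzer and search_analyzer for each text field. The function name list_analyzers and the analyzer names example_street and example_admin are hypothetical.

```python
def list_analyzers(properties, prefix=""):
    """Return (field_path, analyzer, search_analyzer) for every text field."""
    rows = []
    for name, field in sorted(properties.items()):
        path = prefix + name
        if "properties" in field:
            # object field: recurse into its sub-fields
            rows.extend(list_analyzers(field["properties"], path + "."))
        elif field.get("type") == "text":
            rows.append((path, field.get("analyzer"), field.get("search_analyzer")))
    return rows


# Hypothetical mapping fragment, for illustration only.
mapping = {
    "properties": {
        "address_parts": {
            "properties": {
                "street": {
                    "type": "text",
                    "analyzer": "example_street",
                    "search_analyzer": "example_street",
                }
            }
        },
        "parent": {
            "properties": {
                "country": {"type": "text", "analyzer": "example_admin"}
            }
        },
        "source": {"type": "keyword"},
    }
}

rows = list_analyzers(mapping["properties"])
for path, analyzer, search_analyzer in rows:
    print(f"{path}: analyzer={analyzer} search_analyzer={search_analyzer}")

# Fields that leave either property implicit; a test could assert this list is empty.
missing = [path for path, analyzer, search_analyzer in rows
           if analyzer is None or search_analyzer is None]
print("missing explicit settings:", missing)
```

A check like the missing list is one way a test suite could require both properties on every text field.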
The default behaviour of elasticsearch is to default all analyzers to standard when not defined, and to default search_analyzer to the value of analyzer when only analyzer is defined.
So while it's not strictly necessary to define an explicit search_analyzer when it's equal to the analyzer, I have made this mandatory and covered it with tests. This ensures that it is considered when adding new fields or adapting existing ones.
I set out to do this as a no-op refactor (basing the search_analyzer settings on what's in the defaults for pelias/api); however, I discovered fairly quickly that the existing config is sub-optimal, so I've made the following changes: