Closed damienalexandre closed 3 years ago
Change the way I remove "ignored chars" because, as the changelog says:
Added approximately 400 non-emoji Unicode symbols such as punctuation and currency symbols.
The CLDR 38 annotations now contain many more symbols that we don't want or cannot tokenize.
The new approach is 100% dynamic and requires a running Elasticsearch node.
Tests also no longer run against the OSS version, because new versions of it are no longer shipped.
Also fix some flag issues (Scotland :wave:).
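
For readers who want a feel for what "dynamic" could mean here, below is a minimal sketch in Python, not the code from this PR. It assumes that a character counts as "ignored" when the tokenizer emits no token for it, that the node listens on `http://localhost:9200`, and that the `standard` tokenizer is the one being tested; all three are assumptions, and the function names `analyze_with_node` and `find_ignored_chars` are hypothetical. The only Elasticsearch feature it relies on is the `_analyze` endpoint.

```python
import json
import urllib.request
from typing import Callable, Iterable, List


def analyze_with_node(text: str,
                      base_url: str = "http://localhost:9200",  # assumed node address
                      tokenizer: str = "standard") -> List[str]:  # assumed tokenizer
    """Ask a running Elasticsearch node to tokenize `text` via the _analyze API."""
    body = json.dumps({"tokenizer": tokenizer, "text": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The response lists one entry per emitted token; keep only the token text.
    return [token["token"] for token in payload.get("tokens", [])]


def find_ignored_chars(candidates: Iterable[str],
                       analyze: Callable[[str], List[str]]) -> List[str]:
    """Return the candidates for which the tokenizer emits no token at all."""
    return [char for char in candidates if not analyze(char)]
```

Calling `find_ignored_chars(annotation_chars, analyze_with_node)` needs a running node; passing a stub function as `analyze` allows the filtering logic to be checked without one.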