jolicode / emoji-search

:smile: Emoji synonyms to build your own emoji-capable search engine (elasticsearch, solr, OpenSearch)
https://jolicode.com/blog/elasticsearch-icu-now-understands-emoji
MIT License

Upgrade to CLDR 38.1, remove the hardcoded list, fix flags #34

Closed damienalexandre closed 3 years ago

damienalexandre commented 3 years ago

This changes the way "ignored chars" are removed because, as the changelog says:

Added approximately 400 non-emoji Unicode symbols such as punctuation and currency symbols.

The CLDR 38 annotations now contain many more characters that we don't want or cannot tokenize.

The new approach is fully dynamic and requires a running Elasticsearch node.
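To illustrate what a dynamic check of this kind could look like, here is a minimal Python sketch. It is not the code from this PR: it assumes a node reachable at `http://localhost:9200` with the `analysis-icu` plugin installed, and the helper names `analyze_tokens` and `keep_tokenizable` are hypothetical. Each candidate symbol is sent to the `_analyze` API and kept only if the tokenizer returns it as a single, unchanged token.

```python
import json
import urllib.request
from typing import Callable, Iterable, List

# Assumption: a local Elasticsearch node with the analysis-icu plugin installed.
ES_URL = "http://localhost:9200"


def analyze_tokens(text: str, es_url: str = ES_URL) -> List[str]:
    """Return the tokens the node produces for `text` via the _analyze API."""
    body = json.dumps({"tokenizer": "icu_tokenizer", "text": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{es_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [token["token"] for token in payload.get("tokens", [])]


def keep_tokenizable(
    symbols: Iterable[str],
    analyze: Callable[[str], List[str]] = analyze_tokens,
) -> List[str]:
    """Keep only the symbols the tokenizer emits as one unchanged token."""
    return [symbol for symbol in symbols if analyze(symbol) == [symbol]]
```

Passing the `analyze` callable as a parameter keeps the filter testable without a node: a stub that mimics the tokenizer can stand in for the HTTP call.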

Tests also no longer run against the OSS version, because new versions of it are no longer shipped.

Also fixes some flag issues (Scotland :wave:).