In some cases it can be beneficial to find and tag events that contain characters or words from languages written in non-ASCII scripts (e.g. Japanese, Chinese, Farsi, Hindi).
This could be solved either as a general analyzer that tags every event in which more than a certain percentage of the characters are non-ASCII, or as an analyzer limited to a selection of specific languages and their Unicode ranges.
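
To make the two options concrete, here is a rough Python sketch of both checks on a single event message. It is only an illustration, not a proposed implementation: the function names, the tag names, the 0.3 threshold and the small selection of Unicode blocks are all assumptions, and hooking this into the actual analyzer framework is left open.

```python
from typing import Dict, List, Tuple

# Hypothetical selection of Unicode blocks (inclusive code point ranges).
# Farsi is written in the Arabic script, Hindi in Devanagari.
SCRIPT_RANGES: Dict[str, Tuple[int, int]] = {
    "hiragana": (0x3040, 0x309F),
    "katakana": (0x30A0, 0x30FF),
    "cjk": (0x4E00, 0x9FFF),        # CJK Unified Ideographs
    "arabic": (0x0600, 0x06FF),
    "devanagari": (0x0900, 0x097F),
}


def non_ascii_ratio(text: str) -> float:
    """Return the share of non-whitespace characters outside ASCII."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if ord(c) > 127) / len(chars)


def detect_scripts(text: str) -> List[str]:
    """Return the names of all configured scripts that occur in the text."""
    found = set()
    for c in text:
        code_point = ord(c)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                found.add(name)
    return sorted(found)


def tags_for_event(message: str, threshold: float = 0.3) -> List[str]:
    """Combine both ideas: a general ratio tag plus per-script tags.

    The threshold value is an arbitrary assumption and would need tuning.
    """
    tags = []
    if non_ascii_ratio(message) >= threshold:
        tags.append("non-ascii")
    tags.extend(f"script-{name}" for name in detect_scripts(message))
    return tags
```

The ratio check catches any script without configuration but also reacts to things like emoji or accented Latin text, while the range check is more precise but only finds the scripts that were listed. Note that a range only identifies a script, not a language: Chinese and Japanese kanji share the CJK block, and Farsi and Arabic share the Arabic block.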
Future ideas:
It may also be possible to add a second analyzer that translates the tagged events using a translation API or library.