[analyzer idea] Find and tag events that contain non-ASCII characters/words

In some cases it can be beneficial to find and tag events that contain characters or words from languages written in non-ASCII scripts (e.g. Japanese, Chinese, Farsi, Hindi).

This could be solved either as a general analyzer that tags all events in which the share of non-ASCII characters exceeds a certain percentage, or as an analyzer limited to a selection of specific languages and their Unicode ranges.
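To make the two options concrete, here is a minimal sketch in plain Python. It is not wired into the Timesketch analyzer interface. The names `SCRIPT_RANGES`, `non_ascii_ratio`, `detect_scripts` and `suggest_tags`, the tag names, and the 30% default threshold are all assumptions for illustration, and the listed Unicode blocks are only a small, incomplete selection.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical selection of scripts and their Unicode blocks (incomplete on purpose).
SCRIPT_RANGES: Dict[str, List[Tuple[int, int]]] = {
    "arabic": [(0x0600, 0x06FF)],      # Arabic block, also used for Farsi
    "devanagari": [(0x0900, 0x097F)],  # Devanagari block, used for Hindi
    "hiragana": [(0x3040, 0x309F)],    # Japanese
    "katakana": [(0x30A0, 0x30FF)],    # Japanese
    "cjk": [(0x4E00, 0x9FFF)],         # CJK Unified Ideographs
}


def non_ascii_ratio(text: str) -> float:
    """Return the share of characters whose code point is above 127."""
    if not text:
        return 0.0
    return sum(1 for ch in text if ord(ch) > 127) / len(text)


def detect_scripts(text: str) -> Set[str]:
    """Return the names of all configured scripts that occur in the text."""
    found: Set[str] = set()
    for ch in text:
        code_point = ord(ch)
        for name, ranges in SCRIPT_RANGES.items():
            if any(low <= code_point <= high for low, high in ranges):
                found.add(name)
    return found


def suggest_tags(message: str, threshold: float = 0.3) -> List[str]:
    """Combine both approaches: a general ratio tag plus per-script tags.

    The threshold value and the tag names are assumptions, not project defaults.
    """
    tags: List[str] = []
    if non_ascii_ratio(message) >= threshold:
        tags.append("non-ascii")
    tags.extend(f"script-{name}" for name in sorted(detect_scripts(message)))
    return tags


if __name__ == "__main__":
    for sample in ("hello world", "login from 東京", "こんにちは"):
        print(repr(sample), suggest_tags(sample))
```

The ratio check catches any script without configuration but can miss a short foreign word inside a long ASCII message, while the range check finds single characters but only for scripts that were listed explicitly. An actual analyzer could apply either or both to the event message field.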

Future idea: a second analyzer could then translate those tagged events using a translation API or library.

google/timesketch, issue #2831