google / timesketch

Collaborative forensic timeline analysis
Apache License 2.0

[analyzer idea] find and tag events that have non ascii characters/words #2831

Open jkppr opened 1 year ago

jkppr commented 1 year ago

In some cases it can be beneficial to find and tag events that contain characters or words from languages written in non-ASCII scripts (e.g. Japanese, Chinese, Farsi, Hindi).

This could be solved either as a general analyzer that tags all events in which a certain percentage of characters are non-ASCII, or as a narrower analyzer limited to a selection of specific languages and their Unicode ranges.
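
To make the two options concrete, here is a rough standalone sketch of both checks in plain Python. It deliberately does not use the Timesketch analyzer interface. The function names, the `0.3` threshold, the tag names, and the choice of Unicode blocks are all assumptions for illustration, not something decided in this issue.

```python
# Hypothetical threshold: the issue only says "a certain percentage".
NON_ASCII_THRESHOLD = 0.3

# Illustrative subset of Unicode blocks as (first, last) code points, inclusive.
# Farsi is written in the Arabic block, Hindi in the Devanagari block.
SCRIPT_RANGES = {
    "arabic": (0x0600, 0x06FF),
    "devanagari": (0x0900, 0x097F),
    "hiragana": (0x3040, 0x309F),
    "katakana": (0x30A0, 0x30FF),
    "cjk": (0x4E00, 0x9FFF),
}


def non_ascii_ratio(text):
    """Return the fraction of non-whitespace characters outside ASCII."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if ord(c) > 127) / len(chars)


def detect_scripts(text):
    """Return the names of the blocks in SCRIPT_RANGES that occur in text."""
    found = set()
    for c in text:
        code_point = ord(c)
        for name, (first, last) in SCRIPT_RANGES.items():
            if first <= code_point <= last:
                found.add(name)
    return found


def suggest_tags(text, threshold=NON_ASCII_THRESHOLD):
    """Combine both approaches into a list of (hypothetical) tag names."""
    tags = []
    if non_ascii_ratio(text) >= threshold:
        tags.append("non-ascii")
    tags.extend(f"script-{name}" for name in sorted(detect_scripts(text)))
    return tags
```

An analyzer could run something like `suggest_tags()` on the message field of each event and attach the resulting tags. A pure ratio check is cheap but will also fire on emoji or accented Latin text, while the range-based check is more specific but needs a maintained list of blocks.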

Future ideas: Maybe a second analyzer could then translate those tagged events using a translation API or library?