sunweiconfidence closed this 4 years ago
It supports full Unicode/UTF-8 and can extract text in multiple languages. See this tutorial; it's specific to Java, but note that the functionality is exposed by Tika Server, and Tika Python just inherits it.
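One way to check this on your own files is to extract the text with tika-python and look at which scripts show up in the result. The sketch below is only an illustration: `scripts_in` and `extract_text` are hypothetical helper names, the script label is simply the first word of each letter's Unicode character name (a rough heuristic, not real language detection), and `extract_text` assumes Java is installed and a Tika Server can be started or reached.

```python
import unicodedata


def scripts_in(text):
    """Return a rough set of scripts found in text.

    Heuristic (an assumption of this sketch): the script label is the
    first word of each letter's Unicode name, e.g. LATIN, CYRILLIC, CJK.
    """
    scripts = set()
    for ch in text or "":
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def extract_text(path):
    """Extract text from a file with tika-python.

    Requires Java and a reachable Tika Server; tika is imported lazily
    so that scripts_in() can be used without it.
    """
    from tika import parser

    parsed = parser.from_file(path)
    # "content" can be None when no text was extracted
    return parsed.get("content") or ""


# Example usage (the file name is hypothetical):
# text = extract_text("multilingual.pdf")
# print(sorted(scripts_in(text)))
```

If the scripts you expect are missing from the output, the cause is more likely the file itself (for example scanned images without OCR) than the text encoding.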
Thanks, @chrismattmann.
Hi @chrismattmann,
I have a question: how many languages does tika-python support when parsing attachments? I ask because I haven't yet tested parsing files that contain content in many different languages. Thanks.