tballison / tika-gui-v2

Unofficial user interface for Apache Tika
Apache License 2.0

Set LC_ALL on unix systems #86

Closed tballison closed 1 year ago

tballison commented 1 year ago

On Linux systems where LC_CTYPE is not set to a Unicode-friendly value (e.g. C instead of C.UTF-8), the FileSystemPipesIterator fails. We can fix this by setting LC_ALL in the environment of the spawned processes. I've tested that this setting is actually inherited by the AsyncProcessorCLI and the processes it spawns.
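A minimal sketch of the idea, assuming the child processes are launched with `java.lang.ProcessBuilder`. The class name `LocaleEnvSketch`, the helper `applyUtf8Locale`, the placeholder command, and the value `C.UTF-8` are illustrative assumptions, not the actual tika-gui-v2 code:

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

class LocaleEnvSketch {

    // Hypothetical helper (not from tika-gui-v2): adds LC_ALL to a child
    // process environment unless the OS name looks like Windows.
    static void applyUtf8Locale(Map<String, String> env, String osName) {
        String os = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        if (!os.startsWith("windows")) {
            // Assumed value; C.UTF-8 is not available on every system.
            env.put("LC_ALL", "C.UTF-8");
        }
    }

    public static void main(String[] args) {
        // Placeholder command; the real launcher's command line is not shown here.
        ProcessBuilder pb = new ProcessBuilder(List.of("java", "-version"));
        applyUtf8Locale(pb.environment(), System.getProperty("os.name"));
        // A child started via pb.start() receives this environment and,
        // unless it overrides it, passes it on to its own children.
        System.out.println("LC_ALL for child: " + pb.environment().get("LC_ALL"));
    }
}
```

LC_ALL takes precedence over LC_CTYPE, so setting it in the child environment should be enough even when the parent shell has LC_CTYPE=C. Systems that lack the C.UTF-8 locale may need a different value, such as en_US.UTF-8.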

tballison commented 1 year ago

Symptom was this in logs/client-*.log:


```
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: ??????????????????????????????.txt
```