chrismattmann / tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Apache License 2.0

classpath functionality is broken on Windows 10 #327

Closed: mirrord closed this issue 3 years ago

mirrord commented 3 years ago

The startServer function attempts to concatenate the given classpath to the tika jar path with a colon. This is appropriate for Linux, but not for Windows where the correct character is the semicolon.

The problem lies (in part) on line 639 of tika.py:

    if classpath:
        classpath += ":" + tikaServerJar

It should instead be:

    if classpath:
        if Windows:
            classpath += ";" + tikaServerJar
        else:
            classpath += ":" + tikaServerJar
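
As an alternative to branching on the platform, Python's standard library exposes the platform's path-list separator as `os.pathsep`. Below is a minimal sketch of that idea. The helper `build_classpath` and its parameter names are hypothetical and not part of tika-python, and returning the jar path alone when no classpath is given is an assumption made for this example.

```python
import os


def build_classpath(classpath, tika_server_jar, sep=os.pathsep):
    """Join an optional user classpath with the Tika server jar path.

    Hypothetical helper for illustration only, not tika-python's API.
    `sep` defaults to os.pathsep, which is ";" on Windows and ":" on POSIX.
    """
    if classpath:
        return classpath + sep + tika_server_jar
    # Assumption: with no extra classpath, the jar path is used on its own.
    return tika_server_jar
```

Passing `sep` explicitly makes the helper easy to check on any platform, for example `build_classpath("a.jar", "tika.jar", sep=";")` for the Windows case.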

The second half of this issue is that an unquoted semicolon on the command line causes CLI parsing errors, so the whole classpath value must be wrapped in quotes.
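
The quoting point can be sketched in two ways. The function names below are hypothetical and this is not how tika.py builds its command; it only illustrates the idea under those assumptions. The first form returns an argument list, which should not need manual quoting when handed to `subprocess.Popen` without `shell=True`. The second wraps the classpath in double quotes for a command string.

```python
def build_java_command(classpath, main_class, java_path="java"):
    """Return the command as an argument list (hypothetical helper).

    Each list element is passed as a single argument when the list is
    given to subprocess.Popen without shell=True.
    """
    return [java_path, "-cp", classpath, main_class]


def build_java_command_string(classpath, main_class, java_path="java"):
    """Return the command as one string with the classpath double-quoted.

    Assumption: the classpath itself contains no double-quote characters;
    this sketch does not escape them.
    """
    return '%s -cp "%s" %s' % (java_path, classpath, main_class)
```

The argument-list form avoids the quoting question altogether, at the cost of changing how the command is launched. The string form is the smaller change if the caller already works with a single command string.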

chrismattmann commented 3 years ago

Fixed in https://github.com/chrismattmann/tika-python/commit/47aebcbe1c7d3f3a697d27fc64953f81eba0e4fa