chrismattmann / tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Apache License 2.0

Problem when using the application in a Sikuli environment. #215

Closed BorjaDiago closed 5 years ago

BorjaDiago commented 5 years ago

When I test the application from the terminal, everything works correctly, but when I run the same lines of code in a Sikuli environment, the server fails to start. At first I thought the cause was that the server had already been started from the terminal, but after closing it and running the script again it still fails. I have attached the code and the error trace below and hope you can help me. Thank you very much for the great job you are doing!

Code

import sys
if "C:\\Python27\\Lib\\site-packages" not in sys.path:
    sys.path.append("C:\\Python27\\Lib\\site-packages")

# Imports
from tika import parser

wait(2)
raw = parser.from_file('C:\\Users\\bdiago\\Desktop\\1CC1878_20181001_Ejecucion_BRS_4Sight-Accenture_Q4_2018.pdf')
print(raw['content'].encode('UTF-16'))

Log

2019-01-10 18:11:34,737 [Thread-10 ] [WARNI] Failed to see startup log message; retrying...

2019-01-10 18:11:39,755 [Thread-10 ] [WARNI] Failed to see startup log message; retrying...

2019-01-10 18:11:44,767 [Thread-10 ] [WARNI] Failed to see startup log message; retrying...

2019-01-10 18:11:49,772 [Thread-10 ] [ERROR] Tika startup log message not received after 3 tries.
2019-01-10 18:11:49,772 [Thread-10 ] [ERROR] Failed to receive startup confirmation from startServer.

[error] script [ testPdf ] stopped with error at line --unknown--
[error] Error caused by: Traceback (most recent call last):
File "C:\Sikuli\testPdf.sikuli\testPdf.py", line 10, in <module>
raw = parser.from_file('C:\\Users\\bdiago\\Desktop\\1CC1878_20181001_Ejecucion_BRS_4Sight-Accenture_Q4_2018.pdf')
File "C:\Python27\Lib\site-packages\tika\parser.py", line 36, in from_file
jsonOutput = parse1('all', filename, serverEndpoint, headers=headers, config_path=config_path)
File "C:\Python27\Lib\site-packages\tika\tika.py", line 327, in parse1
status, response = callServer('put', serverEndpoint, service, open(path, 'rb'),
File "C:\Python27\Lib\site-packages\tika\tika.py", line 522, in callServer
serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
File "C:\Python27\Lib\site-packages\tika\tika.py", line 580, in checkTikaServer
raise RuntimeError("Unable to start Tika server.")
RuntimeError: Unable to start Tika server.

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "C:\Sikuli\sikulix.jar\Lib\atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "C:\Sikuli\sikulix.jar\Lib\threading.py", line 297, in _MainThread__exitfunc
t.join()
File "C:\Sikuli\sikulix.jar\Lib\threading.py", line 128, in join
raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
Error in sys.exitfunc:
Traceback (most recent call last):
File "C:\Sikuli\sikulix.jar\Lib\atexit.py", line 24, in _run_exitfuncs
File "C:\Sikuli\sikulix.jar\Lib\threading.py", line 297, in _MainThread__exitfunc
File "C:\Sikuli\sikulix.jar\Lib\threading.py", line 128, in join
RuntimeError: cannot join current thread
chrismattmann commented 5 years ago

Does Sikuli provide Java? It looks like tika-python could not find the Tika server startup message in the log file, so I'm wondering whether a Java runtime is actually available to the script in that environment.
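One way to check the Java question from inside the script itself is to look for a `java` executable on the `PATH` that the script sees. The following is only a minimal sketch: `find_on_path` is a hypothetical helper written for this illustration, not part of tika-python or Sikuli, and it assumes that the Tika server is launched through whatever `java` is found on `PATH`.

```python
import os


def find_on_path(name, path=None):
    """Return the first file called `name` found on PATH, or None.

    Hypothetical helper for illustration; not part of tika-python.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    # Always try the usual Windows suffixes as well: under Jython, os.name
    # is "java", so it cannot be used to detect Windows reliably.
    suffixes = ("", ".exe", ".cmd", ".bat")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for suffix in suffixes:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


java = find_on_path("java")
if java:
    print("java found at: %s" % java)
else:
    print("java is not on the PATH seen by this script")
```

Running the same snippet once from the terminal and once from Sikuli would show whether the two environments see the same `PATH`.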

chrismattmann commented 5 years ago

Is this related to #203?

BorjaDiago commented 5 years ago

Hi @chrismattmann. As its documentation says: "SikuliX is a Java application, that works on Windows XP+, Mac 10.6+ and most Linux/Unix systems (with 1.1.4+ only 64-Bit systems). For Windows, Mac it is complete and should normally work out of the box. For Linux/Unix systems there are a few prerequisites to be setup." The scripting language that I use is Python, language level 2.7 (supported by Jython).
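Since the script runs under Jython inside SikuliX, one option that might sidestep the startup step is to start the Tika server separately and let tika-python act only as a REST client. The sketch below assumes that the installed tika-python version honors the `TIKA_CLIENT_ONLY` and `TIKA_SERVER_ENDPOINT` environment variables described in its README, and that a server is already listening on `http://localhost:9998`. The URL and the helpers `configure_client_only` and `extract_text` are assumptions made for illustration, and the thread does not say whether this is how the problem was eventually solved.

```python
import os

# Assumption: a Tika server was started by hand beforehand and listens here.
TIKA_URL = "http://localhost:9998"


def configure_client_only(endpoint=TIKA_URL):
    """Ask tika-python to use an existing server instead of launching Java."""
    # Set these before importing tika, which reads them at import time.
    os.environ["TIKA_CLIENT_ONLY"] = "True"
    os.environ["TIKA_SERVER_ENDPOINT"] = endpoint


def extract_text(path, endpoint=TIKA_URL):
    """Parse `path` through the already-running server and return its text."""
    configure_client_only(endpoint)
    # Imported late so that the environment settings above take effect.
    from tika import parser
    parsed = parser.from_file(path, serverEndpoint=endpoint)
    return parsed.get("content")
```

With a server running, `extract_text('C:\\path\\to\\file.pdf')` would return the extracted text, or `None` if the server returned no content.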

chrismattmann commented 5 years ago

Sorry, I hope you got it running. It doesn't look like there has been much activity here.

BorjaDiago commented 4 years ago

Yes, I finally got the results that I expected. Thank you so much. Regards.