Closed. stephen-farris-jhuapl-edu closed this issue 4 years ago.
I ran into the same issue. Thanks to @stephen-farris-jhuapl-edu's comment, downgrading to 1.23 worked for me.
We have a fix for this in #280, and I'll be applying it shortly. Thank you. I can push a 1.23.2 this week to release it.
Same issue here. Looking forward to a fix!
fixed in #280
I installed tika through Anaconda today and I am getting the "AttributeError: module 'os' has no attribute 'setsid'" exception. I'm on Python 3.6 on Windows 10.
It seems 1.23.2 is not released yet. May I know when it will be released?
hi @garyng I'll try and release it this week.
Hi, although I downgraded tika to 1.23, the same setsid issue still occurred. Any suggestions until the issue is fixed?
You need to downgrade to 1.23.0 until I make the updated release.
Same here; can't upgrade nor downgrade in Anaconda Nav.
@rafaleo this has been pushed in 1.24 and should be good; upgrade now.
I'm using tika 1.23 successfully on Python 3.7.4 on one Windows 10 machine. However, I installed tika 1.23.1 (the latest version) on another Windows 10 machine running Python 3.8.1, and I get an exception when I try to parse files. For example:
tika.parser.from_file("PATH_TO_MY_PDF_FILE.pdf")
results in this exception: AttributeError: module 'os' has no attribute 'setsid'.
(NOTE: I am initializing the VM before making this call.) I dug into the tika source code and found the offending line of code in tika.py (line 666):
TikaServerProcess = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True, preexec_fn=os.setsid)
The offending line references os.setsid, but setsid does not exist in the os module on Windows, per the docs: https://docs.python.org/3.8/library/os.html
I searched through the tika commit history on GitHub and found that this issue was introduced in this commit: https://github.com/chrismattmann/tika-python/blob/431f024d9f0862599421c27afec9076ecf29c2c3/tika/tika.py.
Prior to the aforementioned commit, the line of code in question (line 665 at the time) looked like this, with no reference to os.setsid:
cmd = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True)
Here's the diff that shows where the issue was introduced: https://github.com/chrismattmann/tika-python/commit/431f024d9f0862599421c27afec9076ecf29c2c3#diff-79bb8c4ed90a3c7e927d1091e49a6680
This issue is preventing me from using the current version of tika on Windows. I'm going to have to downgrade to version 1.23 until this is fixed.
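For anyone who needs to patch this locally in the meantime, here is a minimal sketch of a platform guard around the call reported above. It is only an illustration and not necessarily what #280 does. The helper names session_kwargs and start_process are hypothetical and do not exist in tika.py. The idea is to pass preexec_fn=os.setsid only when the os module actually provides setsid:

```python
import os
from subprocess import Popen, STDOUT


def session_kwargs():
    """Return extra Popen keyword arguments that are safe on this platform.

    Hypothetical helper: os.setsid is missing on Windows, so it is only
    passed as preexec_fn when the attribute exists.
    """
    if hasattr(os, "setsid"):
        return {"preexec_fn": os.setsid}
    return {}


def start_process(cmd_string, log_file):
    """Start a shell command, mirroring the Popen call quoted in this issue.

    Hypothetical helper: stdout and stderr both go to log_file.
    """
    return Popen(
        cmd_string,
        stdout=log_file,
        stderr=STDOUT,
        shell=True,
        **session_kwargs()
    )
```

On Windows this falls back to the older call with no preexec_fn; elsewhere it keeps the new-session behaviour.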