dongnizh closed this 9 years ago
thanks for filing this @dongnizh ! how are you doing in Seattle?
Hi @chrismattmann, hope you are doing well. Today is my first day at work. It is kind of different from what I did at school, and I still need much more time to adapt to the new environment. ^_^
hang in there and keep me posted @dongnizh
@chrismattmann Of course!!
@dongnizh are we still seeing this error?
More info on this: someone reported it against Python's httplib: https://bugs.python.org/issue23054 It is still open as of December 2014.
Overall, this issue has to do with a PUT request being made on Windows. Since Tika Server uses PUT requests nearly everywhere, this causes the issue, but only on Windows.
See my comment on: http://bugs.python.org/issue23054
Hi @chrismattmann, I will look into this.
Thanks @dongnizh !
Hey @dongnizh I think this error has to do with the fact that you have a bogus tika-server.jar in your temp folder. And for whatever reason on Windows it doesn't seem to remove the C:\Users
See my fix for #54 and #56. @dongnizh, let me know if that fixes this. We could probably add some more robust code here in #44 to verify the downloaded tika-server jar against a published checksum. For example, we could check in getRemoteFile whether there is a corresponding .md5 file for the URL and, if so, use it to test whether the download was successful. For example, https://repo1.maven.org/maven2/org/apache/tika/tika-server/1.9/tika-server-1.9.jar.md5 exists, so we could test the 1.9 jar against it.
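A rough sketch of what that check could look like, in case it helps. The helper names `md5_of_file` and `matches_md5` are hypothetical and not existing tika-python functions, and I am assuming the .md5 file contains the hex digest, optionally followed by a file name. Fetching the .md5 text is left to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large jar is never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, md5_text):
    """Check a downloaded file against the text of its .md5 file.

    Assumption: `md5_text` is either just the hex digest or
    "digest  filename"; only the first token is compared.
    """
    parts = md5_text.split()
    if not parts:
        # An empty .md5 response cannot confirm anything.
        return False
    return md5_of_file(path) == parts[0].lower()
```

If `matches_md5` returns False, the caller could delete the jar from the temp folder and download it again rather than trying to start the server from a bad file.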
@chrismattmann Thanks for your update. Will try the new version on Windows and update the result later.
Btw I implemented the md5 check
@chrismattmann When I run tika-python to parse a file on Windows (actually a Windows virtual machine), it shows: However, when I run a command like "python tika.py config mime-types", it works. This is one link I have found so far on this problem: https://github.com/kennethreitz/requests/issues/2364
Please have a look.