Closed Tooa closed 4 years ago
Thanks @Tooa, fantastic. Looks like a legit issue with the doco and/or the expected behavior. Care to submit a simple PR that fixes .from_file to work as expected? Thanks for the fantastic issue and report.
@chrismattmann Thank you. I proposed a change including a test case as PR.
Summary
Setting a serverEndpoint different from localhost throws an exception. See the snippet below.
Docker image: logicalspark/docker-tikaserver:latest (1.23)
tika-python version: 1.23
Steps to reproduce
tika-python installed
docker-compose.yml example:

    services:
      tika-server:
        image: logicalspark/docker-tikaserver:latest
        networks:
          net:
            ipv4_address: 10.5.0.5
    networks:
      net:
        driver: bridge
        ipam:
          config:
Expected Behaviour
tika-python returns the result from the dockerized Apache-Tika server when using parser.from_file.
Actual Behaviour
ConnectionError is thrown.
The warning "2020-01-09 18:29:43,520 [MainThread ] [WARNI] config option must be one of meta, text, or all; using all." is printed.
tika-python calls localhost no matter what.
Call used: parser.from_file('file', 'http://10.5.0.5:9998')
Analysis and Details
parser.from_buffer() works as expected.
The signature of parser.from_file contains: def from_file(filename, service='all', serverEndpoint=ServerEndpoint ...
When calling parser.from_file('file', 'http://10.5.0.5:9998') as specified in the documentation, the variable service gets assigned http://10.5.0.5:9998 and serverEndpoint defaults to ServerEndpoint, which is localhost.
Because the URL is not a valid service value, this also explains the warning: 2020-01-09 18:29:43,520 [MainThread ] [WARNI] config option must be one of meta, text, or all; using all.
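To make the argument binding concrete, here is a minimal sketch using a stand-in function with the same parameter order as the signature quoted above. `from_file_stub` and `DEFAULT_ENDPOINT` are hypothetical names for illustration only and are not part of tika-python. The stub contacts no server and only reports where each argument ended up.

```python
# Stand-in for the quoted signature; this is NOT the real tika-python code.
DEFAULT_ENDPOINT = "http://localhost:9998"  # assumed default, for illustration


def from_file_stub(filename, service="all", serverEndpoint=DEFAULT_ENDPOINT):
    """Report how the arguments were bound instead of contacting a server."""
    if service not in ("meta", "text", "all"):
        # Mimics the fallback described by the warning in the report.
        service = "all"
    return {"service": service, "serverEndpoint": serverEndpoint}


# Positional call, as in the report: the URL is bound to `service`,
# so `serverEndpoint` silently keeps its localhost default.
positional = from_file_stub("file", "http://10.5.0.5:9998")

# Keyword call: the URL is bound to `serverEndpoint` as intended.
keyword = from_file_stub("file", serverEndpoint="http://10.5.0.5:9998")

print(positional)
print(keyword)
```

If the real function accepts the same parameter names, passing serverEndpoint by keyword should sidestep the ordering problem regardless of where the parameter sits in the signature.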
Workaround