sirigg98 closed this issue 1 year ago
@sirigg98, if I'm not mistaken, you're running GROBID on Windows. Please be advised that Windows is not supported. See here. That could be the cause of the BAD_INPUT_DATA error.
As stated in the documentation, the best solution is to run GROBID via Docker and use the Python client from your Windows computer.
Hey @lfoppiano,
I'm using the docker image. The command line code is:
docker pull lfoppiano/grobid:0.7.0
docker run -t --rm --init lfoppiano/grobid:0.7.0
However, I run into the ConnectionError exception outlined above when using the Python client. Any suggestions?
Hi @sirigg98, when you run the docker image you also have to map the port correctly, using something like: -p 8070:8070
See the command here.
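Combining the run command above with the suggested mapping would give something like `docker run -t --rm --init -p 8070:8070 lfoppiano/grobid:0.7.0`. To confirm from the host that the port is actually reachable before calling the Python client, a minimal sketch using only the standard library could look like the following. The helper name `is_port_open` is hypothetical, and `localhost` assumes Docker is running on the same machine.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8070 is the port mentioned in this thread; the host is an assumption.
    if is_port_open("localhost", 8070):
        print("Port 8070 is reachable; the client should be able to connect.")
    else:
        print("Port 8070 is not reachable; check the -p 8070:8070 mapping.")
```

This only checks that something accepts TCP connections on the port. It does not verify that the service behind it is GROBID or that it has finished starting up.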
Thanks a ton @lfoppiano! This seems to have done the trick. Closing this issue now.
Hi!
I am trying to use the processHeaderDocument service (Python 3.9.7, Windows 10) and am testing it with the following curl request:
curl -v --form input=C:\Users\Downloads\test\test.pdf localhost:8070/api/processHeaderDocument
and I keep getting the following message:
Trying 127.0.0.1:8070...
Connected to localhost (127.0.0.1) port 8070 (#0)
POST /api/processHeaderDocument HTTP/1.1
Host: localhost:8070
User-Agent: curl/7.83.1
Accept: */*
Content-Length: 182
Content-Type: multipart/form-data; boundary=------------------------cdbb7a56da07d8aa
We are completely uploaded and fine
Mark bundle as not supporting multiuse
HTTP/1.1 500 Internal Server Error
Date: Tue, 16 Aug 2022 15:34:00 GMT
Content-Type: application/xml
Content-Length: 64
[BAD_INPUT_DATA] PDF to XML conversion failed with error code: 1
Connection #0 to host localhost left intact
Using the Python client, I am running:
client = GrobidClient(config_path="D:\git repo\grobid_client_python\config.json")
GROBID server is up and running
client.process_pdf("processHeaderDocument", r"C:\Users\F0064WK\Downloads\test\eur_franses_AE73 (1).pdf", consolidate_header=False, generateIDs=False, consolidate_citations=False, include_raw_citations=False, include_raw_affiliations=False, tei_coordinates=False, segment_sentences=False)
Traceback (most recent call last):
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Could I get some clarity on this, please? Thanks for the help, and the great service!
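One detail in the curl trace may be worth checking: the request reports Content-Length: 182, which is small for a PDF. With curl, `--form input=C:\...` sends the path as a plain text value, whereas `--form input=@C:\...` uploads the file contents. Whether this explains the BAD_INPUT_DATA response here is an assumption and is not confirmed in the thread. As an illustration of what a real file upload looks like on the wire, here is a minimal standard-library sketch. The helper names `build_multipart` and `post_pdf` are hypothetical, and the `application/pdf` part type is an assumption; only the `input` field name and the endpoint path come from the request above.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    # The file bytes themselves travel in the body, not the path string.
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def post_pdf(url: str, pdf_path: str, timeout: float = 60.0) -> str:
    """POST the PDF at pdf_path to url and return the response text."""
    path = Path(pdf_path)
    body, content_type = build_multipart("input", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call such as `post_pdf("http://localhost:8070/api/processHeaderDocument", r"C:\Users\Downloads\test\test.pdf")` would then send the document itself, so the Content-Length should be roughly the size of the PDF plus a few hundred bytes of multipart framing.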