axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data
Apache License 2.0
5.72k stars, 304 forks

Followed Instructions, but ended up with Connection Error #663

Closed maxloosmu closed 1 year ago

maxloosmu commented 1 year ago

Summary
There are only 3 files in my directory for testing Parsr:

Steps To Reproduce
This is the code in test.py:

    import os
    from parsr_client import ParsrClient

    parsr = ParsrClient('http://localhost:3001')
    input_file = 'sample.pdf'

    parsr.send_document(
        file_path=input_file,
        config_path='defaultConfig.json',
        document_name='The Readme',
        save_request_id=True)

    text = parsr.get_text()
    output_file = 'sample.txt'
    with open(output_file, 'w') as f:
        f.write(text)
    print(f"Text extracted and saved to {output_file}")

Expected behavior
sample.txt should be created in the root directory.

Actual behavior
In one terminal window, Docker is running:

Starting par.sr API : node api/server/dist/index.js
[2023-04-13T05:57:25] INFO (parsr-api/7 on 6937a9f88c7e): Api listening on port 3001!

In another terminal window, I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x11d5dd9c0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='http', port=80): Max retries exceeded with url: //localhost:3001/api/v1/document (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x11d5dd9c0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/maxloo/usecases/vue/parsr/test.py", line 12, in <module>
    document_id = parsr.send_document(input_file_path, default_config_path)
  File "/usr/local/lib/python3.10/site-packages/parsr_client/parsr_client.py", line 111, in send_document
    r = post(
  File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='http', port=80): Max retries exceeded with url: //localhost:3001/api/v1/document (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x11d5dd9c0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

maxloosmu commented 1 year ago

Besides using a Mac, I've also just tried running Parsr using Windows 10 WSL2 with Docker, but the results are the same. I even tried changing parsr = ParsrClient('localhost:3001') to parsr = ParsrClient('localhost:80').

maxloosmu commented 1 year ago

It's OK, I managed to get it to work using:

    parsr = ParsrClient('localhost:3001/')

However, there's another error, which I'll post in another thread if I can't resolve it.
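
For anyone landing here with the same traceback: the `host='http', port=80` and `url: //localhost:3001/api/v1/document` parts of the error suggest that the client adds `http://` in front of whatever server string it is given, so passing `'http://localhost:3001'` ends up as a doubled scheme. The sketch below only illustrates that parsing behavior with the standard library. `build_url` is a hypothetical stand-in for what the client appears to do (an inference from the traceback, not taken from the parsr_client source), and `strip_scheme` is a hypothetical helper, not part of the library.

```python
from urllib.parse import urlsplit


def build_url(server: str) -> str:
    # Hypothetical stand-in: assumes the client prepends "http://" to the
    # server string (inferred from the traceback, not from the library code).
    return f"http://{server}/api/v1/document"


def strip_scheme(server: str) -> str:
    # Hypothetical helper: drop a leading scheme and any trailing slash so
    # the value is a bare "host:port" string.
    for prefix in ("http://", "https://"):
        if server.startswith(prefix):
            server = server[len(prefix):]
            break
    return server.rstrip("/")


# With the scheme included, the URL has a doubled "http://" and the host
# is parsed as "http", with the real address pushed into the path.
broken = urlsplit(build_url("http://localhost:3001"))
print(broken.hostname, broken.port, broken.path)

# With a bare host:port, the URL parses as intended.
fixed = urlsplit(build_url(strip_scheme("http://localhost:3001")))
print(fixed.hostname, fixed.port, fixed.path)
```

If that assumption holds, it would explain why `ParsrClient('localhost:3001/')` works while `ParsrClient('http://localhost:3001')` fails with a name resolution error for the host `http`.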