Open wildegnux opened 10 months ago
Same here i.e. ValueError: 'api.pwnedpasswords.com' does not appear to be an IPv4 or IPv6 address
(funny coincidence that I just got the same error as this issue created a few hours ago..!)
Thanks @threatpatrols for creating this tool! Cheers
Thanks for the report(s) @gregsadetsky and @wildegnux
I've not had a chance to look into this yet, however I've been able to compare Python 3.10 vs 3.11 and somewhere in between something has broken :(
My current limited check is that I've just used hibp-downloader
under Python 3.10 to update my own local --data-path
to see if the latest reported Naz.API dataset is included yet - the tool works fine under 3.10
Not entirely sure the root cause is a DNS lookup issue yet, as I get slightly different messages in my 3.11 test environment
2024-01-19T01:17:44+1000 | INFO | hibp-downloader | HIBP Downloader: v0.2.1
2024-01-19T01:17:44+1000 | INFO | hibp-downloader | data-path '/home/ndejong/test/data'
2024-01-19T01:17:44+1000 | INFO | hibp-downloader | metadata-path '/home/ndejong/test/data'
2024-01-19T01:17:45+1000 | INFO | hibp-downloader | Created 4 worker processes to consume a queue of prefix-hash values.
2024-01-19T01:17:52+1000 | INFO | hibp-downloader | prefix=000ef source=[lc:0 et:0 rc:249 ro:0 xx:1] processed=[4.4MB ~24824H/s] api=[55req/s 4.4MB] runtime=0.2min
2024-01-19T01:18:01+1000 | INFO | hibp-downloader | prefix=001f3 source=[lc:0 et:0 rc:499 ro:0 xx:1] processed=[8.8MB ~26053H/s] api=[58req/s 8.8MB] runtime=0.3min
Process Process-4:
Traceback (most recent call last):
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/streams/tls.py", line 140, in _call_sslobject_method
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/ssl.py", line 921, in read
v = self._sslobj.read(len)
^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:2580)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/_core/_tasks.py", line 115, in fail_after
yield cancel_scope
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 34, in read
return await self._stream.receive(max_bytes=max_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/streams/tls.py", line 205, in receive
data = await self._call_sslobject_method(self._ssl_object.read, max_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/streams/tls.py", line 147, in _call_sslobject_method
data = await self.transport_stream.receive()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1123, in receive
await self._protocol.read_event.wait()
File "/usr/lib/python3.11/asyncio/locks.py", line 213, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7905d01b4790
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
yield
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 32, in read
with anyio.fail_after(timeout):
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/anyio/_core/_tasks.py", line 118, in fail_after
raise TimeoutError
TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
yield
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 366, in handle_async_request
resp = await self._pool.handle_async_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 268, in handle_async_request
raise exc
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 251, in handle_async_request
response = await connection.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 103, in handle_async_request
return await self._connection.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 185, in handle_async_request
raise exc
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 148, in handle_async_request
status, headers = await self._receive_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 292, in _receive_response
event = await self._receive_stream_event(request, stream_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 333, in _receive_stream_event
await self._receive_events(request, stream_id)
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 361, in _receive_events
events = await self._read_incoming_data(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 452, in _read_incoming_data
raise exc
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_async/http2.py", line 438, in _read_incoming_data
data = await self._network_stream.read(self.READ_NUM_BYTES, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 31, in read
with map_exceptions(exc_map):
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 178, in queue_worker_process
asyncio.run(pwnedpasswords_get_store_gather(result_queue, hash_prefixes, worker_args))
File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 182, in pwnedpasswords_get_store_gather
metadata_results = await asyncio.gather(
^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 228, in pwnedpasswords_get_and_store_async
binary, metadata_latest = await pwnedpasswords_get(prefix, hash_type=hash_type, etag=etag, encoding=encoding_type)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 269, in pwnedpasswords_get
response = await httpx_binary_response(url=url, etag=etag, encoding=encoding, debug=httpx_debug)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/lib/http.py", line 53, in httpx_binary_response
response = await client.send(request=request, stream=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1617, in send
response = await self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
response = await self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
response = await self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1719, in _send_single_request
response = await transport.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 365, in handle_async_request
with map_httpcore_exceptions():
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout
^CSIGINT received, exiting.
SIGINT received, exiting.
SIGINT received, exiting.
SIGINT received, exiting.
^CSIGINT received, exiting.
Exception ignored in atexit callback: <function _exit_function at 0x7905d137ce00>
Traceback (most recent call last):
File "/usr/lib/python3.11/multiprocessing/util.py", line 360, in _exit_function
_run_finalizers()
File "/usr/lib/python3.11/multiprocessing/util.py", line 300, in _run_finalizers
finalizer()
File "/usr/lib/python3.11/multiprocessing/util.py", line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/multiprocessing/queues.py", line 199, in _finalize_join
thread.join()
File "/usr/lib/python3.11/threading.py", line 1119, in join
self._wait_for_tstate_lock()
File "/usr/lib/python3.11/threading.py", line 1139, in _wait_for_tstate_lock
if lock.acquire(block, timeout):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ndejong/test/.venv/lib/python3.11/site-packages/hibp_downloader/main.py", line 20, in sigint_handler
exit()
File "<frozen _sitebuiltins>", line 26, in __call__
SystemExit: None
Note to self
New release v0.3.0 that might resolve this issue, unfortunately it's been really hard to replicate the errors on this to chase it down - https://github.com/threatpatrols/hibp-downloader/releases/tag/0.3.0
There are two key changes that I'm expecting should move the needle on this issue though -
The code is more robust around handling worker threads when they have HTTPX exceptions so we should see less infinite loop scenarios, however clean thread exiting (eg SIGINT with ^C) is still not a thing :(
Please do send feedback so we can gauge if this resolves.
Occasional errors still occur on Python 3.11.6 (on EndeavourOS):
2024-01-24T14:50:16+0100 | DEBUG | hibp-downloader | worker_index=14 prefix='0413a' hash_type='sha1' encoding_type='gzip' ignore_etag=False local_cache_ttl=43200 start_timestamp=2024-01-24 14:50:16.201230+01:00
2024-01-24T14:50:16+0100 | DEBUG | hibp-downloader | No existing metadata file 'data2/sha1/04/13/0413a.meta' using an empty entry
2024-01-24T14:50:46+0100 | ERROR | hibp-downloader | Unable to establish connection https://api.pwnedpasswords.com/range/0413a
Debug log doesn't seem to tell me anything else about those threads that might be useful.
But the failing threads do exit cleanly now so there's no more infinite wait. And to workaround the errors once could just run it in a loop until it stops giving errors.
I can also reproduce the same errors on a different machine (in a different network) with Python 3.10.12 on Ubuntu 22.04.3 LTS (running under WSL)
Okay, that helps - are you able to provide a list of the installed Python packages in each virtual environment you have there?
Python 3.11.6:
aiofiles==23.2.1
anyio==4.2.0
certifi==2023.11.17
click==8.1.7
colorama==0.4.6
h11==0.14.0
h2==4.1.0
hibp-downloader==0.3.0
hpack==4.0.0
httpcore==1.0.2
httpx==0.26.0
hyperframe==6.0.1
idna==3.6
markdown-it-py==3.0.0
mdurl==0.1.2
Pygments==2.17.2
rich==13.7.0
shellingham==1.5.4
sniffio==1.3.0
typer==0.9.0
typing_extensions==4.9.0
Python 3.10.12:
aiofiles==23.2.1
anyio==4.2.0
certifi==2023.11.17
click==8.1.7
colorama==0.4.6
exceptiongroup==1.2.0
h11==0.14.0
h2==4.1.0
hibp-downloader==0.3.0
hpack==4.0.0
httpcore==1.0.2
httpx==0.26.0
hyperframe==6.0.1
idna==3.6
markdown-it-py==3.0.0
mdurl==0.1.2
Pygments==2.17.2
rich==13.7.0
shellingham==1.5.4
sniffio==1.3.0
typer==0.9.0
typing_extensions==4.9.0
I am also experiencing the Unable to establish
connection errors - and, from reading the logs, I assume they are not retried? 🤔
Yes, retries are not a thing in the current code - would surely help
Current thinking is that the timeouts would make sense if you consider this thing creates a lot connections which may in-turn cause an upstream network component to grok - for example a small home router may not have enough NAT state table and then intermittently show connection problems.
This line of thinking might explain why I've rarely been able to replicate this most of the time.
Could you give it another spin and reduce the --processes and --chunk-size values - if this is the underlaying issue then a feature to dynamically scale back to suit the local network conditions might be the thing needed here
New release 0.3.1 implements retries and provides CLI options to adjust the default values https://pypi.org/project/hibp-downloader/0.3.1/
pip install --upgrade hibp-downloader
Feedback would be great
Sorry for the lack of feedback! Will run some test now!
I still run into some threads crashing:
Traceback (most recent call last):
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/streams/tls.py", line 140, in _call_sslobject_method
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/ssl.py", line 921, in read
v = self._sslobj.read(len)
^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:2580)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/_core/_tasks.py", line 115, in fail_after
yield cancel_scope
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 34, in read
return await self._stream.receive(max_bytes=max_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/streams/tls.py", line 205, in receive
data = await self._call_sslobject_method(self._ssl_object.read, max_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/streams/tls.py", line 147, in _call_sslobject_method
data = await self.transport_stream.receive()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1123, in receive
await self._protocol.read_event.wait()
File "/usr/lib/python3.11/asyncio/locks.py", line 213, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 72b64bf7c550
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
yield
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 32, in read
with anyio.fail_after(timeout):
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/anyio/_core/_tasks.py", line 118, in fail_after
raise TimeoutError
TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_transports/default.py", line 67, in map_httpcore_exceptions
yield
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_transports/default.py", line 252, in __aiter__
async for part in self._httpcore_stream:
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 361, in __aiter__
async for part in self._stream:
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 582, in __aiter__
raise exc
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 572, in __aiter__
async for chunk in self._connection._receive_response_body(
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 313, in _receive_response_body
event = await self._receive_stream_event(request, stream_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 333, in _receive_stream_event
await self._receive_events(request, stream_id)
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 361, in _receive_events
events = await self._read_incoming_data(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 452, in _read_incoming_data
raise exc
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_async/http2.py", line 438, in _read_incoming_data
data = await self._network_stream.read(self.READ_NUM_BYTES, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 31, in read
with map_exceptions(exc_map):
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 193, in queue_worker_process
asyncio.run(pwnedpasswords_get_store_gather(result_queue, hash_prefixes, worker_args))
File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
2024-03-08T14:55:04+0100 | ESC[1;37mDEBUGESC[0m | hibp-downloader | Request attempt 1 of 5 for 'https://api.pwnedpasswords.com/range/0ec91'
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 197, in pwnedpasswords_get_store_gather
metadata_results = await asyncio.gather(
^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 254, in pwnedpasswords_get_and_store_async
binary, metadata_latest = await pwnedpasswords_get(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/commands/hibp_download.py", line 306, in pwnedpasswords_get
response = await httpx_binary_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/lib/http.py", line 74, in httpx_binary_response
response.binary = b"".join([part async for part in response.aiter_raw()])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/hibp_downloader/lib/http.py", line 74, in <listcomp>
response.binary = b"".join([part async for part in response.aiter_raw()])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_models.py", line 991, in aiter_raw
async for raw_stream_bytes in self.stream:
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_client.py", line 147, in __aiter__
async for chunk in self._stream:
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_transports/default.py", line 251, in __aiter__
with map_httpcore_exceptions():
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/dam/.virtualenvs/pwn/lib/python3.11/site-packages/httpx/_transports/default.py", line 84, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout
And now some retries (nice :+1: )
2024-03-08T14:53:30+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/05cbc'
2024-03-08T14:53:30+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/05cbf'
2024-03-08T14:53:51+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/075ea'
2024-03-08T14:54:14+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/0922a'
2024-03-08T14:55:04+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/0ccec'
2024-03-08T14:55:04+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/0cced'
2024-03-08T14:55:04+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/0ccf2'
2024-03-08T15:11:25+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/4e3ac'
2024-03-08T15:12:56+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/540c3'
2024-03-08T15:19:08+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/6bb9d'
2024-03-08T15:19:08+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/6bba1'
2024-03-08T15:39:30+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/b9611'
2024-03-08T15:39:30+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/b9615'
2024-03-08T15:40:41+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/bdb53'
2024-03-08T15:47:58+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/da102'
Unclear to me if that crashed thread is one of the retries...
That test was run on my laptop in my office (wifi) network using the default options. Running the same right now on my home network. Will also run a test on my home desktop (WSL).
Test from home network: No crashed threads, but more of these failed downloads:
2024-03-09T18:54:56+0100 | DEBUG | hibp-downloader | worker_index=6 prefix='d3215' hash_type='sha1' encoding_type='gzip' http_timeout=30 http_max_retries=5 http_debug=Falseignore_etag=False local_cache_ttl=43200 start_timestamp=2024-03-09 18:54:56.486975+01:00
2024-03-09T18:54:56+0100 | DEBUG | hibp-downloader | No existing metadata file 'test_home/sha1/d3/21/d3215.meta' using an empty entry
2024-03-09T18:54:56+0100 | DEBUG | hibp-downloader | Request attempt 1 of 5 for 'https://api.pwnedpasswords.com/range/d3215'
2024-03-09T18:55:06+0100 | WARNING | hibp-downloader | Request [1 of 5] failed for 'GET' 'https://api.pwnedpasswords.com/range/d3215'
2024-03-09T18:55:06+0100 | DEBUG | hibp-downloader | Request attempt 2 of 5 for 'https://api.pwnedpasswords.com/range/d3215'
But all failures succeeded on attempt 2. :+1:
Test in WSL (home network): With default options (which on this machine means 32 processes) it runs full boar for ~30-60 sec, then completely stops for ~30 sec (matching the http-timeout value). Then all the active download fails and it continues going at full speed again for another minute and then stops again and so on. Reducing processes / chunks size does improve things, but I had to go down to 8 processes and 1 chunk for getting it to not occasionally lock up. Might be some funky stuff in the WSL network stack I guess, since it worked mostly fine one the linux laptop in the same network.
I'm repeatedly getting the following errors running the downloader:
The thread managing also does not handle the threads crashing, making those threads stall. Eventually every thread crashes and the main process gets stuck waiting for the now crashed threads to close.