juledwar / soufi

Source finder CLI and API
Apache License 2.0

Yum finder is not catching errors properly #21

Closed juledwar closed 2 years ago

juledwar commented 2 years ago

The yum finder forks off a process to limit the memory leakage from repomd. However, the get_repomd function only catches urllib.error.HTTPError, so any other exception kills the child process with a traceback, while the other side of the process pipe assumes 100% success.

def get_repomd(queue, url):
    if not url.endswith('/'):
        url += '/'
    try:
        repo = repomd.load(url)
    except urllib.error.HTTPError:
        queue.put((None, None))
        return

This should catch Exception instead, to catch everything.

juledwar commented 2 years ago

Looking at it some more, it needs to communicate failure states back to the other side. If there's no response, it just assumes there's no package. It needs to raise an explicit failure.
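A minimal sketch of what that could look like, assuming the child always puts exactly one tagged message on the queue and the parent turns anything other than success into an exception. The names RepoLoadError, unwrap, load_in_subprocess and the loader parameter are hypothetical and not part of soufi; loader stands in for repomd.load, and whatever it returns has to be picklable to cross the queue.

```python
import multiprocessing
import queue as queue_module


class RepoLoadError(Exception):
    """Raised on the parent side when the child reports a failure."""


def get_repomd(result_queue, url, loader):
    """Child side: always put exactly one (status, payload) message."""
    if not url.endswith('/'):
        url += '/'
    try:
        repo = loader(url)  # stand-in for repomd.load(url)
    except Exception as exc:  # catch everything, not just HTTPError
        result_queue.put(('error', f'{type(exc).__name__}: {exc}'))
        return
    result_queue.put(('ok', repo))


def unwrap(message):
    """Parent side: turn a reported failure into an explicit exception."""
    status, payload = message
    if status != 'ok':
        raise RepoLoadError(payload)
    return payload


def load_in_subprocess(url, loader, timeout=60):
    """Run loader(url) in a child process; return its result or raise."""
    result_queue = multiprocessing.Queue()
    proc = multiprocessing.Process(
        target=get_repomd, args=(result_queue, url, loader))
    proc.start()
    try:
        # Read before joining so a large payload cannot block the child.
        message = result_queue.get(timeout=timeout)
    except queue_module.Empty:
        # No message at all: the child hung or died without reporting.
        proc.terminate()
        raise RepoLoadError(f'no response from child process for {url}')
    finally:
        proc.join()
    return unwrap(message)
```

With this shape, "no package" and "the fetch blew up" are different outcomes for the caller: the first can still be an ordinary return value, while the second surfaces as RepoLoadError instead of being read as an empty result.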

juledwar commented 2 years ago

I can get an error every time with:

soufi centos cracklib-dicts 2.9.0-11.el7

which usually results in the mirror server telling me to go forth and multiply:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='mirror.centos.org', port=80): Max retries exceeded with url: /centos/8.1.1911/os/x86_64/os/repodata/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f71cd105220>: Failed to establish a new connection: [Errno 111] Connection refused'))

This took around 6 minutes to get to this point from issuing the command.

juledwar commented 2 years ago

Anyway, there's a second problem highlighted here. The default caching that the CLI provides is not enough to avoid the wrath of the server bots. @0xDEC0DE, I think you might need to get rid of the HEAD stuff and just attempt a straight fetch every time to reduce the querying.
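To illustrate the difference in request volume, here is a rough sketch that compares a HEAD-then-GET probe with a single GET. It does not reflect how soufi actually issues its requests: fetch_with_head, fetch_direct and CountingSession are hypothetical names, and the session object is only assumed to expose head() and get() returning something with status_code and content, in the style of a requests session.

```python
class FakeResponse:
    """Minimal response object with the two attributes the sketch reads."""

    def __init__(self, status_code, content=b''):
        self.status_code = status_code
        self.content = content


class CountingSession:
    """Fake session that serves a fixed set of URLs and counts requests."""

    def __init__(self, pages):
        self.pages = pages
        self.requests_made = 0

    def head(self, url):
        self.requests_made += 1
        return FakeResponse(200 if url in self.pages else 404)

    def get(self, url):
        self.requests_made += 1
        if url in self.pages:
            return FakeResponse(200, self.pages[url])
        return FakeResponse(404)


def fetch_with_head(session, url):
    """Probe with HEAD first, then GET: two requests for every hit."""
    if session.head(url).status_code != 200:
        return None
    return session.get(url).content


def fetch_direct(session, url):
    """Single GET; any non-200 answer is treated as 'not there'."""
    response = session.get(url)
    if response.status_code != 200:
        return None
    return response.content


pages = {'http://example.invalid/pkg.src.rpm': b'payload'}

probing = CountingSession(pages)
fetch_with_head(probing, 'http://example.invalid/pkg.src.rpm')

direct = CountingSession(pages)
fetch_direct(direct, 'http://example.invalid/pkg.src.rpm')

print(probing.requests_made, direct.requests_made)
```

For a URL that exists, the probing variant costs two requests and the direct one costs one; for a missing URL both cost one, so dropping the probe never increases the request count in this model.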

0xDEC0DE commented 2 years ago

Addressed in PR #22