RHSecurityCompliance / contest

Content Testing for ComplianceAsCode/content

Retry osbuild depsolve several times #222

Closed comps closed 3 months ago

comps commented 3 months ago

Currently, the /hardening/image-builder tests fail a lot, and sometimes even 3x in a row (original run + 2 reruns). This is mainly due to two issues:

  1. composer-cli blueprints depsolve returning ERROR: Depsolve Error: Get "http://localhost/api/v1/blueprints/depsolve/contest_blueprint": EOF
  2. composer-cli compose start returning ERROR: ComposePushErrored: No worker for arch 'x86_64' available

The first one seems to go away after a few retries; the second does not (tested). It's possible that a full service restart would help there, but compose start runs so late in the test that any retry logic would basically loop the entire test, at which point we might as well let the test fail.

So this PR is not a full fix for Image Builder's instability, but it's better than nothing and eliminates approx. half the random errors.

I was considering putting retry() into lib.util, but decided not to in the end - it's too simple a function, with limited use elsewhere. If we ever need it elsewhere, perhaps with a sleep between retries, then sure.
Maybe library-ify the timeout logic in lib.virt (waiting for ssh, etc.) and put it into the same .py.
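
For illustration only, here is a minimal sketch of what a more general retry helper with an optional sleep between attempts could look like. It is not the code from this PR: the signature, the parameter names (attempts, delay, exceptions) and the depsolve() wrapper are all assumptions made up for the example.

```python
import subprocess
import time


def retry(func, attempts=3, delay=0.0, exceptions=(Exception,)):
    """Call func() up to `attempts` times and return its first successful result.

    Hypothetical helper, not the retry() from this PR. Only the exception
    types listed in `exceptions` trigger another attempt; the last failure
    is re-raised to the caller unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                # out of attempts, let the caller see the original error
                raise
            if delay > 0:
                time.sleep(delay)


def depsolve(blueprint_name):
    """Run 'composer-cli blueprints depsolve', raising on a non-zero exit.

    Hypothetical wrapper; it assumes the CLI signals the EOF error through
    its exit code.
    """
    subprocess.run(
        ['composer-cli', 'blueprints', 'depsolve', blueprint_name],
        check=True,
    )
```

Under those assumptions, the depsolve step could be wrapped as retry(lambda: depsolve('contest_blueprint'), attempts=5, delay=2, exceptions=(subprocess.CalledProcessError,)), so that only a failed CLI call is retried and any other error still fails the test immediately.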


PS: It's possible that restarting osbuild-local-worker.{socket,service} could fix it, but I have really bad memories from trying to use osbuild services in any way other than the officially documented one - starting/stopping sockets/services like this led to some corruption of the composer, as far as I remember, and then it wouldn't ever start.

comps commented 3 months ago

Updated, tested on image-builder to make sure the retry works:

2024-07-09 15:37:30 test.py:14: lib.osbuild.composer_cli:410: running: composer-cli blueprints push /tmp/tmp_erkpb7n
2024-07-09 15:37:30 test.py:14: lib.osbuild.composer_cli:410: running: composer-cli blueprints depsolve contest_blueprint
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/repomd.xml HTTP/1.1" 200 -
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/618e941c41fa778ae0ee2967d2f0086a47fe5074849d9788f30b4b1b1ee200b4-primary.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/274f263f29055322e86b4f480ccef1705ef0d7409a478bdf7c0f4f263b8e5837-filelists.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/repomd.xml HTTP/1.1" 200 -
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/618e941c41fa778ae0ee2967d2f0086a47fe5074849d9788f30b4b1b1ee200b4-primary.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:39 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/274f263f29055322e86b4f480ccef1705ef0d7409a478bdf7c0f4f263b8e5837-filelists.xml.gz HTTP/1.1" 200 -
ERROR: Depsolve Error: Get "http://localhost/api/v1/blueprints/depsolve/contest_blueprint": EOF
2024-07-09 15:37:43 test.py:14: lib.osbuild.composer_cli:410: running: composer-cli blueprints depsolve contest_blueprint
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/repomd.xml HTTP/1.1" 200 -
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/618e941c41fa778ae0ee2967d2f0086a47fe5074849d9788f30b4b1b1ee200b4-primary.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/274f263f29055322e86b4f480ccef1705ef0d7409a478bdf7c0f4f263b8e5837-filelists.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/repomd.xml HTTP/1.1" 200 -
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/618e941c41fa778ae0ee2967d2f0086a47fe5074849d9788f30b4b1b1ee200b4-primary.xml.gz HTTP/1.1" 200 -
2024-07-09 15:37:52 threading.py:937: lib.util.httpsrv._BackgroundHTTPServerHandler.log_message:73: 127.0.0.1:8091: "GET /repo/repodata/274f263f29055322e86b4f480ccef1705ef0d7409a478bdf7c0f4f263b8e5837-filelists.xml.gz HTTP/1.1" 200 -
blueprint: contest_blueprint v0.1.74
    aide-0.16-100.el9.x86_64
    checkpolicy-3.6-1.el9.x86_64
    1:libmicrohttpd-0.9.72-5.el9.x86_64
...