h-vetinari opened 6 months ago
So the CI in #118 passed both in the PR and after the merge (5 individual runs each), which leads me to believe that this issue is solved by:
diff --git a/python/pyarrow/tests/conftest.py b/python/pyarrow/tests/conftest.py
index 57bc3c8fc..f9396b898 100644
--- a/python/pyarrow/tests/conftest.py
+++ b/python/pyarrow/tests/conftest.py
@@ -192,7 +192,7 @@ def retry(attempts=3, delay=1.0, max_delay=None, backoff=1):
 @pytest.fixture(scope='session')
 def s3_server(s3_connection, tmpdir_factory):
-    @retry(attempts=5, delay=0.1, backoff=2)
+    @retry(attempts=10, delay=1, backoff=2)
     def minio_server_health_check(address):
         resp = urllib.request.urlopen(f"http://{address}/minio/health/cluster")
         assert resp.getcode() == 200
Would you consider merging this upstream @raulcd @assignUser @kou? Given that s3_server is only initialized once per session, I think it's not a problem if it takes a bit more time? It probably also works without the bump in attempts already; that should keep the maximum time spent on a failing fixture setup limited to 1+2+4+8+16 = 31 sec.
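To illustrate the arithmetic: a quick sketch of the worst-case setup time, where the parameter names mirror the retry signature from the diff. The helper max_backoff_time below is hypothetical and assumes the decorator sleeps after every failed attempt, multiplying the delay by backoff each time with no max_delay cap.

```python
# Hypothetical helper: total time spent sleeping if every attempt fails,
# assuming one back-off sleep per failed attempt and no max_delay cap.
def max_backoff_time(attempts, delay, backoff):
    total, current = 0.0, delay
    for _ in range(attempts):
        total += current
        current *= backoff
    return total

print(max_backoff_time(attempts=5, delay=1, backoff=2))   # 1+2+4+8+16 = 31 sec (delay bump only)
print(max_backoff_time(attempts=10, delay=1, backoff=2))  # 1023 sec worst case with the bumped attempts from the diff
```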
Yeah, looks good. I don't think there is an issue with the fix as it's only in a test and won't change anything for users :)
This has been going on for a while; when we catch a slow agent, the tests on aarch64 fail as follows:
I cannot actually see the URL, but it all seems to be S3-related, and there's the following stack trace (I haven't verified that it applies to all failures, but probably):
Usually restarting once or twice solves the issue, but it's getting annoying due to the sheer number of times it's happening.
The function minio_server_health_check already has a built-in retry/back-off, so I think it's worth trying to increase that (at least for the feedstock here).
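For context, here is a minimal sketch of what a retry/back-off decorator with that signature could look like. It matches the signature visible in the hunk header of the diff above (retry(attempts=3, delay=1.0, max_delay=None, backoff=1)), but it is an illustration under those assumptions, not the exact implementation in python/pyarrow/tests/conftest.py.

```python
import functools
import time

# Sketch of a retry decorator with exponential back-off; signature taken from
# the hunk header above, implementation details are illustrative only.
def retry(attempts=3, delay=1.0, max_delay=None, backoff=1):
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            current_delay = delay
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Re-raise once the final attempt has failed.
                    if attempt == attempts - 1:
                        raise
                    time.sleep(current_delay)
                    current_delay *= backoff
                    if max_delay is not None:
                        current_delay = min(current_delay, max_delay)
        return wrapper
    return decorate
```

With attempts=10, delay=1, backoff=2 as in the diff, a decorator along these lines would keep polling the minio health endpoint for considerably longer before giving up, which is the point of the change for slow agents.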