redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

CI Failure (timeout while reading debug/controller_status endpoint) in `ControllerSnapshotPolicyTest.test_upgrade_auto_enable` #18611

Open vbotbuildovich opened 3 months ago

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/13769

Module: rptest.tests.controller_snapshot_test
Class: ControllerSnapshotPolicyTest
Method: test_upgrade_auto_enable
test_id:    ControllerSnapshotPolicyTest.test_upgrade_auto_enable
status:     FAIL
run time:   176.638 seconds

ConnectionError(MaxRetryError('HTTPConnectionPool(host=\'ducktape-node-16-absolutely-meet-mackerel\', port=9644): Max retries exceeded with url: /v1/debug/controller_status (Caused by ReadTimeoutError("HTTPConnectionPool(host=\'ducktape-node-16-absolutely-meet-mackerel\', port=9644): Read timed out. (read timeout=30)"))'))
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='ducktape-node-16-absolutely-meet-mackerel', port=9644): Read timed out. (read timeout=30)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 446, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='ducktape-node-16-absolutely-meet-mackerel', port=9644): Max retries exceeded with url: /v1/debug/controller_status (Caused by ReadTimeoutError("HTTPConnectionPool(host='ducktape-node-16-absolutely-meet-mackerel', port=9644): Read timed out. (read timeout=30)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 103, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/controller_snapshot_test.py", line 102, in test_upgrade_auto_enable
    self.redpanda.wait_for_controller_snapshot(n)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 4422, in wait_for_controller_snapshot
    return wait_until_result(check, timeout_sec=30, backoff_sec=1)
  File "/home/ubuntu/redpanda/tests/rptest/util.py", line 94, in wait_until_result
    wait_until(wrapped_condition, *args, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/home/ubuntu/redpanda/tests/rptest/util.py", line 81, in wrapped_condition
    cond = condition()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 4414, in check
    controller_status = self._admin.get_controller_status(node)
  File "/home/ubuntu/redpanda/tests/rptest/services/admin.py", line 1230, in get_controller_status
    return self._request("GET", f"debug/controller_status",
  File "/home/ubuntu/redpanda/tests/rptest/services/admin.py", line 545, in _request
    r = self._session.request(verb, url, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='ducktape-node-16-absolutely-meet-mackerel', port=9644): Max retries exceeded with url: /v1/debug/controller_status (Caused by ReadTimeoutError("HTTPConnectionPool(host='ducktape-node-16-absolutely-meet-mackerel', port=9644): Read timed out. (read timeout=30)"))
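For context, the trace shows two 30-second limits stacked on each other: the admin request's read timeout (`read timeout=30`) and the `wait_until_result(check, timeout_sec=30, backoff_sec=1)` budget. In this run the exception raised inside `check` propagated straight out of the wait (the `raise e` frame in `wait_until`). Below is a minimal sketch, not the project's fix, of a polling loop that treats a timed-out probe as "not ready yet". `poll_until_result` and its parameters are hypothetical names, and only the standard library is used.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def poll_until_result(
    check: Callable[[], Optional[T]],
    timeout_sec: float,
    backoff_sec: float,
    retryable: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
) -> T:
    """Call check() until it returns a non-None value or the deadline passes.

    Exceptions listed in `retryable` count as "not ready yet" instead of
    failing the wait immediately. Hypothetical helper, not part of rptest.
    """
    deadline = time.monotonic() + timeout_sec
    last_error: Optional[BaseException] = None
    while True:
        try:
            result = check()
            if result is not None:
                return result
        except retryable as e:
            # Remember the most recent transient failure for the final error.
            last_error = e
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout_sec}s") from last_error
        time.sleep(backoff_sec)
```

With `requests`, the built-in `ConnectionError` would not match; `requests.exceptions.ConnectionError` and `requests.exceptions.Timeout` would have to be passed in `retryable`. Retrying also only helps if the per-request timeout is well below the overall budget: with both at 30 seconds, one hung request uses up the whole window.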

JIRA Link: CORE-3054

travisdowns commented 2 months ago

This may not be controller-related; similar HTTP timeouts show up in a number of other issues.
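To compare this against the other reports, it can help to pull the host, endpoint, and timeout out of the error text so failures can be grouped by endpoint. A rough sketch, assuming the message has the same shape as the `requests.exceptions.ConnectionError` line in the trace above; `HttpTimeout` and `parse_http_timeout` are made-up names for illustration.

```python
import re
from typing import NamedTuple, Optional


class HttpTimeout(NamedTuple):
    host: str
    port: int
    url: str
    cause: str
    read_timeout_sec: float


# Matches the unescaped form of the message, as printed on the final
# "requests.exceptions.ConnectionError: ..." line of the traceback.
_PATTERN = re.compile(
    r"host='(?P<host>[^']+)', port=(?P<port>\d+)\): "
    r"Max retries exceeded with url: (?P<url>\S+) "
    r"\(Caused by (?P<cause>\w+)\(.*?read timeout=(?P<timeout>[\d.]+)\)"
)


def parse_http_timeout(message: str) -> Optional[HttpTimeout]:
    """Return the parsed fields, or None if the message has another shape."""
    m = _PATTERN.search(message)
    if m is None:
        return None
    return HttpTimeout(
        host=m.group("host"),
        port=int(m.group("port")),
        url=m.group("url"),
        cause=m.group("cause"),
        read_timeout_sec=float(m.group("timeout")),
    )
```

Grouping on `(url, cause)` across issues would show whether the timeouts cluster on `debug/controller_status` or are spread over unrelated admin endpoints.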