redpanda-data / redpanda


CI Failure ('_logs.find(cfg.ntp()) == _logs.end()' cannot double register same ntp) in `NodesDecommissioningTest.test_decommissioning_and_upgrade` #21425

Closed vbotbuildovich closed 1 month ago

vbotbuildovich commented 1 month ago

https://buildkite.com/redpanda/redpanda/builds/51502

Module: rptest.tests.nodes_decommissioning_test
Class: NodesDecommissioningTest
Method: test_decommissioning_and_upgrade
test_id:    NodesDecommissioningTest.test_decommissioning_and_upgrade
status:     FAIL
run time:   123.603 seconds

ConnectionError(MaxRetryError("HTTPConnectionPool(host='docker-rp-30', port=9644): Max retries exceeded with url: /v1/brokers/1 (Caused by ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))"))
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 446, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='docker-rp-30', port=9644): Max retries exceeded with url: /v1/brokers/1 (Caused by ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 105, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/nodes_decommissioning_test.py", line 925, in test_decommissioning_and_upgrade
    for v in self.upgrade_through_versions(versions_in=versions,
  File "/root/tests/rptest/tests/redpanda_test.py", line 259, in upgrade_through_versions
    self.redpanda.rolling_restart_nodes(
  File "/root/tests/rptest/services/redpanda.py", line 1463, in rolling_restart_nodes
    restarter.restart_nodes(nodes,
  File "/root/tests/rptest/services/rolling_restarter.py", line 89, in restart_nodes
    wait_until(lambda: has_drained_leaders(node),
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/root/tests/rptest/services/rolling_restarter.py", line 89, in <lambda>
    wait_until(lambda: has_drained_leaders(node),
  File "/root/tests/rptest/services/rolling_restarter.py", line 43, in has_drained_leaders
    broker_resp = admin.get_broker(node_id, node=node)
  File "/root/tests/rptest/services/admin.py", line 866, in get_broker
    return self._request('get', f"brokers/{id}", node=node).json()
  File "/root/tests/rptest/services/admin.py", line 640, in _request
    r = self._session.request(verb, url, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='docker-rp-30', port=9644): Max retries exceeded with url: /v1/brokers/1 (Caused by ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))

JIRA Link: CORE-5632
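
The last traceback above shows the `ConnectionError` escaping from the polled condition (`has_drained_leaders`) through `wait_until`, so the test fails on the first failed admin request rather than on a timeout. Below is a minimal sketch of that behaviour, assuming a simplified polling helper. `poll_until`, `make_flaky_condition` and `run_example` are hypothetical names for illustration, not the ducktape or rptest implementation.

```python
import time


def poll_until(condition, timeout_sec, backoff_sec=0.01):
    """Call condition() until it returns True or the timeout expires.

    Hypothetical stand-in for a wait_until-style helper: exceptions raised
    by condition() are not caught here, so they propagate to the caller.
    """
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(backoff_sec)
    raise TimeoutError(f"condition not met within {timeout_sec}s")


def make_flaky_condition(fail_on_call):
    """Build a condition that raises ConnectionError on the given call number."""
    calls = {"n": 0}

    def condition():
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            # Simulates the admin endpoint resetting the connection,
            # for example because the broker process has just crashed.
            raise ConnectionError("Connection reset by peer")
        return False  # leadership has not drained yet

    return condition, calls


def run_example():
    """Poll a condition that fails on its third call and report what happened."""
    condition, calls = make_flaky_condition(fail_on_call=3)
    try:
        poll_until(condition, timeout_sec=5)
    except ConnectionError as exc:
        return type(exc).__name__, calls["n"]
    return None, calls["n"]
```

In this sketch the caller sees the connection error from the third poll well before the 5-second timeout. That matches the shape of the failure above, where the reported error is the reset connection to port 9644 and not a drain timeout.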

dotnwat commented 1 month ago

ERROR 2024-07-15 07:20:12,263 [shard 0:main] assert - Assert failure: (/var/lib/buildkite-agent/builds/buildkite-arm64-builders-i-00c6dae5ee3ecc944-1/redpanda/redpanda/src/v/storage/log_manager.cc:533) '_logs.find(cfg.ntp()) == _logs.end()' cannot double register same ntp
ERROR 2024-07-15 07:20:12,263 [shard 0:main] assert - Backtrace below:
0x8338f1b 0x72ad28b 0x72ab1e3 0x4ff0eeb 0x5b7ae0f 0x5b75afb 0x5b7306f 0x5b992ff 0x2e4003b 0x80f0637 0x80f2ceb 0x80f0ebf 0x80141c7 0x8012d1b 0x2d27f4b 0x8383d2f /opt/redpanda/lib/libc.so.6+0x30a1b /opt/redpanda/lib/libc.so.6+0x30afb 0x2d20e6f
   --------
   seastar::internal::coroutine_traits_base<void>::promise_type
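
The assertion message in the log above states an invariant: a given ntp may be registered in the `_logs` map only once. Below is a rough Python sketch of that invariant, assuming an ntp is a (namespace, topic, partition) triple and the registry is a plain dictionary. `Ntp`, `LogRegistry` and `manage` are hypothetical illustration names, not Redpanda's C++ `log_manager` code.

```python
from typing import Dict, NamedTuple


class Ntp(NamedTuple):
    """Assumed shape of an ntp: namespace, topic, partition."""
    namespace: str
    topic: str
    partition: int


class LogRegistry:
    """Hypothetical registry that refuses to register the same ntp twice."""

    def __init__(self) -> None:
        self._logs: Dict[Ntp, str] = {}

    def manage(self, ntp: Ntp, log_dir: str) -> None:
        # Mirrors the spirit of the failed check: the key must be absent.
        if ntp in self._logs:
            raise AssertionError("cannot double register same ntp")
        self._logs[ntp] = log_dir

    def remove(self, ntp: Ntp) -> None:
        # Once an ntp is removed, it can be registered again.
        self._logs.pop(ntp, None)
```

Under these assumptions, a second `manage` call for the same ntp without an intervening `remove` raises, which is the kind of sequence the assert in the log guards against.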

michael-redpanda commented 1 month ago

Automatically closing issue to match current state of CORE-5632