redpanda-data / redpanda


CI Failure (key symptom) in `BucketScrubSelfTest.test_missing_segment` #21938

Closed: vbotbuildovich closed this issue 1 month ago

vbotbuildovich commented 1 month ago

https://buildkite.com/redpanda/vtools/builds/15928

Module: rptest.tests.services_self_test
Class: BucketScrubSelfTest
Method: test_missing_segment
Arguments: {
    "cloud_storage_type": 1
}
test_id:    BucketScrubSelfTest.test_missing_segment
status:     FAIL
run time:   6.251 seconds

RemoteCommandError({'ssh_config': {'host': 'ip-172-31-9-220', 'hostname': '172.31.9.220', 'user': 'root', 'port': 22, 'password': None, 'identityfile': '/home/ubuntu/.ssh/id_rsa'}, 'hostname': 'ip-172-31-9-220', 'ssh_hostname': '172.31.9.220', 'user': 'root', 'externally_routable_ip': '34.215.120.174', '_logger': <Logger rptest.tests.services_self_test.BucketScrubSelfTest.test_missing_segment.cloud_storage_type=CloudStorageType.S3-750 (DEBUG)>, 'os': 'linux', '_ssh_client': <paramiko.client.SSHClient object at 0x7faf0988fd30>, '_sftp_client': <paramiko.sftp_client.SFTPClient object at 0x7faf0988d600>, '_custom_ssh_exception_checks': None}, 'host ip-172-31-9-220.us-west-2.compute.internal', 127, b'')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 182, in _do_run
    self.setup_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 267, in setup_test
    self.test.setup()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/test.py", line 91, in setup
    self.setUp()
  File "/home/ubuntu/redpanda/tests/rptest/tests/redpanda_test.py", line 39, in setUp
    self.__redpanda.start()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2719, in start
    self.for_nodes(to_start, start_one)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1489, in for_nodes
    return list(executor.map(cb, nodes))
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2711, in start_one
    self.start_node(node,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 3021, in start_node
    self.write_node_conf_file(
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 3947, in write_node_conf_file
    fqdn = self.get_node_fqdn(node)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 3883, in get_node_fqdn
    fqdn = node.account.ssh_output(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/cluster/remoteaccount.py", line 41, in wrapper
    return method(self, *args, **kwargs)
ducktape.cluster.remoteaccount.RemoteCommandError: root@ip-172-31-9-220: Command 'host ip-172-31-9-220.us-west-2.compute.internal' returned non-zero exit status 127.
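
Note on the exit status: by POSIX shell convention, 127 means the command itself was not found, so the failure above suggests the `host` binary was missing on the node rather than that the DNS lookup failed. A minimal sketch of how a caller could tell these cases apart follows. The helper names `classify_exit_status` and `run_and_classify` are hypothetical and not part of rptest or ducktape, and the sketch runs the command through the local shell rather than over SSH.

```python
import subprocess

# Reserved exit statuses defined by POSIX shells.
COMMAND_NOT_FOUND = 127       # the shell could not find the command
COMMAND_NOT_EXECUTABLE = 126  # the command was found but could not be executed


def classify_exit_status(status: int) -> str:
    """Map a shell exit status to a coarse failure category (hypothetical helper)."""
    if status == 0:
        return "ok"
    if status == COMMAND_NOT_FOUND:
        return "command-not-found"
    if status == COMMAND_NOT_EXECUTABLE:
        return "command-not-executable"
    # Any other status means the command ran and reported its own failure.
    return "command-failed"


def run_and_classify(command: str) -> str:
    """Run a command through the local shell and classify its exit status."""
    completed = subprocess.run(command, shell=True, capture_output=True)
    return classify_exit_status(completed.returncode)


if __name__ == "__main__":
    # Mirrors the failing lookup, run locally instead of on a test node.
    print(run_and_classify("host localhost"))
```

A "command-not-found" result points at the node image or its installed packages (an infra problem), while "command-failed" would point at the lookup itself.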

JIRA Link: CORE-6091

rpdevmp commented 1 month ago

Duplicate of #21624. That GitHub issue has the details on the many failing tests and the steps to avoid similar cases in the future, where many issues were opened for the same infra error.

vbotbuildovich commented 1 month ago

https://buildkite.com/redpanda/vtools/builds/15980
https://buildkite.com/redpanda/vtools/builds/16003
https://buildkite.com/redpanda/vtools/builds/16016

vbotbuildovich commented 1 month ago

https://buildkite.com/redpanda/vtools/builds/16030

rpdevmp commented 1 month ago

Duplicate of #21624. That GitHub issue has the details on the many failing tests and the steps to avoid similar cases in the future, where many CI failures were opened for the same infra error. The automatic GitHub updates and the AWS FIPS run should be fixed by https://github.com/redpanda-data/vtools/pull/3018