redpanda-data / redpanda


CI Failure (key symptom) in `TieredStorageCacheStressTest.streaming_cache_test` #20878

vbotbuildovich closed 1 month ago

vbotbuildovich commented 2 months ago

https://buildkite.com/redpanda/vtools/builds/15156

Module: rptest.scale_tests.tiered_storage_cache_stress_test
Class: TieredStorageCacheStressTest
Method: streaming_cache_test
Arguments: {
    "limit_mode": "objects",
    "log_segment_size": 1048576
}
test_id:    TieredStorageCacheStressTest.streaming_cache_test
status:     FAIL
run time:   972.204 seconds

RemoteCommandError({'ssh_config': {'host': 'ip-172-31-11-187', 'hostname': '172.31.11.187', 'user': 'root', 'port': 22, 'password': None, 'identityfile': '/home/ubuntu/.ssh/id_rsa'}, 'hostname': 'ip-172-31-11-187', 'ssh_hostname': '172.31.11.187', 'user': 'root', 'externally_routable_ip': '18.246.63.89', '_logger': <Logger rptest.scale_tests.tiered_storage_cache_stress_test.TieredStorageCacheStressTest.streaming_cache_test.limit_mode=LimitMode.objects.log_segment_size=1048576-671 (DEBUG)>, 'os': 'linux', '_ssh_client': <paramiko.client.SSHClient object at 0xf53c5bc42800>, '_sftp_client': <paramiko.sftp_client.SFTPClient object at 0xf53c5baebdf0>, '_custom_ssh_exception_checks': None}, 'du -s "/var/lib/redpanda/data/cloud_storage_cache"', 1, b"du: cannot access '/var/lib/redpanda/data/cloud_storage_cache/accesstime.tmp': No such file or directory\n")
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/mark/_mark.py", line 535, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 105, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/scale_tests/tiered_storage_cache_stress_test.py", line 295, in streaming_cache_test
    nodes_cache_used = self.redpanda.for_nodes(
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1483, in for_nodes
    return list(executor.map(cb, nodes))
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/scale_tests/tiered_storage_cache_stress_test.py", line 296, in <lambda>
    self.redpanda.nodes, lambda n: self._validate_node_storage(
  File "/home/ubuntu/redpanda/tests/rptest/scale_tests/tiered_storage_cache_stress_test.py", line 96, in _validate_node_storage
    node_storage = self.redpanda.node_storage(node)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 4242, in node_storage
    node.account.ssh_output(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/cluster/remoteaccount.py", line 41, in wrapper
    return method(self, *args, **kwargs)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/cluster/remoteaccount.py", line 397, in ssh_output
    raise RemoteCommandError(self, cmd, exit_status, stderr.read())
ducktape.cluster.remoteaccount.RemoteCommandError: root@ip-172-31-11-187: Command 'du -s "/var/lib/redpanda/data/cloud_storage_cache"' returned non-zero exit status 1. Remote error message: b"du: cannot access '/var/lib/redpanda/data/cloud_storage_cache/accesstime.tmp': No such file or directory\n"
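
A note on the failure mode: `du -s` exits non-zero when any entry it tries to stat is missing, and here the missing entry is `accesstime.tmp`. One plausible reading (an assumption, the issue does not state a root cause) is that the file was removed or renamed between `du` listing the directory and stat-ing the entry. The sketch below shows one way a size check could tolerate that kind of race. `dir_size_bytes` is a hypothetical helper, not part of rptest or ducktape, and it runs locally rather than over SSH. It also sums apparent file sizes, whereas `du -s` reports disk usage in blocks, so the numbers are not interchangeable.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum apparent file sizes under root, skipping files that vanish mid-walk.

    Hypothetical helper for illustration only; not the test's actual code.
    """
    total = 0
    # os.walk ignores directory-listing errors by default (onerror=None),
    # so a subdirectory that disappears mid-walk is skipped silently.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so a link counts as itself.
                total += os.lstat(path).st_size
            except FileNotFoundError:
                # The file was removed or renamed after it was listed
                # (for example a short-lived temp file); skip it.
                continue
    return total
```

The trade-off is that a tolerant walk can slightly under-count while files are churning, which may or may not be acceptable for a check like `_validate_node_storage`.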

JIRA Link: CORE-5129

vbotbuildovich commented 2 months ago

https://buildkite.com/redpanda/vtools/builds/15352

michael-redpanda commented 1 month ago

Automatically closing issue to match current state of CORE-5129