redpanda-data / redpanda


CI Failure (error dialing backend: No agent available) in `OMBValidationTest.test_max_connections` #18823

Closed vbotbuildovich closed 1 week ago

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/14227

Module: rptest.redpanda_cloud_tests.omb_validation_test
Class: OMBValidationTest
Method: test_max_connections
test_id:    OMBValidationTest.test_max_connections
status:     FAIL
run time:   2710.353 seconds

CalledProcessError(1, ['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cpd2mlkfak2knttpno10-agent', 'kubectl', 'exec', 'rp-cpd2mlkfak2knttpno10-5', '-n=redpanda', '-c=redpanda', '--', 'bash', '-c', '"curl -f -s -S http://localhost:9644/metrics"'], '', 'Error from server: error dialing backend: No agent available\n\x1b[31mERROR: \x1b[0mProcess exited with status 1\n\n')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 105, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/omb_validation_test.py", line 426, in test_max_connections
    self.redpanda.wait_until(target_connections_reached,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1004, in wait_until
    wait_until(wrapped,
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 997, in wrapped
    r = fn()
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/omb_validation_test.py", line 416, in target_connections_reached
    ccount = self._connection_count()
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/omb_validation_test.py", line 733, in _connection_count
    return self.redpanda.metric_sum(ACTIVE_METRIC)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2011, in metric_sum
    return self._metric_sum(metric_name, pods, metrics_endpoint, namespace,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1160, in _metric_sum
    metrics = self.metrics(n, metrics_endpoint=metrics_endpoint)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1949, in metrics
    text = self.kubectl.exec(
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 275, in exec
    return self._ssh_cmd(cmd)  # type: ignore
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 235, in _ssh_cmd
    return self._local_cmd(local_cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 215, in _local_cmd
    raise subprocess.CalledProcessError(process.returncode, cmd, s_out,
subprocess.CalledProcessError: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cpd2mlkfak2knttpno10-agent', 'kubectl', 'exec', 'rp-cpd2mlkfak2knttpno10-5', '-n=redpanda', '-c=redpanda', '--', 'bash', '-c', '"curl -f -s -S http://localhost:9644/metrics"']' returned non-zero exit status 1.

JIRA Link: CORE-3227

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/14210

travisdowns commented 3 months ago

This one is a bit different from most of the others: `Error from server: error dialing backend: No agent available`

It seems to me this error comes from GKE via kubectl, i.e., it is not referring to the Redpanda agent. It may be related to https://github.com/redpanda-data/core-internal/issues/1273.
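
If the error is indeed a transient infrastructure failure rather than a Redpanda problem, one way a test harness could tolerate it is to retry the remote command only when stderr carries that specific message. The following is a minimal sketch under that assumption, not the actual `rptest` code: `TRANSIENT_MARKERS`, `is_transient`, and `run_with_retry` are hypothetical names, and the retry count and delay are arbitrary.

```python
import subprocess
import time

# Hypothetical list: stderr substrings treated as transient infrastructure errors.
TRANSIENT_MARKERS = ("error dialing backend: No agent available",)


def is_transient(err: subprocess.CalledProcessError) -> bool:
    """Return True if the failed command's stderr contains a known transient marker."""
    stderr = err.stderr or ""
    if isinstance(stderr, bytes):
        stderr = stderr.decode(errors="replace")
    return any(marker in stderr for marker in TRANSIENT_MARKERS)


def run_with_retry(cmd, attempts=3, delay_s=5.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd and return its stdout, retrying only on transient-looking failures.

    runner and sleep are injectable so the logic can be exercised without
    spawning processes or waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = runner(cmd, check=True, capture_output=True, text=True)
            return result.stdout
        except subprocess.CalledProcessError as err:
            # Re-raise immediately on the last attempt or on any other error,
            # so genuine failures are not masked.
            if attempt == attempts or not is_transient(err):
                raise
            sleep(delay_s)
```

Matching on the exact message keeps the retry narrow: any other non-zero exit still fails the test on the first attempt.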

piyushredpanda commented 1 week ago

Closing older bot-filed CI issues as we transition to a more reliable system.