redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

CI Failure (key symptom) in `RedpandaCloudSelfTest.test_healthy` #21439

Open vbotbuildovich opened 1 month ago

vbotbuildovich commented 1 month ago

https://buildkite.com/redpanda/vtools/builds/15652

Module: rptest.redpanda_cloud_tests.cloud_self_test
Class: RedpandaCloudSelfTest
Method: test_healthy
test_id:    RedpandaCloudSelfTest.test_healthy
status:     FAIL
run time:   181.844 seconds

RpkException('command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id=cqb3ig7on5gi3v6ef6k0 -v returned 1, output: ', '09:11:38.672  DEBUG  logging in using client credential flow\n09:11:38.672  DEBUG  Your existing auth token is still valid, avoiding re-authentication.\n09:11:38.680  DEBUG  looking for existing byoc plugin  {"exists": false}\n09:11:38.680  DEBUG  requesting GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters/cqb3ig7on5gi3v6ef6k0\n09:11:38.899  DEBUG  got response for GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters/cqb3ig7on5gi3v6ef6k0: 200 OK\n09:11:38.899  DEBUG  requesting GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240716082057\n09:11:39.124  DEBUG  got response for GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240716082057: 200 OK\n09:11:39.175  DEBUG  downloading byoc plugin  {"version": "24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d"}\n09:11:39.175  DEBUG  requesting GET https://dl.redpanda.com/public/rpk-plugins-integration/raw/names/byoc-linux-amd64/versions/24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d/byoc-linux-amd64-24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d.tar.gz\n09:11:39.853  DEBUG  got response for GET https://dl.redpanda.com/public/rpk-plugins-integration/raw/names/byoc-linux-amd64/versions/24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d/byoc-linux-amd64-24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d.tar.gz: 200 OK\n09:11:40.968  DEBUG  writing byoc plugin to /home/ubuntu/.local/bin\n2024-07-16T09:11:42.885Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\taws/apply.go:114\tReconciling agent infrastructure...\n2024-07-16T09:11:43.072Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:194\tRunning apply\t{"provisioner": 
"redpanda-bootstrap"}\n2024-07-16T09:11:50.861Z\t\x1b[33mWARN\x1b[0m\t.rpk.managed-byoc.bootstrap\taws/bootstrap.go:114\tapply failed, attempting to re-construct local state\t{"error": "failed running terraform apply: exit status 1\\n\\nError: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.\\n\\tstatus code: 409, request id: V3D4VR0Q516X22QG, host id: pktTFeten1UhEjLxuKmwBge2FZm654x9kACEk3jamkGymxdXsDtWCkjn7E3iT4V9hNXSEO59maYma8HsjaiFwA==\\n\\n  with aws_s3_bucket.management[0],\\n  on main.tf line 35, in resource \\"aws_s3_bucket\\" \\"management\\":\\n  35: resource \\"aws_s3_bucket\\" \\"management\\" {\\n\\n\\nError: creating Amazon DynamoDB Table (rp-381492114165-us-west-2-mgmt-tflock): ResourceInUseException: Table already exists: rp-381492114165-us-west-2-mgmt-tflock\\n\\n  with aws_dynamodb_table.terraform_locks[0],\\n  on main.tf line 91, in resource \\"aws_dynamodb_table\\" \\"terraform_locks\\":\\n  91: resource \\"aws_dynamodb_table\\" \\"terraform_locks\\" {\\n\\n"}\n2024-07-16T09:12:05.137Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:204\tFinished apply\t{"provisioner": "redpanda-bootstrap"}\n2024-07-16T09:12:05.138Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:194\tRunning apply\t{"provisioner": "redpanda-network"}\n2024-07-16T09:14:21.623Z\t\x1b[31mERROR\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:197\tFailed to apply provisioner\t{"provisioner": "redpanda-network", "error": "failed running terraform apply: exit status 1\\n\\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\\n\\tstatus code: 400, request id: 8c4b8948-a5b2-47b6-9bee-a4e751755d78\\n\\n  with module.network[0].aws_vpc_endpoint.s3,\\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource \\"aws_vpc_endpoint\\" \\"s3\\":\\n  
41: resource \\"aws_vpc_endpoint\\" \\"s3\\" {\\n\\n"}\nFailed to apply provisioners: failed to apply provisioner redpanda-network: failed running terraform apply: exit status 1\n\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n\tstatus code: 400, request id: 8c4b8948-a5b2-47b6-9bee-a4e751755d78\n\n  with module.network[0].aws_vpc_endpoint.s3,\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource "aws_vpc_endpoint" "s3":\n  41: resource "aws_vpc_endpoint" "s3" {\n\n\n', 1, '')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 177, in _do_run
    self.test = self.test_context.cls(self.test_context)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/cloud_self_test.py", line 24, in __init__
    super().__init__(test_context, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/redpanda_cloud_test.py", line 25, in __init__
    self.redpanda = make_redpanda_cloud_service(test_context)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 5093, in make_redpanda_cloud_service
    return RedpandaServiceCloud(context,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1695, in __init__
    cluster_id = self._cloud_cluster.create(superuser=self._superuser)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 926, in create
    self._create_new_cluster()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 700, in _create_new_cluster
    self.utils.rpk_cloud_apply(_cluster_id)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 136, in rpk_cloud_apply
    out = self._exec(cmd, timeout=1800)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 107, in _exec
    return self.rpk._execute(cmd, env=self.env, timeout=timeout)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 1152, in _execute
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id=cqb3ig7on5gi3v6ef6k0 -v returned 1, output: ; stderr: 09:11:38.672  DEBUG  logging in using client credential flow
09:11:38.672  DEBUG  Your existing auth token is still valid, avoiding re-authentication.
09:11:38.680  DEBUG  looking for existing byoc plugin  {"exists": false}
09:11:38.680  DEBUG  requesting GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters/cqb3ig7on5gi3v6ef6k0
09:11:38.899  DEBUG  got response for GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters/cqb3ig7on5gi3v6ef6k0: 200 OK
09:11:38.899  DEBUG  requesting GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240716082057
09:11:39.124  DEBUG  got response for GET https://cloud-api.ign.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240716082057: 200 OK
09:11:39.175  DEBUG  downloading byoc plugin  {"version": "24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d"}
09:11:39.175  DEBUG  requesting GET https://dl.redpanda.com/public/rpk-plugins-integration/raw/names/byoc-linux-amd64/versions/24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d/byoc-linux-amd64-24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d.tar.gz
09:11:39.853  DEBUG  got response for GET https://dl.redpanda.com/public/rpk-plugins-integration/raw/names/byoc-linux-amd64/versions/24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d/byoc-linux-amd64-24.1.20240716082057-sha256.3dd24c68e220a8023eb8505f75af2e7d.tar.gz: 200 OK
09:11:40.968  DEBUG  writing byoc plugin to /home/ubuntu/.local/bin
2024-07-16T09:11:42.885Z    INFO   .rpk.managed-byoc   aws/apply.go:114    Reconciling agent infrastructure...
2024-07-16T09:11:43.072Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-bootstrap"}
2024-07-16T09:11:50.861Z    WARN   .rpk.managed-byoc.bootstrap aws/bootstrap.go:114    apply failed, attempting to re-construct local state    {"error": "failed running terraform apply: exit status 1\n\nError: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.\n\tstatus code: 409, request id: V3D4VR0Q516X22QG, host id: pktTFeten1UhEjLxuKmwBge2FZm654x9kACEk3jamkGymxdXsDtWCkjn7E3iT4V9hNXSEO59maYma8HsjaiFwA==\n\n  with aws_s3_bucket.management[0],\n  on main.tf line 35, in resource \"aws_s3_bucket\" \"management\":\n  35: resource \"aws_s3_bucket\" \"management\" {\n\n\nError: creating Amazon DynamoDB Table (rp-381492114165-us-west-2-mgmt-tflock): ResourceInUseException: Table already exists: rp-381492114165-us-west-2-mgmt-tflock\n\n  with aws_dynamodb_table.terraform_locks[0],\n  on main.tf line 91, in resource \"aws_dynamodb_table\" \"terraform_locks\":\n  91: resource \"aws_dynamodb_table\" \"terraform_locks\" {\n\n"}
2024-07-16T09:12:05.137Z    INFO   .rpk.managed-byoc   cli/cli.go:204  Finished apply  {"provisioner": "redpanda-bootstrap"}
2024-07-16T09:12:05.138Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-network"}
2024-07-16T09:14:21.623Z    ERROR  .rpk.managed-byoc   cli/cli.go:197  Failed to apply provisioner {"provisioner": "redpanda-network", "error": "failed running terraform apply: exit status 1\n\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n\tstatus code: 400, request id: 8c4b8948-a5b2-47b6-9bee-a4e751755d78\n\n  with module.network[0].aws_vpc_endpoint.s3,\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource \"aws_vpc_endpoint\" \"s3\":\n  41: resource \"aws_vpc_endpoint\" \"s3\" {\n\n"}
Failed to apply provisioners: failed to apply provisioner redpanda-network: failed running terraform apply: exit status 1

Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.
    status code: 400, request id: 8c4b8948-a5b2-47b6-9bee-a4e751755d78

  with module.network[0].aws_vpc_endpoint.s3,
  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource "aws_vpc_endpoint" "s3":
  41: resource "aws_vpc_endpoint" "s3" {

; returncode: 1>

JIRA Link: CORE-5649
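
For triage, the stderr above contains two kinds of Terraform errors: the "already exists" errors in the WARN line from `redpanda-bootstrap`, after which the apply carried on, and the `VpcEndpointLimitExceeded` error from `redpanda-network`, which ended the run. Below is a minimal sketch of how that stderr could be summarized. The regex, the `ALREADY_EXISTS` set, and the `summarize` helper are hypothetical names written for this report, not part of rptest or rpk, and treating the already-exists codes as benign is an assumption drawn only from this log.

```python
import re

# Matches Terraform-style lines such as
#   "Error: creating EC2 VPC Endpoint (<name>): <Code>: <message>"
# The pattern is an assumption based on the log lines in this report.
TF_ERROR = re.compile(
    r"Error: creating (?P<resource>.+?) \((?P<name>[^)]*)\): "
    r"(?P<code>[A-Za-z]+): (?P<message>[^\n\\]+)"
)

# Codes that only say the resource is already there. Treating them as benign
# is an assumption: in this log the bootstrap provisioner continued after them.
ALREADY_EXISTS = {"BucketAlreadyOwnedByYou", "ResourceInUseException"}


def summarize(stderr: str) -> list[tuple[str, str, bool]]:
    """Return unique (error code, resource name, is_benign) tuples, first seen first."""
    seen: dict[tuple[str, str], bool] = {}
    for m in TF_ERROR.finditer(stderr):
        # The same error is logged more than once, so de-duplicate on code + name.
        seen.setdefault((m["code"], m["name"]), m["code"] in ALREADY_EXISTS)
    return [(code, name, benign) for (code, name), benign in seen.items()]


# Two lines copied from the stderr in this report.
sample = (
    "Error: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): "
    "BucketAlreadyOwnedByYou: Your previous request to create the named bucket "
    "succeeded and you already own it.\n"
    "Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): "
    "VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n"
)

for code, name, benign in summarize(sample):
    print(code, name, "benign" if benign else "fatal")
```

On this sample only `VpcEndpointLimitExceeded` is flagged as fatal, which matches the final `Failed to apply provisioners` message in the traceback.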

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/15824

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/15849

rpdevmp commented 1 month ago

Based on the Buildkite errors, this is a duplicate of https://github.com/redpanda-data/redpanda/issues/21448, which already has a fix; the PRs under review are linked in its comments.

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/15922
* https://buildkite.com/redpanda/vtools/builds/15923
* https://buildkite.com/redpanda/vtools/builds/15927
* https://buildkite.com/redpanda/vtools/builds/15951
* https://buildkite.com/redpanda/vtools/builds/15975
* https://buildkite.com/redpanda/vtools/builds/15976

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/15999

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16012

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16025

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16051
* https://buildkite.com/redpanda/vtools/builds/16058

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16073
* https://buildkite.com/redpanda/vtools/builds/16072
* https://buildkite.com/redpanda/vtools/builds/16077

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16095
* https://buildkite.com/redpanda/vtools/builds/16096
* https://buildkite.com/redpanda/vtools/builds/16100

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16113
* https://buildkite.com/redpanda/vtools/builds/16114

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16142

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16165

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16179

vbotbuildovich commented 1 month ago

* https://buildkite.com/redpanda/vtools/builds/16217
* https://buildkite.com/redpanda/vtools/builds/16224