redpanda-data / redpanda


CI Failure (key symptom) in `OMBValidationTest.test_max_partitions` #21441

Closed vbotbuildovich closed 1 month ago

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/15646 https://buildkite.com/redpanda/vtools/builds/15652

Module: rptest.redpanda_cloud_tests.omb_validation_test
Class: OMBValidationTest
Method: test_max_partitions
test_id:    OMBValidationTest.test_max_partitions
status:     FAIL
run time:   206.701 seconds

RpkException('command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id=cqb0cfoeee8cb0pjh0ng -v returned 1, output: ', '05:34:02.923  DEBUG  logging in using client credential flow\n05:34:02.923  DEBUG  Your existing auth token is still valid, avoiding re-authentication.\n05:34:02.930  DEBUG  looking for existing byoc plugin  {"exists": false}\n05:34:02.930  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0cfoeee8cb0pjh0ng\n05:34:03.142  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0cfoeee8cb0pjh0ng: 200 OK\n05:34:03.142  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218\n05:34:03.251  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218: 200 OK\n05:34:03.300  DEBUG  downloading byoc plugin  {"version": "24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac"}\n05:34:03.300  DEBUG  requesting GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz\n05:34:03.337  DEBUG  got response for GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz: 200 OK\n05:34:04.468  DEBUG  writing byoc plugin to /home/ubuntu/.local/bin\n2024-07-16T05:34:07.627Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\taws/apply.go:114\tReconciling agent infrastructure...\n2024-07-16T05:34:07.818Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:194\tRunning apply\t{"provisioner": 
"redpanda-bootstrap"}\n2024-07-16T05:34:15.699Z\t\x1b[33mWARN\x1b[0m\t.rpk.managed-byoc.bootstrap\taws/bootstrap.go:114\tapply failed, attempting to re-construct local state\t{"error": "failed running terraform apply: exit status 1\\n\\nError: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.\\n\\tstatus code: 409, request id: 84FY2TXAFTHV0VZZ, host id: sYMzPtj9xyLO37YmEZg/XCnPN+SLmKgpS2wm5/ga9NTLj54rJQNVHaoLwN0Aby82XTPg4NPafmg=\\n\\n  with aws_s3_bucket.management[0],\\n  on main.tf line 35, in resource \\"aws_s3_bucket\\" \\"management\\":\\n  35: resource \\"aws_s3_bucket\\" \\"management\\" {\\n\\n\\nError: creating Amazon DynamoDB Table (rp-381492114165-us-west-2-mgmt-tflock): ResourceInUseException: Table already exists: rp-381492114165-us-west-2-mgmt-tflock\\n\\n  with aws_dynamodb_table.terraform_locks[0],\\n  on main.tf line 91, in resource \\"aws_dynamodb_table\\" \\"terraform_locks\\":\\n  91: resource \\"aws_dynamodb_table\\" \\"terraform_locks\\" {\\n\\n"}\n2024-07-16T05:34:30.240Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:204\tFinished apply\t{"provisioner": "redpanda-bootstrap"}\n2024-07-16T05:34:30.240Z\t\x1b[34mINFO\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:194\tRunning apply\t{"provisioner": "redpanda-network"}\n2024-07-16T05:37:08.347Z\t\x1b[31mERROR\x1b[0m\t.rpk.managed-byoc\tcli/cli.go:197\tFailed to apply provisioner\t{"provisioner": "redpanda-network", "error": "failed running terraform apply: exit status 1\\n\\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\\n\\tstatus code: 400, request id: f89549bf-4fd8-4cc9-8ffb-2747680303e8\\n\\n  with module.network[0].aws_vpc_endpoint.s3,\\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource \\"aws_vpc_endpoint\\" \\"s3\\":\\n  41: resource 
\\"aws_vpc_endpoint\\" \\"s3\\" {\\n\\n"}\nFailed to apply provisioners: failed to apply provisioner redpanda-network: failed running terraform apply: exit status 1\n\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n\tstatus code: 400, request id: f89549bf-4fd8-4cc9-8ffb-2747680303e8\n\n  with module.network[0].aws_vpc_endpoint.s3,\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource "aws_vpc_endpoint" "s3":\n  41: resource "aws_vpc_endpoint" "s3" {\n\n\n', 1, '')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 177, in _do_run
    self.test = self.test_context.cls(self.test_context)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/omb_validation_test.py", line 156, in __init__
    super().__init__(test_ctx, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/redpanda_cloud_test.py", line 25, in __init__
    self.redpanda = make_redpanda_cloud_service(test_context)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 5093, in make_redpanda_cloud_service
    return RedpandaServiceCloud(context,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1695, in __init__
    cluster_id = self._cloud_cluster.create(superuser=self._superuser)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 926, in create
    self._create_new_cluster()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 700, in _create_new_cluster
    self.utils.rpk_cloud_apply(_cluster_id)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 136, in rpk_cloud_apply
    out = self._exec(cmd, timeout=1800)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 107, in _exec
    return self.rpk._execute(cmd, env=self.env, timeout=timeout)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 1152, in _execute
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id=cqb0cfoeee8cb0pjh0ng -v returned 1, output: ; stderr: 05:34:02.923  DEBUG  logging in using client credential flow
05:34:02.923  DEBUG  Your existing auth token is still valid, avoiding re-authentication.
05:34:02.930  DEBUG  looking for existing byoc plugin  {"exists": false}
05:34:02.930  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0cfoeee8cb0pjh0ng
05:34:03.142  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0cfoeee8cb0pjh0ng: 200 OK
05:34:03.142  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218
05:34:03.251  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218: 200 OK
05:34:03.300  DEBUG  downloading byoc plugin  {"version": "24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac"}
05:34:03.300  DEBUG  requesting GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz
05:34:03.337  DEBUG  got response for GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz: 200 OK
05:34:04.468  DEBUG  writing byoc plugin to /home/ubuntu/.local/bin
2024-07-16T05:34:07.627Z    INFO   .rpk.managed-byoc   aws/apply.go:114    Reconciling agent infrastructure...
2024-07-16T05:34:07.818Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-bootstrap"}
2024-07-16T05:34:15.699Z    WARN   .rpk.managed-byoc.bootstrap aws/bootstrap.go:114    apply failed, attempting to re-construct local state    {"error": "failed running terraform apply: exit status 1\n\nError: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.\n\tstatus code: 409, request id: 84FY2TXAFTHV0VZZ, host id: sYMzPtj9xyLO37YmEZg/XCnPN+SLmKgpS2wm5/ga9NTLj54rJQNVHaoLwN0Aby82XTPg4NPafmg=\n\n  with aws_s3_bucket.management[0],\n  on main.tf line 35, in resource \"aws_s3_bucket\" \"management\":\n  35: resource \"aws_s3_bucket\" \"management\" {\n\n\nError: creating Amazon DynamoDB Table (rp-381492114165-us-west-2-mgmt-tflock): ResourceInUseException: Table already exists: rp-381492114165-us-west-2-mgmt-tflock\n\n  with aws_dynamodb_table.terraform_locks[0],\n  on main.tf line 91, in resource \"aws_dynamodb_table\" \"terraform_locks\":\n  91: resource \"aws_dynamodb_table\" \"terraform_locks\" {\n\n"}
2024-07-16T05:34:30.240Z    INFO   .rpk.managed-byoc   cli/cli.go:204  Finished apply  {"provisioner": "redpanda-bootstrap"}
2024-07-16T05:34:30.240Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-network"}
2024-07-16T05:37:08.347Z    ERROR  .rpk.managed-byoc   cli/cli.go:197  Failed to apply provisioner {"provisioner": "redpanda-network", "error": "failed running terraform apply: exit status 1\n\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n\tstatus code: 400, request id: f89549bf-4fd8-4cc9-8ffb-2747680303e8\n\n  with module.network[0].aws_vpc_endpoint.s3,\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource \"aws_vpc_endpoint\" \"s3\":\n  41: resource \"aws_vpc_endpoint\" \"s3\" {\n\n"}
Failed to apply provisioners: failed to apply provisioner redpanda-network: failed running terraform apply: exit status 1

Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.
    status code: 400, request id: f89549bf-4fd8-4cc9-8ffb-2747680303e8

  with module.network[0].aws_vpc_endpoint.s3,
  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource "aws_vpc_endpoint" "s3":
  41: resource "aws_vpc_endpoint" "s3" {

; returncode: 1>

JIRA Link: CORE-5651
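
For triage, note that the stderr above contains several Terraform `Error: creating ...` entries, but only one is fatal. The S3 bucket and DynamoDB table errors were logged at WARN and the `redpanda-bootstrap` provisioner still finished. The `VpcEndpointLimitExceeded` error in `redpanda-network` is what made `rpk` return 1. A minimal sketch of pulling those AWS error codes out of captured stderr follows. The names `find_aws_errors`, `is_quota_failure` and `QUOTA_CODES` are hypothetical helpers for illustration, not part of `rptest`, and the regex assumes the `Error: creating <resource> (<name>): <Code>: <message>` shape seen in this log.

```python
import re

# Assumed shape, taken from the log in this issue:
#   Error: creating <resource> (<name>): <AwsErrorCode>: <message>
# The resource part may itself contain parentheses, e.g.
# "Amazon S3 (Simple Storage) Bucket", so it is matched lazily.
AWS_ERROR_RE = re.compile(
    r"Error: creating (?P<resource>.+?) \((?P<name>[^()\s]+)\): "
    r"(?P<code>\w+): (?P<message>[^\n\\]+)"
)

# Hypothetical set of codes to treat as account-quota problems
# rather than test or product bugs; extend as needed.
QUOTA_CODES = {"VpcEndpointLimitExceeded"}

# Tail of the stderr reported in this issue.
SAMPLE_STDERR = (
    "Failed to apply provisioners: failed to apply provisioner "
    "redpanda-network: failed running terraform apply: exit status 1\n"
    "\n"
    "Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): "
    "VpcEndpointLimitExceeded: The maximum number of VPC endpoints "
    "has been reached.\n"
    "    status code: 400, request id: f89549bf-4fd8-4cc9-8ffb-2747680303e8\n"
)


def find_aws_errors(stderr: str) -> list[dict]:
    """Return one dict (resource, name, code, message) per matched error."""
    return [m.groupdict() for m in AWS_ERROR_RE.finditer(stderr)]


def is_quota_failure(stderr: str) -> bool:
    """True if any matched AWS error code is in QUOTA_CODES."""
    return any(e["code"] in QUOTA_CODES for e in find_aws_errors(stderr))


if __name__ == "__main__":
    for err in find_aws_errors(SAMPLE_STDERR):
        print(err["code"], "on", err["resource"], err["name"])
    print("quota failure:", is_quota_failure(SAMPLE_STDERR))
```

A check like this only classifies the failure. It does not distinguish the non-fatal WARN entries from the fatal ERROR entry, so it is best applied to the final `Failed to apply provisioners` block rather than to the whole log.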

dotnwat commented 3 months ago

duplicate of https://github.com/redpanda-data/redpanda/issues/21439

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/15824 https://buildkite.com/redpanda/vtools/builds/15849

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/15922 https://buildkite.com/redpanda/vtools/builds/15923 *https://buildkite.com/redpanda/vtools/builds/15927

michael-redpanda commented 3 months ago

Automatically closing issue to match current state of CORE-5651

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/15951

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/15975 https://buildkite.com/redpanda/vtools/builds/15976 https://buildkite.com/redpanda/vtools/builds/15999 https://buildkite.com/redpanda/vtools/builds/16012 https://buildkite.com/redpanda/vtools/builds/16025 https://buildkite.com/redpanda/vtools/builds/16051 *https://buildkite.com/redpanda/vtools/builds/16058

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16073 https://buildkite.com/redpanda/vtools/builds/16072 *https://buildkite.com/redpanda/vtools/builds/16077

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16095 https://buildkite.com/redpanda/vtools/builds/16096 *https://buildkite.com/redpanda/vtools/builds/16100

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16113 https://buildkite.com/redpanda/vtools/builds/16114

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16142

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16165

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16179

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16217 https://buildkite.com/redpanda/vtools/builds/16224

piyushredpanda commented 1 month ago

Closing older-bot-filed CI issues as we transition to a more reliable system.