redpanda-data / redpanda


CI Failure (key symptom) in `HTObserveTest.test_cloud_observe` #21449

Closed vbotbuildovich closed 2 months ago

vbotbuildovich commented 4 months ago

https://buildkite.com/redpanda/vtools/builds/15646 https://buildkite.com/redpanda/vtools/builds/15652

Module: rptest.redpanda_cloud_tests.observe_test
Class: HTObserveTest
Method: test_cloud_observe
test_id:    HTObserveTest.test_cloud_observe
status:     FAIL
run time:   207.185 seconds

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 177, in _do_run
    self.test = self.test_context.cls(self.test_context)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/observe_test.py", line 17, in __init__
    super(HTObserveTest, self).__init__(test_context=test_context)
  File "/home/ubuntu/redpanda/tests/rptest/tests/redpanda_cloud_test.py", line 25, in __init__
    self.redpanda = make_redpanda_cloud_service(test_context)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 5093, in make_redpanda_cloud_service
    return RedpandaServiceCloud(context,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1695, in __init__
    cluster_id = self._cloud_cluster.create(superuser=self._superuser)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 926, in create
    self._create_new_cluster()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda_cloud.py", line 700, in _create_new_cluster
    self.utils.rpk_cloud_apply(_cluster_id)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 136, in rpk_cloud_apply
    out = self._exec(cmd, timeout=1800)
  File "/home/ubuntu/redpanda/tests/rptest/services/cloud_cluster_utils.py", line 107, in _exec
    return self.rpk._execute(cmd, env=self.env, timeout=timeout)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 1152, in _execute
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id=cqb0oq8eee8cb0pjh1a0 -v returned 1, output: ; stderr: 06:00:16.066  DEBUG  logging in using client credential flow
06:00:16.066  DEBUG  Your existing auth token is still valid, avoiding re-authentication.
06:00:16.074  DEBUG  looking for existing byoc plugin  {"exists": false}
06:00:16.074  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0oq8eee8cb0pjh1a0
06:00:16.284  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters/cqb0oq8eee8cb0pjh1a0: 200 OK
06:00:16.285  DEBUG  requesting GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218
06:00:16.387  DEBUG  got response for GET https://cloud-api.ppd.cloud.redpanda.com/api/v1/clusters-resources/install-pack-versions/24.1.20240712161218: 200 OK
06:00:16.438  DEBUG  downloading byoc plugin  {"version": "24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac"}
06:00:16.438  DEBUG  requesting GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz
06:00:17.097  DEBUG  got response for GET https://dl.redpanda.com/public/rpk-plugins-preprod/raw/names/byoc-linux-amd64/versions/24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac/byoc-linux-amd64-24.1.20240712161218-sha256.73ec33b8290858510e75d171c0e6a5ac.tar.gz: 200 OK
06:00:18.297  DEBUG  writing byoc plugin to /home/ubuntu/.local/bin
2024-07-16T06:00:21.053Z    INFO   .rpk.managed-byoc   aws/apply.go:114    Reconciling agent infrastructure...
2024-07-16T06:00:21.416Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-bootstrap"}
2024-07-16T06:00:29.129Z    WARN   .rpk.managed-byoc.bootstrap aws/bootstrap.go:114    apply failed, attempting to re-construct local state    {"error": "failed running terraform apply: exit status 1\n\nError: creating Amazon S3 (Simple Storage) Bucket (rp-381492114165-us-west-2-mgmt): BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.\n\tstatus code: 409, request id: 596SC7CRA9VXNM9Y, host id: 96olp6yizXCrp07ivNiZYmuqRJ6cjP7ZP7UqfqhhxL0RKFD7vsIRqOAUSZmojU0wznqeyUtuhfzrjRQ560eWnQ==\n\n  with aws_s3_bucket.management[0],\n  on main.tf line 35, in resource \"aws_s3_bucket\" \"management\":\n  35: resource \"aws_s3_bucket\" \"management\" {\n\n\nError: creating Amazon DynamoDB Table (rp-381492114165-us-west-2-mgmt-tflock): ResourceInUseException: Table already exists: rp-381492114165-us-west-2-mgmt-tflock\n\n  with aws_dynamodb_table.terraform_locks[0],\n  on main.tf line 91, in resource \"aws_dynamodb_table\" \"terraform_locks\":\n  91: resource \"aws_dynamodb_table\" \"terraform_locks\" {\n\n"}
2024-07-16T06:00:43.553Z    INFO   .rpk.managed-byoc   cli/cli.go:204  Finished apply  {"provisioner": "redpanda-bootstrap"}
2024-07-16T06:00:43.553Z    INFO   .rpk.managed-byoc   cli/cli.go:194  Running apply   {"provisioner": "redpanda-network"}
2024-07-16T06:03:11.935Z    ERROR  .rpk.managed-byoc   cli/cli.go:197  Failed to apply provisioner {"provisioner": "redpanda-network", "error": "failed running terraform apply: exit status 1\n\nError: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n\tstatus code: 400, request id: 2393d5c9-32d3-4373-8689-278e02d5ddbb\n\n  with module.network[0].aws_vpc_endpoint.s3,\n  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource \"aws_vpc_endpoint\" \"s3\":\n  41: resource \"aws_vpc_endpoint\" \"s3\" {\n\n"}
Failed to apply provisioners: failed to apply provisioner redpanda-network: failed running terraform apply: exit status 1

Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.
    status code: 400, request id: 2393d5c9-32d3-4373-8689-278e02d5ddbb

  with module.network[0].aws_vpc_endpoint.s3,
  on ../../modules/terraform-aws-redpanda-network/network.tf line 41, in resource "aws_vpc_endpoint" "s3":
  41: resource "aws_vpc_endpoint" "s3" {

; returncode: 1>

JIRA Link: CORE-5658
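
The stderr above ends in a top-level Terraform `Error:` line that names the AWS error code (`VpcEndpointLimitExceeded`), which suggests an account-level limit rather than a problem in the test itself. As a rough sketch of how such failures could be triaged automatically, the snippet below pulls the error code out of the captured stderr. The names `terraform_errors`, `is_quota_failure`, and `QUOTA_CODES` are hypothetical and not part of rptest or rpk, and the regular expression is an assumption based only on the line format visible in this log.

```python
import re

# Assumption: top-level Terraform errors in rpk stderr look like
#   Error: <action> (<resource>): <AwsErrorCode>: <message>
# as in the log above. Errors embedded in JSON fields contain literal "\n"
# sequences, do not start a line, and are therefore not matched.
TERRAFORM_ERROR = re.compile(
    r"^Error: (?P<action>.+?) \((?P<resource>[^)]*)\): (?P<code>\w+): (?P<message>.*)$",
    re.MULTILINE,
)

# Hypothetical set of error codes treated as account limit problems.
# Only VpcEndpointLimitExceeded appears in this issue.
QUOTA_CODES = {"VpcEndpointLimitExceeded"}


def terraform_errors(stderr: str) -> list[dict[str, str]]:
    """Return one dict per top-level Terraform error line found in stderr."""
    return [m.groupdict() for m in TERRAFORM_ERROR.finditer(stderr)]


def is_quota_failure(stderr: str) -> bool:
    """True if any top-level Terraform error carries a known limit error code."""
    return any(e["code"] in QUOTA_CODES for e in terraform_errors(stderr))


# Excerpt copied from the stderr in this issue.
sample = (
    "Failed to apply provisioners: failed to apply provisioner redpanda-network: "
    "failed running terraform apply: exit status 1\n"
    "\n"
    "Error: creating EC2 VPC Endpoint (com.amazonaws.us-west-2.s3): "
    "VpcEndpointLimitExceeded: The maximum number of VPC endpoints has been reached.\n"
    "\tstatus code: 400, request id: 2393d5c9-32d3-4373-8689-278e02d5ddbb\n"
)

if __name__ == "__main__":
    for err in terraform_errors(sample):
        print(err["code"], "on", err["resource"])
```

Because only line-initial `Error:` entries are matched, the `BucketAlreadyOwnedByYou` and `ResourceInUseException` errors carried inside the bootstrap provisioner's `WARN` JSON payload are not reported here; in this run the bootstrap provisioner still logged `Finished apply` after that warning.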

dotnwat commented 4 months ago

duplicate of https://github.com/redpanda-data/redpanda/issues/21439

vbotbuildovich commented 4 months ago

https://buildkite.com/redpanda/vtools/builds/15824 https://buildkite.com/redpanda/vtools/builds/15849

vbotbuildovich commented 4 months ago

https://buildkite.com/redpanda/vtools/builds/15922 https://buildkite.com/redpanda/vtools/builds/15923 *https://buildkite.com/redpanda/vtools/builds/15927

michael-redpanda commented 4 months ago

Automatically closing issue to match current state of CORE-5658

vbotbuildovich commented 4 months ago

*https://buildkite.com/redpanda/vtools/builds/15951

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/15975 https://buildkite.com/redpanda/vtools/builds/15976 https://buildkite.com/redpanda/vtools/builds/15999 https://buildkite.com/redpanda/vtools/builds/16012 https://buildkite.com/redpanda/vtools/builds/16025 https://buildkite.com/redpanda/vtools/builds/16051 *https://buildkite.com/redpanda/vtools/builds/16058

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16073 https://buildkite.com/redpanda/vtools/builds/16072 *https://buildkite.com/redpanda/vtools/builds/16077

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16095 https://buildkite.com/redpanda/vtools/builds/16096 *https://buildkite.com/redpanda/vtools/builds/16100

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16113 https://buildkite.com/redpanda/vtools/builds/16114

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16142

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16165

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/vtools/builds/16179

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/vtools/builds/16217 https://buildkite.com/redpanda/vtools/builds/16224

piyushredpanda commented 2 months ago

Closing older-bot-filed CI issues as we transition to a more reliable system.