F5Networks / f5-bigip-runtime-init

CFE declaration order will error Filtered bucket does not exist #41

Closed JeffGiroux closed 1 year ago

JeffGiroux commented 2 years ago

I am using runtime-init to install the toolchain and apply declarations. I usually install the extensions in this order: DO > AS3 > CFE

I usually apply the service declarations in the same order: DO > AS3 > CFE

Lately, onboarding has been failing with 500 errors and bucket problems. I have had many issues in the past with CFE and buckets and/or the ordering of declarations, similar to the earlier ticket at https://github.com/F5Networks/f5-bigip-runtime-init/issues/34. While this is happening, absolutely NO logs hit the GCP cloud API, which tells me that CFE is not even trying to reach out. I deployed failover BIG-IPs with a known working GDM template to compare the API logs, and my Terraform repo with the DO > AS3 > CFE declaration order does not produce any API hits for storage/buckets.

Workaround

You must apply the declarations in the following order: DO > CFE > AS3

As soon as I changed the order...instant magic! I ran my Terraform job again and onboarding succeeded 100% for both BIG-IP nodes. I now also see API hits in GCP logging for my service account IAM user, including the bucket/storage hits.
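
For readers applying this workaround through runtime-init, the declaration order is set by the order of entries under `extension_services.service_operations`, as far as I understand the config format. A minimal sketch, assuming the three declarations are hosted at the hypothetical URLs shown (they are placeholders, not taken from this issue):

```yaml
# Sketch only: the URLs below are hypothetical placeholders.
extension_services:
  service_operations:
    - extensionType: do
      type: url
      value: https://example.com/declarations/do.json
    - extensionType: cf      # CFE is applied before AS3
      type: url
      value: https://example.com/declarations/cfe.json
    - extensionType: as3
      type: url
      value: https://example.com/declarations/as3.json
```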

Errors

Here is a sample error:

Wed, 06 Apr 2022 16:40:23 GMT - finest: socket 441 opened
Wed, 06 Apr 2022 16:40:23 GMT - fine: [f5-cloud-failover] HTTP Request - GET /info
Wed, 06 Apr 2022 16:40:23 GMT - fine: [f5-cloud-failover] HTTP Request - POST /trigger
Wed, 06 Apr 2022 16:40:23 GMT - fine: [f5-cloud-failover] Performing failover - initialization
Wed, 06 Apr 2022 16:40:23 GMT - finest: [f5-cloud-failover] Device initialization complete
Wed, 06 Apr 2022 16:40:23 GMT - finest: [f5-cloud-failover] Fetched proxy settings: {"protocol":"http","host":"","port":"8080","username":"","password":""}
Wed, 06 Apr 2022 16:40:24 GMT - fine: [f5-cloud-failover] config: {"class":"Cloud_Failover","environment":"gcp","externalStorage":{"scopingTags":{"f5_cloud_failover_label":"giroux123"}},"failoverAddresses":{"enabled":true,"scopingTags":{"f5_cloud_failover_label":"giroux123"},"requireScopingTags":false},"failoverRoutes":{"enabled":true,"scopingTags":{"f5_cloud_failover_label":"giroux123"},"scopingAddressRanges":[{"range":"192.0.2.0/24"}],"defaultNextHopAddresses":{"discoveryType":"static","items":["10.1.10.19","10.1.10.104"]}},"controls":{"class":"Controls","logLevel":"silly"},"schemaVersion":"1.10.0"}
Wed, 06 Apr 2022 16:40:24 GMT - finest: [f5-cloud-failover] proxySettings: {"protocol":"http","host":"","port":"8080","username":"","password":""}
Wed, 06 Apr 2022 16:40:24 GMT - severe: [f5-cloud-failover] Failover initialization failed: Filtered bucket does not exist:  Error: Filtered bucket does not exist: 
    at storage.getBuckets.then.then (/var/config/rest/iapps/f5-cloud-failover/nodejs/providers/gcp/cloud.js:353:43)
    at tryCatcher (/usr/share/rest/node/node_modules/bluebird/js/release/util.js:16:23)
    at Promise._settlePromiseFromHandler (/usr/share/rest/node/node_modules/bluebird/js/release/promise.js:512:31)
    at Promise._settlePromise (/usr/share/rest/node/node_modules/bluebird/js/release/promise.js:569:18)
    at Promise._settlePromise0 (/usr/share/rest/node/node_modules/bluebird/js/release/promise.js:614:10)
    at Promise._settlePromises (/usr/share/rest/node/node_modules/bluebird/js/release/promise.js:693:18)
    at Async._drainQueue (/usr/share/rest/node/node_modules/bluebird/js/release/async.js:133:16)
    at Async._drainQueues (/usr/share/rest/node/node_modules/bluebird/js/release/async.js:143:10)
    at Immediate.Async.drainQueues (/usr/share/rest/node/node_modules/bluebird/js/release/async.js:17:14)
    at runCallback (timers.js:794:20)
    at tryOnImmediate (timers.js:752:5)
    at processImmediate [as _immediateCallback] (timers.js:729:5)
Wed, 06 Apr 2022 16:40:29 GMT - finest: socket 441 closed

Troubleshooting

I have done TONS of troubleshooting to narrow this down. I have checked routes and permissions, and everything looks good. I can SSH to the BIG-IP and run the manual CLI command to prove that the IAM user can see the buckets.

{
  "kind": "storage#buckets",
  "items": [
    {
      "kind": "storage#bucket",
      "selfLink": "https://www.googleapis.com/storage/v1/b/girouxf5",
      "id": "girouxf5",
      "name": "girouxf5",
      "projectNumber": "690404916641",
      "metageneration": "1",
      "location": "US-WEST1",
      "storageClass": "STANDARD",
      "etag": "CAE=",
      "defaultEventBasedHold": false,
      "timeCreated": "2022-04-06T16:25:56.991Z",
      "updated": "2022-04-06T16:25:56.991Z",
      "labels": {
        "f5_cloud_failover_label": "giroux123"
      },
      "iamConfiguration": {
        "bucketPolicyOnly": {
          "enabled": true,
          "lockedTime": "2022-07-05T16:25:56.991Z"
        },
        "uniformBucketLevelAccess": {
          "enabled": true,
          "lockedTime": "2022-07-05T16:25:56.991Z"
        },
        "publicAccessPrevention": "inherited"
      },
      "locationType": "region",
      "satisfiesPZS": false
    }
  ]
}
shyawnkarim commented 2 years ago

Thanks for reporting and for providing these details. We'll take a look into either fixing the order issue or documenting the order requirement. Tracked with internal ID ESECLDTPLT-3078.

JeffGiroux commented 1 year ago

If you ever need to validate bucket permissions from the BIG-IP, you can SSH to the BIG-IP and run this locally.

curl -H "Authorization: Bearer $(curl -sf --retry 20 -H 'Metadata-Flavor: Google' http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token | jq --raw-output '.access_token')" https://storage.googleapis.com/storage/v1/b?project=$(curl -sf --retry 20 -H 'Metadata-Flavor: Google' http://169.254.169.254/computeMetadata/v1/project/project-id)
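
The command above only proves that the service account can list buckets. The error message suggests CFE then filters that list by the `externalStorage` scoping tags, so it can also help to check that a bucket actually carries the expected label. A minimal sketch, assuming `jq` is available (the command above already relies on it); the function name `filter_buckets_by_label` is made up for illustration and is not part of CFE:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of CFE or runtime-init.
# Reads a storage#buckets JSON listing on stdin and prints the name of
# every bucket whose label KEY equals VALUE. Prints nothing if none match.
filter_buckets_by_label() {
  local key="$1" value="$2"
  jq -r --arg k "$key" --arg v "$value" \
    '.items[]? | select((.labels // {})[$k] == $v) | .name'
}
```

Pipe the output of the curl command above into `filter_buckets_by_label f5_cloud_failover_label giroux123` (substitute your own label key and value). An empty result means no visible bucket carries that label, which would be consistent with the "Filtered bucket does not exist" error.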
aliasgar215 commented 1 year ago

I am facing the same issue on one of my VMs deployed in GCP. I am only using CFE but still get the same error, so the DO/AS3 ordering does not apply in my case. Also, can you please help me adapt the command below to my device and run it? Which fields do I need to populate, and with what data, before running it from the F5 bash shell?

curl -H "Authorization: Bearer $(curl -sf --retry 20 -H 'Metadata-Flavor: Google' http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token | jq --raw-output '.access_token')" https://storage.googleapis.com/storage/v1/b?project=$(curl -sf --retry 20 -H 'Metadata-Flavor: Google' http://169.254.169.254/computeMetadata/v1/project/project-id)

shyawnkarim commented 1 year ago

Closing. A fix for this is not possible due to a race condition on BIG-IP. Please follow the examples in our documentation. They have been tested and work.