Serverless AtlasDeployment on GCP fails in v2

LennartKoot commented 7 months ago

What did you do to encounter the bug? Steps to reproduce the behavior: Updated the operator from v1.9.2 to v2.0.1. All Serverless GCP AtlasDeployment resources failed to reconcile afterwards. Example spec:

apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
  name: some-name
spec:
  projectRef:
    name: some-name
  serverlessSpec:
    name: some-name
    providerSettings:
      providerName: SERVERLESS
      backingProviderName: GCP
      regionName: WESTERN_EUROPE

The AtlasDeployment resources get a new failed status:

Status:
  Conditions:
    Last Transition Time:  2024-02-13T09:01:06Z
    Status:                False
    Type:                  Ready
    Last Transition Time:  2023-09-29T15:55:09Z
    Status:                True
    Type:                  ResourceVersionIsValid
    Last Transition Time:  2023-09-29T15:55:09Z
    Status:                True
    Type:                  ValidationSucceeded
    Last Transition Time:  2024-02-13T09:01:06Z
    Message:               unable to resolve ownership for deletion protection: failed to list serverless private endpoints: GET https://cloud.mongodb.com/api/atlas/v1.0/groups/xxxx/privateEndpoint/serverless/instance/xxxx/endpoint: 400 (request "INVALID_CLOUD_PROVIDER") The specified instance xxxx belongs to a cloud provider that is not yet supported by this feature GCP.
    Reason:                InternalError
    Status:                False
    Type:                  DeploymentReady
    Last Transition Time:  2024-02-13T09:01:06Z
    Message:               unable to resolve ownership for deletion protection: failed to list serverless private endpoints: GET https://cloud.mongodb.com/api/atlas/v1.0/groups/xxxx/privateEndpoint/serverless/instance/xxxx/endpoint: 400 (request "INVALID_CLOUD_PROVIDER") The specified instance xxxx belongs to a cloud provider that is not yet supported by this feature GCP.
    Reason:                InternalError
    Status:                False
    Type:                  AlertConfigurationReady

What did you expect? We do not use private endpoints, so I expected the Serverless GCP instance to successfully reconcile with the updated Atlas Operator version.

What happened instead? The operator failed to reconcile existing deployments.

Screenshots

Operator Information

v2.0.1

Kubernetes Cluster Information

AKS
1.26.3

Additional context Seems to be a reintroduction of #821

It seems that this code block now runs before the GCP check introduced in #822, so reconciliation fails again for GCP Serverless instances: https://github.com/mongodb/mongodb-atlas-kubernetes/blob/2aeee6a6ae0677d32c343767b81b7db847a33137/pkg/controller/atlasdeployment/serverless_private_endpoint.go#L40-L46

roothorp commented 7 months ago

Hi @LennartKoot, thanks for the detailed issue! We are working on a fix for this which should be released soon as part of 2.1.0. Thanks!

josvazg commented 7 months ago

Hi @LennartKoot, today we tested the upcoming release against your reproduction sample above. The good news is that the fix is confirmed. However, we had to change the GCP regionName to CENTRAL_US because MongoDB 7.2 was not available in WESTERN_EUROPE. Just letting you know in case your real deployment also needs this tweak after the upgrade.

LennartKoot commented 7 months ago

Great and thanks for letting me know!

roothorp commented 6 months ago

Closing as 2.1.0 is released and this should be fixed - please let us know if you are still seeing this issue!

mongodb / mongodb-atlas-kubernetes

Serverless AtlasDeployment on GCP fails in v2 #1378
