GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0

Creating DataprocCluster with a virtualClusterConfig (Dataproc on GKE) fails. GoogleAPIs give 400 #814

Open thomas-delrue opened 1 year ago

thomas-delrue commented 1 year ago

Bug Description

We're trying to create a Dataproc cluster on GKE with Config Connector, but the resource fails with: Update call failed: error applying desired state: googleapi: Error 400: User labels are not supported in Dataproc Virtual Cluster.

It looks as if Config Connector makes a POST request to the relevant endpoint at https://dataproc.googleapis.com/ to create the Dataproc cluster and adds some labels of its own to the request payload, but the Google API rejects labels in combination with virtualClusterConfig.

Is this a bug in the current implementation of Config Connector, meaning one cannot successfully create a Dataproc cluster on GKE with it, or are we missing something in our configuration?

Additional Diagnostic Information

curl --request POST \
  'https://dataproc.googleapis.com/v1/projects/<our project>/regions/<region>/clusters?key=[YOUR_API_KEY]' \
  --header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --data '{"projectId":"<our project>","clusterName":"<our cluster>","virtualClusterConfig":{"kubernetesClusterConfig":{"gkeClusterConfig":{"gkeClusterTarget":"projects/<our project>/locations/<region>/clusters/<cluster-name>","nodePoolTarget":[{"nodePool":"projects/<our project>/locations/<region>/clusters/<cluster-name>/nodePools/data-platform-dataproc","roles":["DEFAULT"]}]},"kubernetesNamespace":"<namespace name>","kubernetesSoftwareConfig":{"componentVersion":{"SPARK":"3.1-dataproc-13"}}},"stagingBucket":"<bucket-name>"},"labels":{"name":"wrench","mass":"1.3kg","count":"3"}}' \
  --compressed
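
One way to check whether the top-level labels field is what triggers the 400 is to send the same payload with and without it. The sketch below only prepares the label-free request body and does not call the API. `without_labels` is a hypothetical helper name, the payload is a shortened version of the one in the curl call above with the same placeholders, and this is a diagnostic aid under those assumptions rather than a confirmed workaround.

```python
import copy
import json


def without_labels(cluster: dict) -> dict:
    """Return a copy of a cluster payload with the top-level "labels"
    dropped, but only when the payload describes a virtual cluster
    (i.e. it has a "virtualClusterConfig" key). Hypothetical helper."""
    result = copy.deepcopy(cluster)
    if "virtualClusterConfig" in result:
        result.pop("labels", None)
    return result


# Shortened placeholder payload modelled on the curl request above.
payload = {
    "projectId": "<our project>",
    "clusterName": "<our cluster>",
    "virtualClusterConfig": {"stagingBucket": "<bucket-name>"},
    "labels": {"name": "wrench", "mass": "1.3kg", "count": "3"},
}

# Body to pass to curl via --data for the label-free attempt.
print(json.dumps(without_labels(payload), indent=2))
```

If the label-free request is accepted while the original is rejected, that would point at the labels rather than at the rest of the virtualClusterConfig.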

Kubernetes Cluster Version

v1.24.10-gke.2300

Config Connector Version

1.103.0

Config Connector Mode

cluster mode

Log Output

From kubectl describe dataproccluster dev-data-platform-dataproc

Status:
  Conditions:
    Last Transition Time:  2023-05-12T12:49:34Z
    Message:               Update call failed: error applying desired state: googleapi: Error 400: User labels are not supported in Dataproc Virtual Cluster.
    Reason:                UpdateFailed
    Status:                False
    Type:                  Ready
  Observed Generation:     1
Events:
  Type     Reason        Age                   From                        Message
  ----     ------        ----                  ----                        -------
  Warning  UpdateFailed  42m (x14 over 58m)    dataproccluster-controller  Update call failed: error applying desired state: googleapi: Error 400: User labels are not supported in Dataproc Virtual Cluster.
  Normal   Updating      2m15s (x34 over 58m)  dataproccluster-controller  Update in progress

Steps to reproduce the issue

kubectl apply -f dataproc-clusters.yaml

See the YAML manifest below. Note that the GCP project, staging bucket, GKE cluster, and node pool already existed, which is why 'external' is used throughout.

YAML snippets

apiVersion: dataproc.cnrm.cloud.google.com/v1beta1
kind: DataprocCluster
metadata:
  annotations:
    cnrm.cloud.google.com/management-conflict-prevention-policy: "none"
  name: dev-data-platform-dataproc
spec:
  projectRef:
    external: some-project
  location: some-region
  virtualClusterConfig:
    stagingBucketRef:
      external: some-staging-bucket
    kubernetesClusterConfig:
      gkeClusterConfig:
        gkeClusterTargetRef:
          external: projects/some-project/locations/some-region/clusters/some-project-cluster
        nodePoolTarget:
        - nodePoolRef:
            external: projects/some-project/locations/some-region/clusters/some-project-cluster/nodePools/some-project-cluster-nodepool
          roles:
          - DEFAULT
      kubernetesNamespace: some-namespace
      kubernetesSoftwareConfig:
        componentVersion:
          SPARK: 3.1-dataproc-13

diviner524 commented 1 year ago

Thanks for the detailed report, @thomas-delrue! Yes, that seems to be a bug, although I am not sure we should categorize it as a bug in Config Connector.

Config Connector determines whether the underlying GCP resource supports labels by looking for a labels field, and the REST API doc [1] does show labels as a top-level field in the cluster resource, without mentioning that it behaves differently for virtual clusters. We can probably look into a feature in Config Connector to disable adding labels to GCP resources. Meanwhile, could you please open a GCP support case for Dataproc? I believe either the documentation needs to be updated, or labels need to be supported for virtual clusters.

[1] https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters
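
As a rough illustration of why a check like the one described above misfires here, the following Python sketch models a schema-level test for a labels field. It is hypothetical and is not Config Connector's actual code; `supports_labels` and the field set are assumptions made for the example, with the field names taken from the REST reference [1].

```python
def supports_labels(schema_fields) -> bool:
    """Hypothetical schema-level check: a resource is treated as
    label-capable if its schema has a top-level "labels" field."""
    return "labels" in schema_fields


# Top-level fields of the Dataproc cluster resource, as listed in [1].
cluster_schema = {
    "projectId",
    "clusterName",
    "config",
    "virtualClusterConfig",
    "labels",
}

# The schema is the same whether a given cluster uses "config" or
# "virtualClusterConfig", so the answer cannot differ between the two.
print(supports_labels(cluster_schema))
```

Because the decision is made per resource kind rather than per configuration, a check of this shape would add labels to virtual clusters as well, which is the case the API rejects.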

thomas-delrue commented 1 year ago

Thank you @diviner524 for your response.

Yes, it does seem to be undocumented behaviour on the Dataproc API side. Our organization, however, is on Basic Support with GCP at the moment, so I'm not able to create a GCP support case for this, which is a bit unfortunate. Is there any other way we can raise the issue with Dataproc?

The possibility to disable adding labels in Config Connector would be a helpful feature here.