Open thomas-delrue opened 1 year ago
Thanks for the detailed report! @thomas-delrue Yes that seems to be a bug, although I am not sure if we should categorize it as a bug in Config Connector.
Config Connector determines whether the underlying GCP resource supports labels by looking at the labels field, and the REST API doc [1] does show labels as a top-level field in the cluster resource, without mentioning its different behavior for virtual clusters.
We can probably look into a feature in Config Connector to disable adding labels to GCP resources. Meanwhile, could you please open a GCP support case for Dataproc clusters? I believe either the documentation needs to be updated, or labels need to be supported in the case of virtualCluster.
[1] https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters
Thank you @diviner524 for your response.
Yes, it does seem to be undocumented behaviour at the DataProc Google API side. Our organization however is on Basic Support at the moment with GCP, so I'm not able to create a GCP support case for this, which is a bit unfortunate. Is there any other way we can raise the issue with DataProc?
The possibility to disable adding labels in Config Connector would be a helpful feature here.
Bug Description
We're looking to create a DataProc Cluster on GKE with the help of ConfigConnector, but we're getting the following error:
Update call failed: error applying desired state: googleapi: Error 400: User labels are not supported in Dataproc Virtual Cluster.
It looks as if ConfigConnector makes a POST request to the relevant endpoint at https://dataproc.googleapis.com/ to create the DataProc Cluster and adds some labels of its own to the request payload, but the Google API rejects labels in combination with virtualClusterConfig.
Is this then a bug in the current implementation of ConfigConnector, in that one cannot successfully create a DataProc cluster on GKE with it, or are we missing something in our configuration?
Additional Diagnostic Information
We're assuming that config-connector adds the label 'managed-by-cnrm: true' to the Google API call, and that this is what the Google API rejects in combination with the virtualClusterConfig.
When we do a curl request ourselves to DataProc Google API without 'labels', we get a DataProc cluster provisioned:
But when we send the same curl request to the same endpoint with labels added to the payload, we get the 400 error: User labels are not supported in Dataproc Virtual Cluster.
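To make the difference between the two requests concrete, here is a minimal Python sketch of the two request bodies. It is not the exact payload we sent: the helper name `build_cluster_body` and all argument values are hypothetical placeholders, and the body is trimmed to the fields relevant to this issue (a real create request needs more, such as the software configuration). The field names follow the cluster resource in the REST reference [1].

```python
import json


def build_cluster_body(project_id, cluster_name, gke_cluster, node_pool,
                       staging_bucket, labels=None):
    """Build a trimmed Dataproc-on-GKE cluster body (hypothetical helper).

    Only the fields relevant to the labels problem are included; this is
    not a complete, valid create request.
    """
    body = {
        "projectId": project_id,
        "clusterName": cluster_name,
        "virtualClusterConfig": {
            "stagingBucket": staging_bucket,
            "kubernetesClusterConfig": {
                "gkeClusterConfig": {
                    "gkeClusterTarget": gke_cluster,
                    "nodePoolTarget": [
                        {"nodePool": node_pool, "roles": ["DEFAULT"]}
                    ],
                },
            },
        },
    }
    # Top-level labels are the only difference between the two requests.
    if labels:
        body["labels"] = dict(labels)
    return body


# Placeholder values, not our real project or cluster names.
common = dict(
    project_id="my-project",
    cluster_name="my-dataproc",
    gke_cluster="projects/my-project/locations/europe-west1/clusters/my-gke",
    node_pool="projects/my-project/locations/europe-west1/clusters/my-gke/nodePools/my-pool",
    staging_bucket="my-staging-bucket",
)

# Body shaped like the request that succeeded for us (no labels).
without_labels = build_cluster_body(**common)

# Body shaped like the request that was rejected with the 400 error;
# the label is the one we assume Config Connector adds.
with_labels = build_cluster_body(**common, labels={"managed-by-cnrm": "true"})

if __name__ == "__main__":
    print(json.dumps(with_labels, indent=2))
```

The only difference between the two bodies is the top-level labels key, which matches what we observed with curl.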
Kubernetes Cluster Version
v1.24.10-gke.2300
Config Connector Version
1.103.0
Config Connector Mode
cluster mode
Log Output
From kubectl describe dataproccluster dev-data-platform-dataproc
Steps to reproduce the issue
kubectl apply -f dataproc-clusters.yaml
See the YAML manifest file below. Note that the GCP project, staging bucket, GKE cluster, and node pool already existed, which is why 'external' is used throughout.
YAML snippets