@murtaza98 Thanks for reporting this issue. I am looking into it.
The key error is {"error":"quota_exceeded","error_description":"[Security Token Service] The request was throttled due to rate limit. Please retry after a few seconds."}. This indicates that you have insufficient quota for the Security Token Service. You need to increase the quota for the Security Token Service in your GCP project.
Please find more details at https://cloud.google.com/iam/quotas#quotas . Let me know if you have any further issues.
Thanks for looking into this issue, @gargnitingoogle. We'll work on increasing this quota and running another test next week. Will keep you posted about the findings.
Wanted to add here that the STS quota for a project is shared across all concurrent GKE runs and GCSFuse mounts in that project at any given time. As a rule of thumb, if your clusters will be doing at most N GCSFuse mounts concurrently (across all pods and clusters), then you should set your STS quota to above 2N per minute.
Hey @murtaza98,
Has your issue been resolved? To help us debug the issue better, could you please provide the following details about the project:
Let us know if you have any other questions!
Thanks, Tulsi Shah
Hello all, thanks for your help with this issue. The suggestion above about increasing the rate limit on the STS service did indeed help us scale beyond 1000 pods. We'd requested an increase in the STS request quota to 60k per minute (1k per sec). With that limit, we were able to bring up about 10k pods (approx 1800 nodes), with each pod mounting at most 2 storage buckets. At this stage, we observed a fairly stable setup, with only a slight increase in read and write performance against the Google storage mount.
As we scaled further, we encountered the STS request quota limit again, which impacted our ability to bring up new pods. While this is something we could address by contacting support to increase the quota, it wasn’t our primary concern.
The more critical issue we observed was a significant degradation in the performance of read and write operations on the Google storage mount. Reads and writes were now taking three times longer than before. Since there was no indication of throttling on the storage bucket due to rate limits, this suggests that the Fuse driver itself was not scaling effectively.
Based on the results, we’ve decided to pause the POC with GCS Fuse for now. We achieved better performance by manually handling file uploads and downloads within our application using storage APIs, which proved more scalable for our needs.
As feedback, I'd like to ask whether there are any guidelines for mounting GCS Fuse on pods at large scale, specifically 50k pods or more.
Additionally, I’d like to better understand why the Fuse driver is generating so many STS calls. While I understand that a pod will make an authorization call upon its initial spin-up, which would lead to STS requests when new pods are added to a cluster, in our tests we observed continuous API calls to STS even without scaling up, simply during read and write operations to the bucket. My current understanding is that STS is primarily for authorization, so why is the driver invoking it during read/write operations?
@Tulsishah Answering your questions
How many GKE clusters are there, and what are their versions?
One GKE cluster. Version - 1.29.7-gke.1104000
The exact cluster details where you're seeing the mount failures due to quota issues.
Please feel free to contact me by email if you require this info for further debugging: murtaza@hackerrank.com
Pod and volume information.
We had 2 buckets:
1st bucket: file caching enabled; per-pod ephemeral storage was set to 2Gi and fileCacheCapacity was set to 1Gi.
2nd bucket: no file caching; ephemeral storage set to 2Gi.
Note: we were running approx 6 pods per node, and we always had more than 20Gi free per node.
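For reference, here is a minimal sketch of roughly how each pod was wired up, assuming the standard GCS FUSE CSI driver sidecar annotations and CSI ephemeral volumes; the pod name, service account, image, and bucket names below are illustrative placeholders, not our actual config:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                              # placeholder
  annotations:
    gke-gcsfuse/volumes: "true"                  # enable gcsfuse sidecar injection
    gke-gcsfuse/ephemeral-storage-limit: "2Gi"   # per-pod ephemeral storage for the sidecar
spec:
  serviceAccountName: example-ksa                # KSA bound to a GCP SA via Workload Identity
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: cached-bucket
          mountPath: /data/cached
        - name: uncached-bucket
          mountPath: /data/uncached
  volumes:
    - name: cached-bucket
      csi:
        driver: gcsfuse.csi.storage.gke.io
        volumeAttributes:
          bucketName: example-bucket-one         # placeholder bucket name
          fileCacheCapacity: "1Gi"               # file caching enabled, capped at 1Gi
    - name: uncached-bucket
      csi:
        driver: gcsfuse.csi.storage.gke.io
        volumeAttributes:
          bucketName: example-bucket-two         # placeholder; no file cache configured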
Hi @murtaza98 To potentially avoid the STS quota issue, could you try to add a new volume attribute to your PV spec? It will make the CSI driver skip unnecessary STS requests. We've fixed the issue in a newer GKE version without requiring this explicit volume attribute setting, but you don't have to upgrade your cluster in this POC. Using the volume attribute will achieve the same. Here is an example:
apiVersion: v1
kind: PersistentVolume
spec:
  ...
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: coderunner-testcases-staging
    volumeAttributes:
      gcsfuseLoggingSeverity: warning
      fileCacheCapacity: "10Gi"
      metadataCacheTTLSeconds: "600"
      skipCSIBucketAccessCheck: "true"
The last line, skipCSIBucketAccessCheck: "true", is all you need to add.
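In case it helps anyone reproduce this, a PV like the one above is normally bound by a matching PVC and then referenced from the pod spec. A minimal sketch, assuming a statically provisioned PV named gcs-fuse-pv (the claim name and size are illustrative, not from this thread):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcs-fuse-claim            # placeholder name
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi                # arbitrary; capacity is not enforced for GCS-backed volumes
  volumeName: gcs-fuse-pv         # must match the metadata.name of the PV above
  storageClassName: ""            # empty string so the claim binds to the pre-provisioned PV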
Hey @murtaza98, just wanted to check whether skipCSIBucketAccessCheck: "true" as suggested by @songjiaxun helped reduce the STS usage and the quota issue for you?
Hello @sethiay
Sorry for the delayed response!
Unfortunately, we did not get a chance to run another load test with the requested change above. As I mentioned earlier, we saw better results at our scale by consuming the Cloud Storage API directly within our application to manage file uploads and downloads during the POC; hence, we're moving forward with that approach.
I'm closing this issue now, as I won't be able to validate the above change.
Describe the issue
When trying to mount a GCS bucket using the FUSE driver to more than 1000 pods within a GKE cluster, we get 429 errors within the fuse sidecar container, due to which we have pods stuck in the ContainerCreateError state. I noticed a similar error that was previously reported, and as per the suggestions there, we're running a GKE version greater than 1.29.3-gke.1093000.
(a) Do you see this as part of the mount failure, or does it come after a successful mount, during a read operation? Yes, this is part of the mount failures.
(b) Do you see the failure in all the pods or in a few pods? Not all pods are affected, but a considerable number of pods get affected once you scale beyond 500 pods.
(c) If possible, could you please provide the gcsfuse debug logs of any failing pods? Please find the logs below.
System & Version (please complete the following information):
Additional context
Logs:
Persistent Volume:
Traffic on Bucket: