kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0

Katib launched pods don't have permission to access GCP buckets. #1588

Closed PatrickGhosn closed 3 years ago

PatrickGhosn commented 3 years ago

/kind bug

What steps did you take and what happened: We launched our training component through a trial template and got the following error:

File "/root/.cache/pypoetry/virtualenvs/cnn-sZ7XREjn-py3.8/lib/python3.8/site-packages/google/cloud/_http.py", line 483, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/storage/v1/b/brightclue-mlops?projection=noAcl&prettyPrint=false: Caller does not have storage.buckets.get access to the Google Cloud Storage bucket.

What did you expect to happen: For the storage bucket to be accessible by the trial template pods.

Anything else you would like to add: Running the same training component directly from Kubeflow works fine.

johnugeorge commented 3 years ago

Can you provide more details?

How did you provide credentials in the trial template? If you did provide them, is it the case that the pod template doesn't have them?

PatrickGhosn commented 3 years ago

Hey. I managed to resolve it by adding serviceAccount: default-editor and serviceAccountName: default-editor to the pod spec in the trial template.
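
For anyone hitting the same 403, here is a minimal sketch of that fix in Python. It assumes the trial spec is a batch/v1 Job, so the pod spec sits under spec.template.spec. The helper name with_service_account, the container name, the image, and the command are all hypothetical placeholders; only the default-editor account name comes from this thread.

```python
import copy
import json


def with_service_account(job_spec, account):
    """Return a copy of a batch/v1 Job spec whose pods run as `account`.

    Hypothetical helper for illustration; it is not part of Katib.
    """
    patched = copy.deepcopy(job_spec)
    pod_spec = patched["spec"]["template"]["spec"]
    # serviceAccountName is the field Kubernetes reads for the pod identity.
    pod_spec["serviceAccountName"] = account
    # serviceAccount is the deprecated alias; set here only to mirror the thread.
    pod_spec["serviceAccount"] = account
    return patched


# Placeholder trial spec: container name, image, and command are assumptions.
trial_spec = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "training-container",
                        "image": "example.invalid/cnn-trainer:latest",
                        "command": ["python", "train.py"],
                    }
                ],
                "restartPolicy": "Never",
            }
        }
    },
}

patched_spec = with_service_account(trial_spec, "default-editor")

# Kubernetes accepts JSON as well as YAML, so this output can be pasted
# under trialTemplate.trialSpec in the experiment definition.
print(json.dumps(patched_spec, indent=2))
```

Whether default-editor has access to the bucket depends on how the cluster and namespace were set up, so treat the account name as specific to this report rather than a general requirement.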