Closed michaelitvin closed 3 years ago
Discord context: https://discordapp.com/channels/485586884165107732/485596304961962003/659335075422535691
Need to take a look at gsutil, for some reason it doesn't require gcloud beta auth application-default login to work. Ideally we should behave the same.
Is there an official way to have DVC use a service account set up for it? Apologies - I'm new to both DVC and gcloud nuances.
I've set up a service account, have the local (private) key file on disk, and can run gcloud auth activate-service-account with my account name and key file, and verify my service account is listed in gcloud auth list.
How do I get DVC to respect / use that?
Maybe gcloud config set account <service account here> is the key?
Sorry to spam this thread.
I've verified that my service account is active via gcloud auth list and I see an * next to my service account name.
Credentialed Accounts
ACTIVE ACCOUNT
* dvc-service-account@xxx.iam.gserviceaccount.com
vade@xxx
Running dvc push gets me:
dvc push
0% Querying cache in gs://cinemanet-dataset/DVC| |0.00/70.6k [00:00<?, ?file/s]
/usr/local/lib/dvc/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
/usr/local/lib/dvc/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
/usr/local/lib/dvc/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
I think this resolved it for me
export GOOGLE_APPLICATION_CREDENTIALS="/Path/to/my/keyfile.json"
where this is the JSON key file generated for the Google service account.
Hopefully this monologue is helpful to someone!
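A minimal sketch of the environment-variable approach above, with a guard against a mistyped path. The function name use_gcp_key is hypothetical (it is not part of DVC or gcloud), and the key file path is a placeholder.

```shell
# Hypothetical helper (not part of DVC or gcloud): export the variable only
# if the key file is actually readable, so a typo in the path fails early.
use_gcp_key() {
    key_file="$1"
    if [ ! -r "$key_file" ]; then
        echo "key file not readable: $key_file" >&2
        return 1
    fi
    export GOOGLE_APPLICATION_CREDENTIALS="$key_file"
}

# Usage (placeholder path):
#   use_gcp_key "/Path/to/my/keyfile.json" && dvc push
```

The variable only lives in the current shell session, so it has to be set again (or added to a shell profile) in every new terminal.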
@vade thanks! That would be great to add to the docs. Let me know if you'd like to make a PR for that - I can help with that.
I'm down to help with that. Docs are so key for a project's success and making users' lives easier. Let me confirm with a colleague that this solution is working (they have a touch more experience than I do with DVC). If you don't hear from me in a day or two please reply - it's not you, it's me! 😂🤣
(@vade's colleague here) I can confirm this solution works.
Thanks, guys! It would be great to edit this file https://github.com/iterative/dvc.org/blob/master/public/static/docs/command-reference/remote/modify.md, the "Click for Google Cloud Storage" section.
@vade thanks! Just a minor question in the PR for us to better understand the change. If you can share more info that would help. And a minor typo.
Totally, I responded with more info and a question of my own in the PR. LMK!
@vade I tried to "play" with this a little bit more ... could you please try to use
dvc remote modify storage credentialpath /Path/to/my/keyfile.json
for the service account, where keyfile.json is the credentials file for the service account that has proper access?
That alone worked for me. Though the env variable should be fine also.
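The credentialpath suggestion above could be wrapped as follows. This is only a sketch: the function name set_dvc_gs_credentials is hypothetical, the remote name storage and the key path come from the comment above, and it assumes the remote already exists in the repository. The --local flag of dvc remote modify writes the setting to .dvc/config.local instead of .dvc/config, which keeps a machine-specific path out of the shared config.

```shell
# Hypothetical wrapper: point a DVC remote at a service-account key,
# refusing to write a path that is not readable.
set_dvc_gs_credentials() {
    remote="$1"
    key_file="$2"
    if [ ! -r "$key_file" ]; then
        echo "key file not readable: $key_file" >&2
        return 1
    fi
    # --local stores the setting in .dvc/config.local instead of .dvc/config
    dvc remote modify --local "$remote" credentialpath "$key_file"
}

# Usage (placeholder path, run inside a DVC repository):
#   set_dvc_gs_credentials storage /Path/to/my/keyfile.json
```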
I'll review the docs PR and probably simplify a bit (to put links to the Google official auth docs mostly instead of replicating auth flow on our end).
@michaelitvin btw, do you remember how gsutil was installed? As part of the SDK or with pip? Were you running it on your local machine or in the cloud?
Hey - gsutil is, weirdly, installed with the SDK. It typically runs locally on the machine that is a DVC client (i.e., pushing or pulling to the GCloud remote).
@vade did dvc remote modify storage credentialpath /Path/to/my/keyfile.json work for you, btw? Have you had time to try?
Hi! Another user reported a similar issue recently (DVC 1.6ish), with some additional problems and suggestions to improve the UI and/or docs. See https://discord.com/channels/485586884165107732/485596304961962003/752939146430906419
Quick summary:
if I remove credentialpath from the config, successfully run gcloud init, and run dvc push, I'm back at "Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS" (but that requires a service account setup). My intent when I started was to use GS + DVC with my user account credentials, not a service account.
BTW, existing documentation about this is spread in https://dvc.org/doc/user-guide/setup-google-drive-remote#using-service-accounts and https://dvc.org/doc/command-reference/remote/add#supported-storage-types and https://dvc.org/doc/command-reference/remote/modify#available-parameters-per-storage-type mainly.
And an additional question from the same user:
succeeded with a service account approach, but only after I set the role for it to "Owner". Feels like a sledgehammer approach. What are the minimum required role(s) for a GS bucket service account?
(Should this be a separate question issue?) Cc @mvshmakov maybe remembers 🙂
My 2cts: For someone used to S3 remotes, the GCP track is much more painful. It seems like using a GCP bucket is very different than one on S3.
I had the error below when I tried to push to a newly created GS remote, while regular gsutil commands worked fine. gcloud auth login didn't fix the problem, but gcloud beta auth application-default login did.
Huge thanks to @pmrowla for helping me figure this out!
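A likely explanation for the difference reported above: Google client libraries look up Application Default Credentials, which are stored separately from the login gsutil uses. A rough sketch of that lookup order, under stated assumptions: the function name adc_source is hypothetical, the file location is the gcloud default on Linux/macOS (or under CLOUDSDK_CONFIG if that is set), and the final fallback to the metadata server on Google Cloud machines is left out.

```shell
# Hypothetical helper: report which Application Default Credentials source
# a Google client library would likely find first on this machine.
adc_source() {
    # 1. An explicit key file named by the environment variable comes first.
    if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "env:${GOOGLE_APPLICATION_CREDENTIALS}"
        return 0
    fi
    # 2. Otherwise the file written by `gcloud auth application-default login`.
    adc_file="${CLOUDSDK_CONFIG:-$HOME/.config/gcloud}/application_default_credentials.json"
    if [ -f "$adc_file" ]; then
        echo "file:${adc_file}"
        return 0
    fi
    # Neither source is present on this machine.
    echo "none"
    return 1
}

# Usage:
#   adc_source || echo "run: gcloud auth application-default login"
```

Plain gcloud auth login does not create that file, which would match the behavior described in this comment.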
And an additional question from the same user:
succeeded with a service account approach, but only after I set the role for it to "Owner". Feels like a sledgehammer approach. What are the minimum required role(s) for a GS bucket service account?
(Should this be a separate question issue?) Cc @mvshmakov maybe remembers 🙂
Feels like @Suor can help with that. Sorry for such a late response, I've missed the notification.
I think this might be the normal behavior. If you'd like the auth you have done to be the default, you could use gcloud auth login --update-adc.
With #5500, you are also able to supply your personal login info via credentialpath. Like
dvc remote modify origin credentialpath ~/.config/gcloud/legacy_credentials/{your google account}/adc.json
Considering #5500 is merged now, the default behavior is to use the default credentials, which can be set using gcloud auth login --update-adc or by specifying the credentialpath.
Trying to run dvc pull with a Google Cloud remote, got this error message:
gsutil ls gs://my_bucket worked fine.
Running gcloud auth login didn't help, but gcloud beta auth application-default login solved the problem.
DVC was installed using sudo pip install dvc[all].