Closed efiop closed 2 years ago
So, there are two separate things:
Moving the auth logic itself (when we create ) from dvc_objects to pydrive2.fs. Most likely it will provide only small benefits to this library's users (e.g. passing credentials by an env var, which is not supported here now). It can simplify the code in dvc_objects, but this is primarily cosmetic. A fair amount of code will stay in DVC anyway: depending on the remote config, it will have to pass different configuration to the fs. Not sure it's worth it.
The actual problem in that ticket is with tmp_dir and making it optional. Either I don't completely understand the suggestion or it won't be possible. We must write these credentials somewhere. gdrivefs is also doing this, I'm pretty sure (or it has a terrible UX by default, asking users to open their browsers every time). This location is already configurable in DVC via gdrive_user_credentials_file. It can even be set globally as a workaround for the dvc get / dvc import workflow. Or the credentials can now be passed by an ENV variable.
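The credential sources mentioned above (an env variable and a configurable file) could be combined roughly as in the sketch below. It is only an illustration: the function name `load_credentials`, the default env variable name, the JSON layout, and the precedence of the env variable over the file are all assumptions, not the actual DVC or pydrive2 behaviour.

```python
import json
import os
from pathlib import Path
from typing import Optional


def load_credentials(
    credentials_file: Optional[str] = None,
    env_var: str = "GDRIVE_CREDENTIALS_DATA",  # assumed name, for illustration
    environ=None,
) -> Optional[dict]:
    """Return cached credentials, or None if the user has to authenticate."""
    environ = os.environ if environ is None else environ

    # 1. Credentials passed inline through an environment variable
    #    (assumed to take precedence over the file).
    raw = environ.get(env_var)
    if raw:
        return json.loads(raw)

    # 2. Credentials cached in a file at the configurable location.
    if credentials_file and Path(credentials_file).is_file():
        return json.loads(Path(credentials_file).read_text())

    # 3. Nothing cached: the caller has to run the browser flow and
    #    write the result somewhere for next time.
    return None
```

Case 3 is the point made above: without some writable location, every call would end up in the browser flow.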
Not sure this ticket is actionable this way. I feel we need to better understand the credentials flow in DVC itself first for dvc get scenarios. E.g. we should agree that we will be writing into appdir, not into repo.tmp_dir, by default. Clearly this will have certain implications (projects will affect each other).
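To make the default discussed above concrete, here is a minimal sketch of picking the credentials directory. The helper name `credentials_dir`, the `app_name` value, and the XDG-style fallback are assumptions for illustration; a real implementation would probably rely on a library for the platform-specific app directory.

```python
import os
from pathlib import Path
from typing import Optional


def credentials_dir(tmp_dir: Optional[str] = None, app_name: str = "pydrive2fs") -> Path:
    """Pick the directory where cached credentials are written.

    tmp_dir is optional: when given (e.g. repo.tmp_dir) it is used as is,
    otherwise we fall back to a per-user directory shared by all projects.
    """
    if tmp_dir:
        return Path(tmp_dir)
    # Assumed fallback: XDG-style cache directory under the user's home.
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / app_name
```

With the fallback, two unrelated projects resolve to the same directory and so share one cached credentials file, which is the "projects will affect each other" implication.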
Right now auth is complex for the fs user (e.g. I really don't want to be forced to generate GoogleAuth myself). Some friendly user-facing config options in __init__ are what we want for simple cases.
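As a rough sketch of what such options could look like: the class `SimpleGDriveFS` and its parameter names below are hypothetical stand-ins, not the real pydrive2.fs API, and `_build_auth` is a placeholder where a real implementation would construct the auth object.

```python
from typing import Any, Optional


class SimpleGDriveFS:
    """Hypothetical stand-in showing a friendlier constructor."""

    def __init__(
        self,
        path: str,
        google_auth: Optional[Any] = None,  # advanced: pre-built auth object
        client_id: Optional[str] = None,
        client_secret: Optional[str] = None,
        credentials_file: Optional[str] = None,
    ):
        if google_auth is not None and (client_id or client_secret):
            raise ValueError("pass either google_auth or client_id/client_secret")
        if bool(client_id) != bool(client_secret):
            raise ValueError("client_id and client_secret must be given together")
        self.path = path
        if google_auth is not None:
            # Advanced users can still hand over their own auth object.
            self.auth = google_auth
        else:
            # Simple case: the fs builds the auth object from plain options.
            self.auth = self._build_auth(client_id, client_secret, credentials_file)

    @staticmethod
    def _build_auth(client_id, client_secret, credentials_file):
        # Placeholder: returns a plain dict instead of a real auth object.
        return {
            "client_id": client_id,
            "client_secret": client_secret,
            "credentials_file": credentials_file,
        }
```

For the simple case a user would then write something like `SimpleGDriveFS("root", client_id="...", client_secret="...")` and never touch the auth object directly.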
gdrivefs is using pydata conventions and stashes a cached creds file there, which is limiting, but at least it is human-friendly. We need to do a similar thing in pydrive2fs, but maybe be a bit more general. I don't feel comfortable using pydata-google-auth directly, as it seems to be a bit dead and poorly maintained (e.g. red CI) and, strictly speaking, pydrive2 has nothing to do with pydata anyway, so it would be weird to depend on it.
We are currently creating auth ourselves in https://github.com/iterative/dvc-objects/blob/main/src/dvc_objects/fs/implementations/gdrive.py , which is very messy and still relies on a convention of keeping the creds files in the tmp_dir provided by dvc. It would be great to simplify the authentication, not require tmp_dir (make it optional), and make it about as easy to use as gdrivefs https://github.com/fsspec/gdrivefs/blob/083996b503194424d11772570077fffdab758377/gdrivefs/core.py#L44