dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Plugins that help to pass credentials for S3 and GCS to remote cluster workers #8883

Closed: dbalabka closed this issue 1 month ago

dbalabka commented 1 month ago

I couldn't find a simple way to pass credentials for storage services such as S3 and GCS to remote workers, even though both are widely used to store data frames. In the scope of this ticket, I propose creating plugins that distribute the required keys to remote workers.

GCP credentials

The path to the GCP credentials file is stored in the GOOGLE_APPLICATION_CREDENTIALS environment variable. The plugin has to create the file on each remote worker and set the environment variable there to the new file's path.
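
A minimal sketch of what such a plugin could look like, built on the `WorkerPlugin` base class from `distributed`. The class name `GCPCredentialsPlugin` and its `local_path` parameter are hypothetical, not an existing API, and the fallback to `object` is only there so the sketch can be run without `distributed` installed.

```python
import os
import tempfile

try:
    from distributed import WorkerPlugin
except ImportError:
    # Assumption: fall back to a plain base class when distributed is absent.
    WorkerPlugin = object


class GCPCredentialsPlugin(WorkerPlugin):
    """Hypothetical plugin that copies a GCP credentials file to each worker."""

    def __init__(self, local_path=None):
        # Runs on the client: default to the path the client itself uses.
        local_path = local_path or os.environ["GOOGLE_APPLICATION_CREDENTIALS"]
        # Keep the file contents on the instance so they are serialized
        # together with the plugin when it is sent to the workers.
        with open(local_path, "rb") as f:
            self.payload = f.read()

    def setup(self, worker):
        # Runs on each worker: write the file (mkstemp creates it readable
        # and writable by the owner only) and point the env variable at it.
        fd, path = tempfile.mkstemp(prefix="gcp-credentials-", suffix=".json")
        with os.fdopen(fd, "wb") as f:
            f.write(self.payload)
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
```

On a recent `distributed` release this would presumably be registered with `client.register_plugin(GCPCredentialsPlugin())`. Note that the key material travels over the cluster's communication channel, so this approach likely only makes sense when that channel is encrypted.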

S3 credentials

As with GCP, the plugin must copy the credential files and store them on each worker.
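
A comparable sketch for S3, again only an illustration under stated assumptions. Instead of overwriting `~/.aws/credentials` on the worker, it writes a separate file and sets `AWS_SHARED_CREDENTIALS_FILE`, the environment variable the AWS SDKs read to locate the shared credentials file. The class name `S3CredentialsPlugin` and its `local_path` parameter are hypothetical.

```python
import os
import tempfile
from pathlib import Path

try:
    from distributed import WorkerPlugin
except ImportError:
    # Assumption: fall back to a plain base class when distributed is absent.
    WorkerPlugin = object


class S3CredentialsPlugin(WorkerPlugin):
    """Hypothetical plugin that copies an AWS credentials file to each worker."""

    def __init__(self, local_path=None):
        # Runs on the client: use the explicit path, else the env variable,
        # else the conventional ~/.aws/credentials location.
        default = os.environ.get(
            "AWS_SHARED_CREDENTIALS_FILE", Path.home() / ".aws" / "credentials"
        )
        self.payload = Path(local_path or default).read_bytes()

    def setup(self, worker):
        # Runs on each worker: store the file privately and tell the AWS SDK
        # where to find it.
        fd, path = tempfile.mkstemp(prefix="aws-credentials-")
        with os.fdopen(fd, "wb") as f:
            f.write(self.payload)
        os.environ["AWS_SHARED_CREDENTIALS_FILE"] = path
```

Writing to a temporary file rather than the worker's home directory is a design choice of this sketch: it avoids clobbering credentials that may already exist on the worker.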

dbalabka commented 1 month ago

Moved to https://github.com/dask/dask-cloudprovider/issues/438