Open andriihomiak opened 2 years ago
Above is a draft of the blob usage guide. We still need to add a GCS section (this requires a running GCS provider, which should be available next week).
Added the GCS section. @jane-gorlova, the guide above might need some proofreading, but otherwise it covers pretty much everything related to blob usage in jobs.
On using platform blobs with jobs
Via `neuro-cli`
This is the most straightforward way to access blobs inside your platform jobs: pass your own credentials to `neuro-cli` inside the job. When starting a job, enable the `--pass-config` option; this passes your credentials into the job. Any `neuro` commands run there (including `neuro blob`) will use the passed credentials to act on behalf of your user. This approach, however, requires Neuro CLI to be installed inside the container you are running.
Via Service Accounts and `neuro-cli`
This option lets you limit the job's access to a subset of resources (e.g. a single platform bucket).
To do that, first create a service account:
The output should look like this:
Now create a secret whose value is the full token section from the output above:
Next, grant access to your blob to this service account:
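The grant step can also be scripted. Below is a minimal Python sketch that only builds the command line. It assumes the `neuro acl grant URI USER PERMISSION` form and a `blob:` URI scheme for buckets; the bucket name and the service account role are hypothetical placeholders, so check `neuro acl grant --help` for the exact syntax in your CLI version.

```python
import shlex

# Permission levels assumed to be accepted by `neuro acl grant`.
ALLOWED_PERMISSIONS = ("read", "write", "manage")


def build_grant_command(bucket: str, account: str, permission: str = "read") -> list[str]:
    """Build the argv that grants a service account access to a single bucket."""
    if permission not in ALLOWED_PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    # Assumption: buckets are addressed with the `blob:` URI scheme.
    return ["neuro", "acl", "grant", f"blob:{bucket}", account, permission]


# Hypothetical names; substitute your own bucket and service account role.
cmd = build_grant_command("my-bucket", "my-user/service-accounts/my-sa", "read")
print(shlex.join(cmd))
```

Granting only `read` on a single bucket keeps the job's access as narrow as possible; pass the resulting list to `subprocess.run(cmd, check=True)` to actually execute it.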
Now run a job and mount the secret from above as the value of the `NEURO_PASSED_CONFIG` environment variable. Using `neuro-cli`, this would look like:
Assuming you have `neuro-cli` available in your container, you can use `neuro blob` commands to access your blob.
Via `aws-cli`
To access blobs via `aws-cli`, we need to generate appropriate credentials for the bucket by running:
To view the created credentials:
The output will look like this:
To access this bucket via `aws-cli`, set the appropriate environment variables:
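As an illustration, the mapping from bucket credentials to environment variables might be sketched in Python as follows. The variable names `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION` are the standard ones read by `aws-cli`; the credential field names (`access_key_id`, `secret_access_key`, `region_name`) are assumptions and should be matched against the actual credentials output.

```python
import os


def aws_env_from_credentials(creds: dict[str, str]) -> dict[str, str]:
    """Map bucket credentials onto the environment variables aws-cli reads."""
    env = {
        # Assumed field names; adjust them to the real credentials output.
        "AWS_ACCESS_KEY_ID": creds["access_key_id"],
        "AWS_SECRET_ACCESS_KEY": creds["secret_access_key"],
    }
    # Export the region only when the credentials actually carry one.
    if creds.get("region_name"):
        env["AWS_DEFAULT_REGION"] = creds["region_name"]
    return env


# Placeholder values for illustration only.
example_creds = {
    "access_key_id": "EXAMPLEKEYID",
    "secret_access_key": "example-secret",
    "region_name": "us-east-1",
}
# Merge with the current environment so PATH and friends stay available.
job_env = {**os.environ, **aws_env_from_credentials(example_creds)}
```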
Then run the necessary commands by specifying a proper bucket_name and endpoint, e.g.:
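A minimal sketch of such a command, built in Python, is shown below. It uses the standard `aws s3 ls` subcommand with the `--endpoint-url` option; the bucket name and endpoint are hypothetical placeholders that should come from the credentials output.

```python
from urllib.parse import urlparse


def build_s3_ls_command(bucket_name: str, endpoint_url: str, prefix: str = "") -> list[str]:
    """Build an `aws s3 ls` argv that targets a custom S3-compatible endpoint."""
    parsed = urlparse(endpoint_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"endpoint must be a full http(s) URL: {endpoint_url!r}")
    # --endpoint-url points aws-cli at the platform endpoint instead of AWS itself.
    return ["aws", "s3", "ls", f"s3://{bucket_name}/{prefix}", "--endpoint-url", endpoint_url]


# Hypothetical bucket and endpoint; take the real values from your credentials.
cmd = build_s3_ls_command("my-bucket", "https://blob.example.com")
```

Running the list with `subprocess.run(cmd, check=True)` requires the credential environment variables to be set in the calling process.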
Azure SAS
Follow the same steps as before to create credentials for your bucket. This should provide you with the following:
Now you can use any tool that supports Azure SAS endpoints. For example, data can be copied from Azure storage with `rclone`.
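A minimal Python sketch of one way to do this without an `rclone` config file is given below. It relies on rclone's `RCLONE_CONFIG_<REMOTE>_<OPTION>` environment variables and the `azureblob` backend's `sas_url` option. It assumes the credentials contain a SAS URL; the remote name, container, and paths are hypothetical.

```python
import re


def rclone_azure_sas_env(sas_url: str, remote: str = "neuroazure") -> dict[str, str]:
    """Describe an azureblob remote for rclone purely through environment variables."""
    # Remote names end up inside variable names, so keep them to safe characters.
    if not re.fullmatch(r"[A-Za-z0-9_]+", remote):
        raise ValueError(f"unsupported remote name: {remote!r}")
    prefix = f"RCLONE_CONFIG_{remote.upper()}_"
    return {prefix + "TYPE": "azureblob", prefix + "SAS_URL": sas_url}


def build_rclone_copy(remote: str, container: str, path: str, destination: str) -> list[str]:
    """Build an `rclone copy` argv from the blob container to a local directory."""
    return ["rclone", "copy", f"{remote}:{container}/{path}", destination]


# Hypothetical values; the SAS URL should come from your bucket credentials.
env = rclone_azure_sas_env("https://account.blob.core.windows.net/container?sig=EXAMPLE")
cmd = build_rclone_copy("neuroazure", "container", "data", "/tmp/data")
```

Keeping the SAS URL in the environment rather than on the command line avoids exposing it in process listings.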
Google Cloud Storage
Follow the same steps as before to create credentials for your bucket. This should provide you with the following:
Now you can use the provided credentials with gsutil like so:
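One possible way to wire this up is sketched below in Python. It assumes the credentials include a service account key in JSON form (the exact field name and encoding in the credentials output may differ) and passes it to `gsutil` through the `Credentials:gs_service_key_file` boto option via the top-level `-o` flag. The bucket name and file name are hypothetical.

```python
import json
import os
import tempfile


def write_key_file(key_json: str, directory: str) -> str:
    """Validate the service account key and store it readable by the owner only."""
    json.loads(key_json)  # fail early if the key is not valid JSON
    path = os.path.join(directory, "gcs-key.json")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as key_file:
        key_file.write(key_json)
    return path


def build_gsutil_ls(bucket_name: str, key_path: str) -> list[str]:
    """Build a `gsutil ls` argv that authenticates with the given key file."""
    return ["gsutil", "-o", f"Credentials:gs_service_key_file={key_path}", "ls", f"gs://{bucket_name}"]


# Placeholder key and bucket for illustration only.
with tempfile.TemporaryDirectory() as workdir:
    key_path = write_key_file('{"type": "service_account"}', workdir)
    cmd = build_gsutil_ls("my-bucket", key_path)
```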