neuro-inc / platform-docs

Apache License 2.0

Describe usage of platform blobs with jobs #20

Open andriihomiak opened 2 years ago

andriihomiak commented 2 years ago

On using platform blobs with jobs

Via neuro-cli

The most straightforward way to access blobs inside your platform jobs is to pass your own credentials to neuro-cli running in the job. When starting a job, enable the --pass-config option to pass your credentials into the job. neuro commands run there (including neuro blob) will then use the passed credentials and act on behalf of your user. This approach, however, requires Neuro CLI to be installed in the container you are running.

Via Service Accounts and neuro-cli

This option lets you limit the job's access to a subset of resources (e.g. a single platform bucket).

To do that, first create a service account:

neuro service-account create --name <service-account-name>

Output should look like this:

Id               service-account-b7477d2f-c614-41bb-8e2e-REDACTED 
 Name             blobs-guide-sa                                       
 Role             andriikhomiak/service-accounts/blobs-guide-sa        
 Owner            andriikhomiak                                        
 Default cluster  default                                              
 Created at       now                                                  

Full token with cluster and API url embedded (this value can be used as NEURO_PASSED_CONFIG environment variable):

eyJ0b2tlbiI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpJVXpJMU5pSjkuREDACTED9zdGFnaW5nLm5ldS5yby9hcGkvdjEifQ==

Just auth token (this value can be passed to neuro config login-with-token):

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.REDACTED2NvdW50cy9ibG9icy1ndWlkZS1zYSJ9.uf0-PMeqNIiKMnREDACTEDme3aQJy_KqSRk

Save it to some secure place, you will be unable to retrieve it later!

Now create a secret with the value from the full token section of the output above:

neuro secret add <secret-name> <FULL TOKEN HERE>

Next, grant access to your blob to this service account:

neuro acl grant blob:<blob-name-or-id> <your-username>/service-accounts/<service-account-name> write

Note: you may replace write with read or manage

Now run a job and mount the secret from above as the value of the NEURO_PASSED_CONFIG env variable. Using neuro-cli, this would look like:

neuro run <...> -e NEURO_PASSED_CONFIG=secret:<secret-name> <...> <IMAGE> <...>

Assuming you have neuro-cli available in your container, you can now use neuro blob commands to access your blob.
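
If blob access is part of the job's startup script, it may help to fail early when the prerequisites described above are missing. The following is only a minimal sketch and not part of neuro-cli: the function name check_neuro_job_env is hypothetical, and it checks nothing more than that NEURO_PASSED_CONFIG is set and that a neuro executable is on PATH.

```shell
# Hypothetical helper: verify the prerequisites for using neuro blob in a job.
check_neuro_job_env() {
    # The job must have been started with -e NEURO_PASSED_CONFIG=secret:<secret-name>
    if [ -z "${NEURO_PASSED_CONFIG:-}" ]; then
        echo "NEURO_PASSED_CONFIG is not set" >&2
        return 1
    fi
    # neuro-cli must be installed in the image
    if ! command -v neuro >/dev/null 2>&1; then
        echo "neuro-cli is not installed in this image" >&2
        return 1
    fi
    return 0
}
```

An entrypoint could call check_neuro_job_env first and run its neuro blob commands only if the check succeeds.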

Note: it is also possible to import blobs from cloud storage providers.

Via aws-cli

Note: This will only work with S3-compatible providers (e.g. AWS, MinIO)

To access blobs via aws-cli, we need to generate appropriate credentials for the bucket by running:

neuro blob mkcredentials <blob-id-or-name> --name <credentials-name>

To view the created credentials:

neuro blob statcredentials <credentials-name>

The output will look like this:

Id                                       bucket-credentials-REDACTED-2b62-4e8e-b0f4-867cffae19d0                        
 Name                                     blob-for-guide                                                                 
 Read-only:                               False                                                                          
 Credentials for bucket 'blob-for-guide'   Key                Value                                                      
                                           bucket_name        neuro-pl-REDACTED-andriikhomiak-blob-for-gu4568b0666c73  
                                           access_key_id      AKIA3REDACTED7XR                                       
                                           endpoint_url       https://s3.amazonaws.com                                   
                                           region_name        us-east-1                                                  
                                           secret_access_key  jF+by7WREDACTEDbPv2

To access this bucket via aws-cli, set the appropriate env variables:

export AWS_ACCESS_KEY_ID=<access_key_id>
export AWS_ENDPOINT_URL=<endpoint_url>
export AWS_DEFAULT_REGION=<region_name>
export AWS_SECRET_ACCESS_KEY=<secret_access_key>

Note: it is recommended to pass these credentials to the job via secrets.

Then run the necessary commands, specifying the proper bucket_name and endpoint, e.g.:

aws s3 ls <bucket_name> --endpoint-url="$AWS_ENDPOINT_URL"
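
Copying the values into the export commands by hand is error-prone, so the step can be scripted. The sketch below assumes that the credentials are printed one "key  value" pair per line, as in the sample output above; the real output may be wrapped or decorated differently, in which case this will not work as-is. The helper name cred_value is hypothetical, and the sample text holds dummy values standing in for the output of neuro blob statcredentials.

```shell
# Hypothetical helper: print the value for a given key from the credentials table.
# Assumes one "key  value" pair per line; reads the table from stdin.
cred_value() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Dummy values standing in for real command output
sample_output='
 bucket_name        example-bucket
 access_key_id      EXAMPLEKEYID
 endpoint_url       https://s3.amazonaws.com
 region_name        us-east-1
 secret_access_key  exampleSecret
'

AWS_ACCESS_KEY_ID=$(printf '%s\n' "$sample_output" | cred_value access_key_id)
AWS_DEFAULT_REGION=$(printf '%s\n' "$sample_output" | cred_value region_name)
export AWS_ACCESS_KEY_ID AWS_DEFAULT_REGION
```

With real output, the sample_output variable would be replaced by a pipe from the statcredentials command, and the remaining variables would be filled in the same way.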

Azure SAS

Note: this approach only works with the Azure provider

Follow the same steps as before to create credentials for your bucket. This should provide you with the following:

 Id                                   bucket-credentials-7dc3b6a6-8ba9-4907-9532-REDACTED                                                        
 Name                                 blob-guide
 Read-only:                           False
 Credentials for bucket 'blob-guide'   Key               Value
                                        bucket_name       neuro-pl-REDACTED-andriikhomiak-blob-guidec12000615675
                                        sas_token         sv=2021-04-10&si=bkt-user-REDACTEDFvAamrft2WBWZ2Iiedsn4m
                                        storage_endpoint  https://bREDACTED1627f7541b3189.blob.core.windows.net

Tip: you might need to pipe the output of neuro blob statcredentials to something like less to see the entire sas_token.

Now you can use any tool that supports Azure SAS endpoints. Below is an example of copying data from Azure storage via rclone:

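# quote the values: the SAS token contains '&', which the shell would otherwise interpret
export AZURE_BUCKET_NAME='<bucket_name>'
export AZURE_SAS_TOKEN='<sas_token>'
export AZURE_STORAGE_ENDPOINT='<storage_endpoint>'
# double quotes are required here so that the shell expands the variables
rclone -v copyto --azureblob-sas-url "${AZURE_STORAGE_ENDPOINT}?${AZURE_SAS_TOKEN}" \
    ":azureblob:/${AZURE_BUCKET_NAME}" /some/folder/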
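
One detail worth checking in the rclone call is how the SAS URL is assembled: the shell does not expand variables inside single quotes, and a stray trailing "/" on the endpoint or a leading "?" on the token produces a malformed URL. The following is a small sketch of joining the two parts defensively. The function name build_sas_url is hypothetical and the values are dummies.

```shell
# Hypothetical helper: join a storage endpoint and a SAS token into one URL.
# Tolerates one trailing "/" on the endpoint and one leading "?" on the token.
build_sas_url() {
    printf '%s?%s\n' "${1%/}" "${2#[?]}"
}

# Dummy values for illustration only
AZURE_STORAGE_ENDPOINT='https://example.blob.core.windows.net'
AZURE_SAS_TOKEN='sv=2021-04-10&sig=dummy'

SAS_URL=$(build_sas_url "$AZURE_STORAGE_ENDPOINT" "$AZURE_SAS_TOKEN")
echo "$SAS_URL"
```

The resulting SAS_URL can then be passed to --azureblob-sas-url in double quotes.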

Note: using secrets to store credentials is recommended.

Google Cloud Storage

Note: this approach only works with the Google Cloud Storage provider

Follow the same steps as before to create credentials for your bucket. This should provide you with the following:

 Id                                   bucket-credentials-REDACTED-b343-4307-a420-66cd8cd21f60
 Name                                 blob-guide
 Read-only:                           False
 Credentials for bucket 'blob-guide'   Key          Value
                                        bucket_name  neuro-pl-REDACTED-andriikhomiak-blob-guidee64dd6713f66
                                        key_data     ewogICJ0eXBlIjogInNlcnZpY2VfYWNjb3VudCIsCREDACTEDbkZxeDdSVmprN1FhK3gyMjlLa2thanBWYzdEazNWTnYxNVxuMHRIaVVnM1hhV0p6Q0pER2NaZ1pLaFBIOWZpdzVON0pRUjB1VlFGbFd0ODhzdmZTckxyY2ZNY1lnZ2JGSWVGU1xuNnoyR1VqT3BaV2hMT3FmR3dMWXNXbzBnZE1xaHZNbWxMTkp2VXNoUEVxKzdQZTZhYkJSZytkU1hBb0dCQ…
                                        project      hse-project-REDACTED

Now you can use the provided credentials with gsutil like so:

# decode the key into keyfile
echo <key_data> | base64 -d > /path/to/keyfile.json
# use the decoded key
gcloud auth activate-service-account --key-file=/path/to/keyfile.json
# gsutil now uses this key to allow access to bucket
gsutil ls gs://<bucket_name>/
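
Because the key file grants access to the bucket, it is reasonable to restrict its permissions and to check that decoding produced what is expected before activating it. The sketch below is one way to do this, under the assumption that key_data is a base64-encoded service account JSON key, as the sample value above suggests. The function name decode_key_data is hypothetical, and the key used here is a dummy.

```shell
# Hypothetical helper: decode base64 key_data into a key file.
# With umask 077, a newly created key file is readable only by the current user.
decode_key_data() {
    # $1: base64-encoded key_data, $2: destination path
    ( umask 077 && printf '%s' "$1" | base64 -d > "$2" )
}

# Dummy stand-in for the real key_data value
dummy_key_data=$(printf '%s' '{"type": "service_account"}' | base64)
keyfile=$(mktemp)
decode_key_data "$dummy_key_data" "$keyfile"

# A decoded service account key should be JSON containing this type field
grep -q '"type": "service_account"' "$keyfile" && echo "key file looks valid"
```

If the check passes, the file can be given to gcloud auth activate-service-account as shown above.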

andriihomiak commented 2 years ago

Above is a draft of the blob usage guide. We still need to add a GCS section (this requires a running GCS provider, which should be available next week).

andriihomiak commented 2 years ago

Added the GCS section. @jane-gorlova the guide above might need some proofreading, but otherwise it pretty much covers everything related to blob usage in jobs.