skyplane-project / skyplane

🔥 Blazing fast bulk data transfers between any cloud 🔥
https://skyplane.org
Apache License 2.0
1.08k stars 62 forks source link

Azure Blob Storage to GCP Bucket transfer errors #628

Closed ajayra04 closed 1 year ago

ajayra04 commented 2 years ago

HI Team,

I am trying to test the migration connection from azure storage to gcp, In production we need to transfer more than 55TB of data.

We found skyplane to migrate the same super easy but while executing the commands, we are getting below errors.

~$ skyplane cp -r azure://testskyplaneazure/containazure gs://oradaostesr
 _____ _   ____   _______ _       ___   _   _  _____ 
/  ___| | / /\ \ / / ___ \ |     / _ \ | \ | ||  ___|
\ `--.| |/ /  \ V /| |_/ / |    / /_\ \|  \| || |__  
 `--. \    \   \ / |  __/| |    |  _  || . ` ||  __| 
/\__/ / |\  \  | | | |   | |____| | | || |\  || |___ 
\____/\_| \_/  \_/ \_|   \_____/\_| |_/\_| \_/\____/

02:05:24 [ERROR] Specified object does not exist.

Traceback (most recent call last):
  File "/home/ajay_rana/.local/lib/python3.9/site-packages/skyplane/cli/cli.py", line 
182, in cp
    transfer_pairs = generate_full_transferobjlist(
  File 
"/home/ajay_rana/.local/lib/python3.9/site-packages/skyplane/cli/cli_impl/cp_replicate.p
y", line 172, in generate_full_transferobjlist
    raise exceptions.MissingObjectException(f"No objects were found in the specified 
prefix {source_prefix} in {source_bucket}")
skyplane.exceptions.MissingObjectException: No objects were found in the specified 
prefix testskyplaneazure/containazure in testskyplaneazure/containazure

❌ MissingObjectException: No objects were found in the specified prefix 
testskyplaneazure/containazure in testskyplaneazure/containazure
Please ensure that the object exists and is accessible.
Screenshot 2022-10-21 at 7 39 32 AM Screenshot 2022-10-21 at 7 35 58 AM
parasj commented 2 years ago

@ajayra04 I'm investigating this now. Thank you for trying our project! This appears to be a permissions error where Skyplane can't access the container. Skyplane attempts to access your data with a User Managed Identity which by default has delegated permissions to read, write and query the storage accounts in the subscription you chose when you configured Skyplane.

During skyplane init --reinit-azure, we ask you which subscription you would like to configure:

Screen Shot 2022-10-21 at 10 33 53 AM

See the az role assignment create --role Contributor --assignee-object-id ... --assignee-principal-type ServicePrincipal --subscription ... line. The assignee-object-id is the client ID for the user managed identity skyplane_umi and the subscription is the subscription you would like to enable support for.

To try to fix this, ensure that the storage account is within the subscription that you authorized Skyplane to run in. You can reconfigure it using skyplane init --reinit-azure. Alternatively, we can manually grant skyplane_umi permissions to your storage account. Let me know if you'd like to set that up (where the Skyplane VMs run in one subscription but read/write to another subscription).

parasj commented 2 years ago

Also, @ajayra04 if you join our Slack (https://join.slack.com/t/skyplaneworkspace/shared_invite/zt-1cxmedcuc-GwIXLGyHTyOYELq7KoOl6Q) I can help debug with lower latency over DM.

parasj commented 2 years ago

@ajayra04 I just merged https://github.com/skyplane-project/skyplane/pull/632 which will support selecting multipel Azure subscriptions during skyplane init. I'd like to try it to see if it fixes the permission error.

Please install from source:

$ pip uninstall skyplane
$ git clone git@github.com:skyplane-project/skyplane.git
$ pip install -e ".[aws,azure,gcp]"

After this, please reinitialize Azure support:

$ skyplane init --reinit-azure

You will select the Azure subscription which you'd like to launch VMs in, and then you will select all subscriptions that Skyplane should have permissions to read + write to. Please ensure that the subscription that contains your storage account testskyplaneazure is selected.

ajayra04 commented 2 years ago

@parasj I am following all the suggested steps, still not working, Now I am getting permission authorisation error once again, by default provided the skyplane azure enteprise app all the permission required to execute this.

Is that possible can we connect on Meet/Teams call on this issues.

parasj commented 2 years ago

@ajayra04 Please send me an email at paras_jain [at] berkeley.edu and we can set up a call. I'm fairly flexible on timing.

sarahwooders commented 1 year ago

Closing since stale