skyplane-project / skyplane

🔥 Blazing fast bulk data transfers between any cloud 🔥
https://skyplane.org
Apache License 2.0
999 stars 58 forks source link

Skyplane does not perform inter-cloud small file transfers #916

Open josephcolton opened 1 year ago

josephcolton commented 1 year ago

Describe the bug When I try to copy a small file (<=1GB) between two buckets in different clouds, it appears to be working and even asks for confirmation before a copy, but once I confirm the copy, it claims the Transfer is starting and ends the program.

To Reproduce Steps to reproduce the behavior (please include the full Skyplane command you ran):

  1. Copy a small file to a cloud bucket in AWS
  2. Run a Skyplane command to copy that file to a cloud bucket at GCP
  3. Wait until it ends
  4. See nothing happened

Expected behavior I expected to see the file copying and the time results. I also expected to see the file in the destination location.

Screenshots joseph@sonnet ~ $ skyplane cp s3://aws.x.ap-southeast-2/zero-1gb.data gs://gcp_x_northamerica-northeast1/zero-1gb.data


/ | | / /\ \ / / \ | / _ \ | \ | || | \ `--.| |/ / \ V /| |/ / | / /\ | | || | `--. \ \ \ / | /| | | || . ` || | /__/ / |\ \ | | | | | || | | || |\ || |_ __/_| _/ _/ _| _/_| |/_| \/____/

Logging to: /tmp/skyplane/transfer_logs/20230822_235228-9e0b5004/client.log Using Skyplane version 0.3.2 Will transfer objects from aws:ap-southeast-2 to gcp:northamerica-northeast1-a VMs to provision: 1x aws:ap-southeast-2, 1x gcp:northamerica-northeast1-a Estimated egress cost: $0.11/GB s3://aws.x.ap-southeast-2/zero-1gb.data => gs://gcp_x_northamerica-northeast1/zero-1gb.data (1.00GB) Continue? [Y/n]: y Transfer starting (Tip: Enable auto-confirmation with skyplane config set autoconfirm true) joseph@sonnet ~ $

Transfer client log joseph@sonnet ~ $ cat /tmp/skyplane/transfer_logs/20230822_235228-9e0b5004/client.log 23:52:28 [DEBUG] Using pipeline: <skyplane.api.pipeline.Pipeline object at 0x7f5ae1bd5270> 23:52:28 [DEBUG] [SkyplaneClient] Queued copy job CopyJob() 23:52:36 [DEBUG] Querying objects in s3://aws.x.ap-southeast-2 23:52:56 [DEBUG] User confirmed transfer 23:52:59 [DEBUG] Querying objects in s3://aws.x.ap-southeast-2

Environment info (please complete the following information):

Additional context My guess is that the replicate_small_cp_cmd method does not handle inter-cloud copy operations, only intra-cloud copy operations.