Closed parasj closed 2 years ago
@abiswal2001 I was working on setting up Skyplane with @romilbhardwaj and we found a few places where recursive transfers were confusing to him. We match the semantics of aws s3 cp exactly at the moment (e.g. non-recursive transfers will only copy one object, recursive transfers require a trailing slash on the source path). We found some edge cases listed above, so we may want to add better documentation or better error messages.
One example of an error with recursive transfers:
$ skyplane sync --recursive s3://romil-sky-test s3://romil-dataset
_____ _ ____ _______ _ ___ _ _ _____
/ ___| | / /\ \ / / ___ \ | / _ \ | \ | || ___|
\ `--.| |/ / \ V /| |_/ / | / /_\ \| \| || |__
`--. \ \ \ / | __/| | | _ || . ` || __|
/\__/ / |\ \ | | | | | |____| | | || |\ || |___
\____/\_| \_/ \_/ \_| \_____/\_| |_/\_| \_/\____/
Traceback (most recent call last):
  File "/mnt/d/wsl/anaconda3/lib/python3.7/site-packages/skyplane/cli/cli.py", line 250, in sync
    src_region, bucket_src, path_src, dst_region, bucket_dst, path_dst, recursive=recursive
  File "/mnt/d/wsl/anaconda3/lib/python3.7/site-packages/skyplane/cli/cli_impl/cp_replicate.py", line 142, in generate_full_transferobjlist
    dest_key = map_object_key_prefix(source_prefix, source_obj.key, dest_prefix, recursive=recursive)
  File "/mnt/d/wsl/anaconda3/lib/python3.7/site-packages/skyplane/cli/cli_impl/cp_replicate.py", line 106, in map_object_key_prefix
    raise exceptions.MissingObjectException(f"Source key {source_key} does not start with source prefix {source_prefix}")
skyplane.exceptions.MissingObjectException: Source key sky.prof does not start with source prefix /
❌ MissingObjectException: Source key sky.prof does not start with source prefix /
Please ensure that the object exists and is accessible.
In this case, we can provide a better error message that directly suggests running the command with a trailing slash on the source: skyplane sync --recursive s3://romil-sky-test/ s3://romil-dataset
Alternatively, can we assume a trailing slash when --recursive is passed?
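Both options could be sketched roughly as below. This is only an illustration of the idea, not Skyplane's actual code: the helper names ensure_trailing_slash and trailing_slash_hint are hypothetical, and the sketch works on the path string as the user typed it rather than on the parsed bucket and prefix.

```python
def ensure_trailing_slash(path: str, recursive: bool) -> str:
    """Hypothetical helper: treat a recursive source path as a directory-like prefix."""
    if recursive and not path.endswith("/"):
        return path + "/"
    return path


def trailing_slash_hint(src: str, dst: str) -> str:
    """Hypothetical helper: build a suggestion to append to the error message."""
    fixed_src = ensure_trailing_slash(src, recursive=True)
    return f"Did you mean: skyplane sync --recursive {fixed_src} {dst}"


# Option 1: suggest the corrected command in the error message.
print(trailing_slash_hint("s3://romil-sky-test", "s3://romil-dataset"))

# Option 2: silently assume the slash for recursive transfers.
print(ensure_trailing_slash("s3://romil-sky-test", recursive=True))
```

Assuming the slash is the less surprising behavior for the command above, but it is a departure from matching aws s3 cp exactly, so the hint may be the safer first step.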
One place this is perhaps confusing is how AWS s3 cp treats prefixes versus paths. From the AWS s3 cp documentation:
Recursively copying S3 objects to a local directory
When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified prefix and bucket to a specified directory. In this example, the bucket mybucket has the objects test1.txt and test2.txt:
aws s3 cp s3://mybucket . --recursive
Output:
download: s3://mybucket/test1.txt to test1.txt
download: s3://mybucket/test2.txt to test2.txt
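To make the mapping in that example concrete, here is a minimal sketch of how a recursive copy could turn object keys into destination paths. The function name map_key_under_prefix is hypothetical and this is not the logic of Skyplane's map_object_key_prefix; treating a source prefix of "/" the same as the bucket root is an assumption made here for illustration.

```python
import posixpath


def map_key_under_prefix(source_prefix: str, key: str, dest_dir: str) -> str:
    """Hypothetical sketch: map an object key to a destination path.

    Assumption: a leading slash on the source prefix is ignored, so both
    "" and "/" mean the bucket root.
    """
    prefix = source_prefix.lstrip("/")
    if not key.startswith(prefix):
        raise ValueError(f"Source key {key} does not start with source prefix {prefix}")
    # Keep only the part of the key below the prefix.
    relative = key[len(prefix):].lstrip("/")
    return posixpath.normpath(posixpath.join(dest_dir, relative))


# Mirrors the documentation example: bucket root copied to the current directory.
for key in ["test1.txt", "test2.txt"]:
    print(f"download: s3://mybucket/{key} to {map_key_under_prefix('', key, '.')}")
```

Under that assumption, a key such as sky.prof with source prefix "/" would map cleanly instead of failing the prefix check.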
The UX of recursive transfers is confusing: