romilbhardwaj opened 1 year ago
I will take on this issue
Seems like there are a few categories where we can show the uploading/downloading status:
- `workdir`
- `syncfile_mounts`
- `file_mounts`
- `COPY`
- `modex_to_y()`
- functions under `sky/data/data_transfer.py`
I was thinking of starting with the first category (local to VM) in the first PR and then moving on to the other tasks in later PRs.
What do you think of this division of tasks? Am I missing any items that should be included above?
I ran into this today. The issue is that regular `file_mounts` show a helpful log file, but storage doesn't.
Former:

```
I 03-15 17:32:09 cloud_vm_ray_backend.py:3396] To view detailed progress: tail -n100 -f ~/sky_logs/sky-2023-03-15-17-25-12-029820/file_mounts.log
I 03-15 17:32:09 backend_utils.py:1196] Syncing (to 1 node): /xxx -> ~/yyy
```

I can tail the log file and see what's up.
Latter:

```
I 03-15 17:47:44 storage.py:1358] Created GCS bucket xxx in US-CENTRAL1 with storage class STANDARD
⠏ Syncing /xxx to gs://yyy
```
I think exposing the underlying tool's stdout in such a log file will be a big UX improvement.
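A minimal sketch of what "exposing the underlying tool's stdout in a log file" could look like; the helper name `run_with_log` and the log layout are hypothetical, not SkyPilot's actual code:

```python
import pathlib
import subprocess


def run_with_log(cmd: list, log_path: str) -> int:
    """Run `cmd`, streaming its combined stdout/stderr into `log_path`.

    `cmd` could be, e.g., ['gsutil', '-m', 'rsync', '-r', src, dst].
    Returns the subprocess's exit code.
    """
    log = pathlib.Path(log_path).expanduser()
    log.parent.mkdir(parents=True, exist_ok=True)
    # Point the user at the log file, matching the existing
    # "To view detailed progress" hint for regular file_mounts.
    print(f'To view detailed progress: tail -n100 -f {log}')
    with log.open('w') as f:
        # Record the exact command so users can see what is run.
        f.write(f'Running: {" ".join(cmd)}\n')
        f.flush()
        proc = subprocess.Popen(f'{" ".join(cmd)}', shell=True,
                                stdout=f, stderr=subprocess.STDOUT)
        return proc.wait()
```

The user can then `tail -f` the log to watch the underlying tool's own progress output instead of a bare spinner.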
Bumping up the priority for this: it's important to give visibility into what's happening under the hood. A user said:

> Each time I launch a script, SkyPilot spends a few minutes on this syncing, even though I have not changed the dataset. (See image below.) Is it copying data?
cc @landscapepainter
+1. It'd be helpful if that log file also included the exact command being run, as the user wondered whether it's `gsutil cp` or `gsutil rsync`.
@romilbhardwaj @concretevitamin I'll try to resolve each feature one by one in separate PRs. I'm currently working on displaying a progress bar for files being synced during `work_dir` and non-cloud `file_mount` syncs (local to VM). Mostly done, just need to brush it up a bit.
Thanks! If the progress bar issues (multinode; overriding existing bars; etc.) are not easy to fix, then even having such info (which files are being synced) in the log file would be very helpful from a user's perspective.
Bumping this again: I was uploading a big dir today and it would've been useful to be able to just see the logs of the underlying `gsutil`/`aws s3 sync` command.
I'll go ahead and wrap this up by adding the logs for now.
Bumping this... I'm often stuck at:
```
⠴ Syncing ~/mydata to gs://romil-test-bucket/
```
without any hints as to what is going on. Logs would be really nice to have here.
Note: from an offline discussion with the team, it was concluded that a refactoring is necessary to support logging for the upload. The refactoring involves migrating the sync process of local `file_mounts` to cloud storage into `execution.py/_execute`. This is necessary to share the log path that is set when the `backend` is initialized in `_execute`. It is also preferable to keep the sync process in `_execute`.
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
Paraphrased story from user: