ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
32.68k stars 5.54k forks source link

[air] Artifact syncing doesn't work for ddp workers #34475

Open xwjiang2010 opened 1 year ago

xwjiang2010 commented 1 year ago

See https://discuss.ray.io/t/ray-tune-sync-config-not-syncing-logs-from-worker-node/10237

req:

cc @matthewdeng Does this make sense as part of checkpoint project? Seems like we are still scoping. Feel free to drop the tag if not relevant.

anyscalesam commented 1 month ago

@justinvyu can we close this?