Dispatch and distribute your ML training to "serverless" clusters in Python, like PyTorch for ML infra. Iterable, debuggable, multi-cloud/on-prem, identical across research and production.
Install a local telemetry agent (a receiver process) on each cluster. The agent is responsible for exporting telemetry data (e.g., spans and metrics) to the Runhouse backend collector.
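
The agent-to-collector flow described above can be sketched in plain Python. This is only an illustration of the general pattern (receive records locally, batch them, forward them to a collector), not Runhouse's actual implementation: `TelemetryRecord`, `LocalTelemetryAgent`, `batch_size`, and the `export` callable are all hypothetical names, and the in-memory list stands in for the backend collector.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Hypothetical record type; not Runhouse's actual telemetry schema.
@dataclass
class TelemetryRecord:
    kind: str     # "span" or "metric"
    name: str
    value: float  # duration in seconds for a span, reading for a metric


@dataclass
class LocalTelemetryAgent:
    """Buffers records on the cluster and forwards them in batches."""

    # Stand-in for the call that ships a batch to the backend collector.
    export: Callable[[List[TelemetryRecord]], None]
    batch_size: int = 2
    _buffer: List[TelemetryRecord] = field(default_factory=list)

    def receive(self, record: TelemetryRecord) -> None:
        # Accept one record and export once a full batch has accumulated.
        self._buffer.append(record)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Export whatever is buffered, then start a fresh buffer.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.export(batch)


# An in-memory list plays the role of the backend collector here.
collected: List[List[TelemetryRecord]] = []
agent = LocalTelemetryAgent(export=collected.append)

agent.receive(TelemetryRecord("span", "train_step", 0.42))
agent.receive(TelemetryRecord("metric", "gpu_util", 0.87))
agent.receive(TelemetryRecord("span", "eval_step", 0.10))
agent.flush()  # push the partially filled final batch
```

In a real deployment the `export` callable would send each batch over the network instead of appending it to a list. Batching is one common design choice for keeping per-record overhead low on the cluster.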