add local telemetry agent for clusters

run-house / runhouse

Dispatch and distribute your ML training to "serverless" clusters in Python, like PyTorch for ML infra. Iterable, debuggable, multi-cloud/on-prem, identical across research and production.

https://run.house

Apache License 2.0

977 stars 37 forks source link

add local telemetry agent for clusters #1210

Open jlewitt1 opened 2 months ago

jlewitt1 commented 2 months ago

Install a local telemetry agent receiver process on clusters, which is responsible for exporting various telemetry data (ex: spans, metrics) to the Runhouse backend collector

jlewitt1 commented 2 months ago

#1265
#1223
#1210 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.