Open d4l3k opened 2 years ago
As an alternative, I've been seeing people in open source leverage Rich to create a dashboard inside of a terminal https://www.willmcgugan.com/blog/tech/post/building-rich-terminal-dashboards/
So maybe you can reduce the burden of creating and maintaining a dashboard by just having torchx list/logs return rich text, which would be ideal for people doing cloud deployments as well.
I've used it a bunch in side projects so let me know if you have any questions https://github.com/Textualize/rich. I'm pretty sure we can hack together a prototype in 1-2 days
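For illustration, here is a minimal sketch of what a Rich-rendered job listing could look like. The JOBS records, the STATUS_STYLES mapping and the build_job_table helper are all hypothetical placeholders, not part of TorchX; a real version would pull the job list from the TorchX runner instead.

```python
from rich.console import Console
from rich.table import Table
from rich.text import Text

# Hypothetical job records; a real command would fetch these from TorchX.
JOBS = [
    {"scheduler": "slurm", "jobid": "1234", "status": "RUNNING"},
    {"scheduler": "aws_batch", "jobid": "train-abc", "status": "FAILED"},
]

# Assumed status-to-color mapping, purely for illustration.
STATUS_STYLES = {"RUNNING": "yellow", "SUCCEEDED": "green", "FAILED": "red"}


def build_job_table(jobs):
    """Build a Rich table with one row per job."""
    table = Table(title="Jobs")
    table.add_column("Scheduler")
    table.add_column("Job ID")
    table.add_column("Status")
    for job in jobs:
        style = STATUS_STYLES.get(job["status"], "white")
        # Text avoids interpreting the status string as console markup.
        table.add_row(job["scheduler"], job["jobid"], Text(job["status"], style=style))
    return table


# Render the table to the current terminal.
Console().print(build_job_table(JOBS))
```

Since Rich falls back to plain text when the output is not a terminal, the same table should still be readable when piped to a file or a CI log.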
Description
Add a new torchx dashboard command that will launch a local HTTP server that allows users to view all of their jobs with statuses, logs and integration with any ML specific extras such as artifacts, Tensorboard, etc.
Motivation/Background
Currently the only interfaces to TorchX are the programmatic API and the CLI. It would also be nice to have a UI dashboard that could be used to monitor all of your jobs as well as support deeper integrations such as experiment tracking and metrics.
Right now if users want a UI they have to use their platform specific one (e.g. AWS Batch or the Ray dashboard), and many schedulers don't have one (Slurm/Volcano).
Detailed Proposal
This would be a fairly simple interface built on top of something such as Flask (https://flask.palletsprojects.com/en/2.1.x/quickstart/).
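A minimal sketch of what such a Flask app could look like, with one route per page in this proposal. The JOBS dict and the get_job helper are hypothetical stand-ins for real scheduler data, and the JSON/plain-text responses are placeholders for actual HTML templates; nothing here is existing TorchX code.

```python
from flask import Flask, abort, jsonify
from markupsafe import escape

app = Flask(__name__)

# Hypothetical in-memory job store; a real dashboard would query the
# TorchX schedulers instead.
JOBS = {
    ("local", "job-1"): {
        "status": "RUNNING",
        "logs": ["epoch 0 started", "epoch 0 done"],
        "external": {"tensorboard": "http://localhost:6006"},
    },
}


def get_job(scheduler, jobid):
    """Look up a job or respond with 404 if it is unknown."""
    job = JOBS.get((scheduler, jobid))
    if job is None:
        abort(404)
    return job


@app.route("/")
def index():
    # Main page: list every known job with its status.
    return jsonify([
        {"scheduler": s, "jobid": j, "status": job["status"]}
        for (s, j), job in JOBS.items()
    ])


@app.route("/<scheduler>/<jobid>")
def overview(scheduler, jobid):
    # Job overview: status plus the keys of any logged external URLs.
    job = get_job(scheduler, jobid)
    return jsonify({
        "scheduler": scheduler,
        "jobid": jobid,
        "status": job["status"],
        "external": sorted(job["external"]),
    })


@app.route("/<scheduler>/<jobid>/logs")
def logs(scheduler, jobid):
    # Logs page: return the log lines as plain text.
    body = "\n".join(get_job(scheduler, jobid)["logs"])
    return body, 200, {"Content-Type": "text/plain; charset=utf-8"}


@app.route("/<scheduler>/<jobid>/external/<key>")
def external(scheduler, jobid, key):
    # External page: embed the logged URL (e.g. Tensorboard) in an iframe.
    url = get_job(scheduler, jobid)["external"].get(key)
    if url is None:
        abort(404)
    return f'<iframe src="{escape(url)}" width="100%" height="800"></iframe>'
```

Assuming the file is saved as dashboard.py, it can be served locally with `flask --app dashboard run`; a torchx dashboard command could wrap the same app.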
Pages:
/ - the main page with a list of all of the user's jobs and filters
/<scheduler>/<jobid> - an overview of the job, the job def and the status with a tab for logs, artifacts and any other URLs that are logged
/<scheduler>/<jobid>/logs - view the logs
/<scheduler>/<jobid>/external/<metadata key> - iframes based off of external services such as Tensorboard etc.
Alternatives
Providing a way to view URLs for external services via the terminal.
Additional context/links