dstackai / dstack

dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, & TPU.
https://dstack.ai/docs
Mozilla Public License 2.0
1.6k stars, 157 forks

[Feature] Track disk usage with `dstack stats` #1950

Open peterschmidt85 opened 4 weeks ago

peterschmidt85 commented 4 weeks ago

Problem: Disk space is one of the most important factors when working with large models. Available disk space may depend on what has already been downloaded, on caches, etc. As a user, I'd like to see how much space is available so I can tell whether I need to request more space or delete cached data.

Solution: Add DISK used/total to `dstack stats`.

r4victor commented 3 weeks ago

Note that getting container disk usage can be an expensive operation (e.g. simply calling `docker ps --size` can take minutes), so we need to find an efficient mechanism to report disk usage. For example, Kubernetes (cAdvisor) increased its disk metrics interval to 1 minute: https://github.com/google/cadvisor/pull/910.
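
One possible direction, sketched below, is to sample filesystem-level usage with a cheap `statvfs`-style call and cache the result for a fixed interval, similar in spirit to the 1-minute interval mentioned above. This is only an illustration, not dstack's implementation: `CachedDiskUsage`, `DiskSample`, and the 60-second default are hypothetical names and values, and `shutil.disk_usage` reports usage for the whole filesystem behind a path rather than per-container usage.

```python
import shutil
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class DiskSample:
    """One cached measurement (hypothetical type, for illustration only)."""
    used_bytes: int
    total_bytes: int
    taken_at: float  # clock value at the time the sample was taken


class CachedDiskUsage:
    """Re-measures disk usage at most once per `interval_s` seconds.

    Assumption: filesystem-level numbers for `path` are good enough.
    This does NOT compute per-container usage the way `docker ps --size` does.
    """

    def __init__(
        self,
        path: str = "/",
        interval_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        probe: Callable[[str], Any] = shutil.disk_usage,
    ) -> None:
        self._path = path
        self._interval_s = interval_s
        self._clock = clock
        self._probe = probe
        self._last: Optional[DiskSample] = None

    def get(self) -> DiskSample:
        now = self._clock()
        stale = self._last is None or now - self._last.taken_at >= self._interval_s
        if stale:
            # shutil.disk_usage returns a named tuple (total, used, free) in bytes.
            usage = self._probe(self._path)
            self._last = DiskSample(usage.used, usage.total, now)
        return self._last


if __name__ == "__main__":
    sample = CachedDiskUsage("/").get()
    gib = 1024 ** 3
    print(f"DISK {sample.used_bytes / gib:.1f}GiB/{sample.total_bytes / gib:.1f}GiB")
```

Callers that poll more often than the interval get the cached sample back, so the cost of the underlying measurement is bounded no matter how frequently stats are requested. The same wrapper could sit in front of a more expensive per-container probe by passing a different `probe` callable.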