Resource tracking for executor VMs

eseliger commented 3 years ago

At some point we want to be able to track resource usage per VM, not just per executor compute instance. To do that, we probably want to run a node_exporter inside the VM, scrape the data from there, and forward it in some way. Ideally, the data would end up not only in Prometheus/Grafana but also somewhere we can show it to users. This lets us drill down on performance problems and find out whether CPU or memory is the bottleneck. It will also help us make more informed decisions about resource allocations.

malomarrec commented 3 years ago

Adding to this: from a batch change user standpoint, as soon as a user starts running large-scale, complex jobs (think an AST-based tool that requires the JVM) over hundreds of repositories, they will want to know what the bottleneck for execution speed is (CPU, network, memory, etc.).

Strum355 commented 3 years ago

Pasting this link here so we don't forget: we will likely need the Pushgateway to capture Firecracker VM metrics.
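
For reference, a rough sketch of what pushing per-VM metrics to a Pushgateway could look like, using only the Python standard library. It is an illustration under assumptions, not a proposed implementation: the job name `executor_vm`, the `vm` grouping label, and the metric names are hypothetical, and the base URL assumes a Pushgateway listening on its default port 9091. The sketch also assumes label values contain no `/`.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def pushgateway_url(base_url, job, grouping=None):
    """Build the push path: <base>/metrics/job/<job>/<label>/<value>/..."""
    parts = [base_url.rstrip("/"), "metrics", "job", quote(job, safe="")]
    for label, value in (grouping or {}).items():
        # Assumption: values contain no "/" (those need a different encoding).
        parts += [quote(label, safe=""), quote(value, safe="")]
    return "/".join(parts)


def render_metrics(samples):
    """Render {metric_name: value} as gauges in the Prometheus text format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


def push(base_url, job, grouping, samples, timeout=5.0):
    """PUT the samples, replacing earlier metrics in the same grouping key."""
    req = Request(
        pushgateway_url(base_url, job, grouping),
        data=render_metrics(samples).encode("utf-8"),
        method="PUT",
    )
    req.add_header("Content-Type", "text/plain; version=0.0.4")
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical names; prints the target URL and payload without sending.
    print(pushgateway_url("http://localhost:9091", "executor_vm", {"vm": "vm-1"}))
    print(render_metrics({"vm_memory_used_bytes": 1048576, "vm_cpu_usage_ratio": 0.5}))
```

Grouping by a per-VM label keeps each VM's metrics separate in the Pushgateway, so short-lived VMs can report before they shut down and still be scraped by Prometheus afterwards.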

github-actions[bot] commented 3 years ago

Heads up @macraig - the "team/code-intelligence" label was applied to this issue.

sourcegraph / sourcegraph-public-snapshot

Resource tracking for executor VMs #26361