nanovms / nanos

A kernel designed to run one and only one application in a virtualized environment
https://nanos.org
Apache License 2.0

How to retrieve On-prem resource utilization #2040

Open leeyiding opened 1 month ago

leeyiding commented 1 month ago

Hello, I am currently trying to retrieve the resource utilization of on-premise QEMU instances, specifically focusing on CPU and memory usage.

  1. The results obtained using standard commands like ps are often inaccurate.
  2. I attempted to use QMP to get memory usage by executing the command {"execute": "query-balloon"}, but I encountered an error: {'error': {'class': 'DeviceNotActive', 'desc': 'No balloon device has been activated'}}.
  3. So far, I haven't found a reliable method to measure CPU usage.

Any assistance or guidance on how to accurately measure these resource utilizations would be greatly appreciated.

eyberg commented 1 month ago

is there a particular language you are using? most will have ways of getting at this info - in C, something like

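#include <sys/sysinfo.h>

/* Returns free RAM in bytes, or 0 if sysinfo() fails. */
unsigned long mem_avail(void) {
  struct sysinfo info;

  if (sysinfo(&info) < 0)
    return 0;

  /* freeram is counted in units of mem_unit bytes */
  return info.freeram * info.mem_unit;
}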

would work
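
Building on the snippet above, here is a minimal sketch that turns the same sysinfo() data into a used-memory percentage. The helper names mem_used_percent and current_mem_used_percent are made up for illustration, and the sketch assumes a Linux-compatible sysinfo() is available to the application, as the snippet above already does.

```c
#include <sys/sysinfo.h>

/* Hypothetical helper: percentage of RAM in use, computed from an
 * already filled-in struct sysinfo. Returns -1.0 if totalram is zero. */
double mem_used_percent(const struct sysinfo *info)
{
    double used;

    if (info->totalram == 0)
        return -1.0;

    /* totalram and freeram are both counted in units of mem_unit bytes,
     * so the unit cancels out in the ratio. */
    used = (double)info->totalram - (double)info->freeram;
    return 100.0 * used / (double)info->totalram;
}

/* Hypothetical wrapper: queries the kernel and returns the percentage,
 * or -1.0 if sysinfo() fails. */
double current_mem_used_percent(void)
{
    struct sysinfo info;

    if (sysinfo(&info) < 0)
        return -1.0;

    return mem_used_percent(&info);
}
```

The application could log or export current_mem_used_percent() periodically. Note that this only reports the guest's own view, so it still has to be shipped out of the instance somehow to be visible on the host.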

leeyiding commented 1 month ago

Thanks for your reply. This code looks like it runs inside Nanos to get the guest's free memory. If I want to get Nanos-related information from the host machine instead, similar to the metrics a cloud platform provides, how can I do it?

eyberg commented 1 month ago

yes, you need to export your metrics through an agent; most people have a preferred apm vendor or the cloud metrics as you suggest, there is also radar which does memory && disk by default: https://docs.ops.city/ops/klibs#radar

we could look at adding some sort of optional flag or something for ops for local instances

eyberg commented 1 month ago

fyi, i added a WIP for mem metrics via balloon in ops here: https://github.com/nanovms/ops/tree/balloon
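
For context on the balloon approach: the DeviceNotActive error quoted earlier in the thread is what QEMU returns when the VM was started without a balloon device (for example -device virtio-balloon-pci). With one attached, query-balloon replies with {"return": {"actual": N}}, where actual is the balloon-adjusted size of guest RAM in bytes rather than the amount of memory the guest is using. Below is a rough sketch of how a host-side tool could pull that number out of a reply. It is not taken from the ops branch, parse_balloon_actual is a made-up name, and the string scan is a stand-in for a proper JSON parser.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the "actual" field (bytes) from a QMP
 * query-balloon reply. Returns -1 if the field is absent or malformed,
 * which is also what an {"error": ...} reply produces.
 * Naive string scan, for illustration only. */
long long parse_balloon_actual(const char *reply)
{
    static const char key[] = "\"actual\"";
    const char *p = strstr(reply, key);
    char *end;
    long long value;

    if (p == NULL)
        return -1;

    /* Skip the key and any whitespace before the colon. */
    p += sizeof(key) - 1;
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p != ':')
        return -1;
    p++;

    /* strtoll skips leading whitespace before the number itself. */
    errno = 0;
    value = strtoll(p, &end, 10);
    if (end == p || errno != 0 || value < 0)
        return -1;

    return value;
}
```

A real client would first connect to the QMP socket, read the greeting, and send {"execute": "qmp_capabilities"} before issuing query-balloon.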

leeyiding commented 1 month ago

Looks good, is there any way to get CPU utilization?

eyberg commented 1 month ago

if you're asking about outside the guest, you can use your normal tools. inside, i'm not quite sure what we would do, since it's a single process and you aren't showing its share of utilization among everything else (as on ubuntu, where you might have a few hundred processes running). i know several apm vendors offer thread utilization that you might look into
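
On the host side, one likely reason ps looks inaccurate is that its %CPU column is total CPU time divided by the time since the process started, i.e. a lifetime average rather than the current load. A minimal sketch of the usual alternative on a Linux host: sample utime + stime from /proc/<pid>/stat for the QEMU process twice and divide the delta by the wall-clock interval. The function names are illustrative and not part of ops or Nanos.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Parse utime + stime (in clock ticks) from one /proc/<pid>/stat line.
 * Returns -1 on a malformed line. */
long long parse_cpu_ticks(const char *stat_line)
{
    /* comm (field 2) may contain spaces and parentheses, so resume
     * parsing after the last ')'. */
    const char *p = strrchr(stat_line, ')');
    unsigned long long utime, stime;

    if (p == NULL)
        return -1;

    /* Skip state (field 3) and fields 4..13, then read utime (14)
     * and stime (15). */
    if (sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %llu %llu",
               &utime, &stime) != 2)
        return -1;

    return (long long)(utime + stime);
}

/* Read the current tick count for a pid. Returns -1 on error. */
long long read_cpu_ticks(int pid)
{
    char path[64], line[1024];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(line, sizeof(line), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_cpu_ticks(line);
}

/* CPU utilization over interval_sec seconds, as a percentage of one core.
 * Returns -1.0 on error. Blocks for the duration of the interval. */
double cpu_percent(int pid, unsigned interval_sec)
{
    long long before, after;
    long hz = sysconf(_SC_CLK_TCK);

    if (interval_sec == 0 || hz <= 0)
        return -1.0;

    before = read_cpu_ticks(pid);
    sleep(interval_sec);
    after = read_cpu_ticks(pid);
    if (before < 0 || after < 0)
        return -1.0;

    return 100.0 * (double)(after - before) / ((double)hz * interval_sec);
}
```

Calling cpu_percent(qemu_pid, 1) once per second gives a rolling figure. The counters cover all threads of the process, and utime includes time spent running guest vCPUs, so an instance with several vCPUs can report more than 100.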

francescolavra commented 1 month ago

@leeyiding if you are looking for a host-side tool to measure CPU utilization that is more accurate than ps, I would suggest giving the perf tool a try (https://perf.wiki.kernel.org/index.php/Tutorial): you can attach it to a QEMU process and trace the "instructions" events, whose count will give you a pretty accurate measure of how much CPU the VM instance uses.

leeyiding commented 1 month ago

Thanks for the suggestion, I'll try it