How to replicate CPU utilization plot from prometheus data?

Stealthmate commented 1 year ago

I'm trying to replicate the CPU utilization data example from the docs but I can't seem to get it to work correctly. First of all, I have the following Prometheus query:

irate(node_cpu_seconds_total{mode="user"}[1m]) * on(instance) group_left(nodeid) node_uname_info{nodepool="default"}

In Grafana, when I click table view, I can see that there are multiple tables, where each table name is a set of labels (there are a lot of them, but the defining ones are cpu and nodeid). For example, if I have 10 nodes, each with 4 CPU cores, I get 40 different tables. Inside each table, there is a column called Time and a column whose name is the same as the table name (a set of identifying labels).

Now I want to plot a mosaic, where each row corresponds to a cpu core inside a specific node, and group the rows by nodeid. The way I see it, this is the same as the docs example.

So on the UI I set the breakdown and group parameters accordingly:

But then I get a plot that contains only the data for a single group:

In this case the nodeid is 6001 and I see 4 rows, which correspond to the 4 cpu cores.

Am I doing something wrong, or is this a bug?

EDIT: accidentally posted before finishing writing, sorry.

boazreicher commented 1 year ago

Hi. Can you add an example of what your data looks like? (from the query inspector or the table view)

boazreicher commented 1 year ago

Hi @Stealthmate, I can't tell for sure without seeing your data, but I think I have an idea of what the problem is. Sierra Plot assumes that the label for each row (the Breakdown field) is unique. In your case, the values of the Breakdown field (the cpu core label) are not: only the combination of cpu+nodeid is unique.

You could get around it by modifying your query to rename the cpu labels. So, let's say your data looks something like this:

time,nodeid,amp,cpu
2022-01-01 02:00:00,A,1,c1
2022-01-01 02:00:00,A,1,c2
2022-01-01 02:01:00,A,1,c1
2022-01-01 02:01:00,A,1,c2
2022-01-01 02:00:00,B,1,c1
2022-01-01 02:00:00,B,1,c2
2022-01-01 02:01:00,B,1,c1
2022-01-01 02:01:00,B,1,c2

you would need to get it to look something like:

time,nodeid,amp,cpu
2022-01-01 02:00:00,A,1,Ac1
2022-01-01 02:00:00,A,1,Ac2
2022-01-01 02:01:00,A,1,Ac1
2022-01-01 02:01:00,A,1,Ac2
2022-01-01 02:00:00,B,1,Bc1
2022-01-01 02:00:00,B,1,Bc2
2022-01-01 02:01:00,B,1,Bc1
2022-01-01 02:01:00,B,1,Bc2
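
To make the uniqueness point concrete, here is a minimal Python sketch that runs over the sample rows from the first table. It is only an illustration and not part of the plugin: the helper names `groups_per_label` and `prefix_with_group` are made up for this example. It shows that each cpu value appears under more than one nodeid before the rename, and under exactly one afterwards.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the first table above (illustrative values only).
SAMPLE = """time,nodeid,amp,cpu
2022-01-01 02:00:00,A,1,c1
2022-01-01 02:00:00,A,1,c2
2022-01-01 02:01:00,A,1,c1
2022-01-01 02:01:00,A,1,c2
2022-01-01 02:00:00,B,1,c1
2022-01-01 02:00:00,B,1,c2
2022-01-01 02:01:00,B,1,c1
2022-01-01 02:01:00,B,1,c2
"""


def groups_per_label(rows, label="cpu", group="nodeid"):
    """Map each breakdown label to the set of groups it appears in."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[label]].add(row[group])
    return dict(seen)


def prefix_with_group(rows, label="cpu", group="nodeid"):
    """Return copies of the rows with the group value prepended to the label."""
    return [{**row, label: row[group] + row[label]} for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
before = groups_per_label(rows)
after = groups_per_label(prefix_with_group(rows))

# A label that maps to more than one group is ambiguous as a row key.
ambiguous = sorted(name for name, groups in before.items() if len(groups) > 1)
print("ambiguous before rename:", ambiguous)
print("labels after rename:", sorted(after))
```

Any renaming scheme works as long as the resulting label is unique across groups; prepending the nodeid is just the simplest one.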

Stealthmate commented 1 year ago

@boazreicher Sorry for the late response! I changed my data to match your idea and it worked! Thank you!

For anyone who gets stuck on the same problem in the future:

I changed my query to:

label_join(
    irate(node_cpu_seconds_total{mode="user"}[1m]) * on(instance) group_left(nodeid) node_uname_info{nodepool="default"},
    "cpu_nodeid",
    ",",
    "cpu",
    "nodeid"
)

This makes it so that the final result has a label cpu_nodeid which looks like 4,1001 (cpu: 4, nodeid: 1001). Then I set the Breakdown field to cpu_nodeid and the Group field to nodeid, and I got the result I wanted.
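
For readers unfamiliar with label_join, the following Python sketch mimics what it does to the label set of a single series. It is a rough emulation for illustration, not Prometheus code: the function below is a hypothetical stand-in, and treating a missing source label as an empty string is an assumption of this sketch.

```python
def label_join(labels, dst, separator, *src_labels):
    """Mimic PromQL label_join on one series' labels (illustration only).

    Joins the values of src_labels with separator and stores the result
    under dst, leaving the other labels untouched.
    """
    # Assumption of this sketch: a missing source label counts as "".
    joined = separator.join(labels.get(name, "") for name in src_labels)
    return {**labels, dst: joined}


# Hypothetical label set for one series returned by the query above.
series = {"cpu": "4", "nodeid": "1001", "mode": "user"}
result = label_join(series, "cpu_nodeid", ",", "cpu", "nodeid")
print(result["cpu_nodeid"])  # 4,1001
```

The original cpu and nodeid labels are still present on the result, which is why nodeid can still be used as the Group field.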

Stealthmate commented 1 year ago

Closing this since I consider it resolved on my end!

boazreicher / mosaic-plot

How to replicate CPU utilization plot from prometheus data? #4