turbot / steampipe-plugin-kubernetes

Use SQL to instantly query Kubernetes API resources. Open source CLI. No DB required.
https://hub.steampipe.io/plugins/turbot/kubernetes
Apache License 2.0

Expand kubernetes_node and kubernetes_pod tables with cpu and memory information using a normalized unit #242

Open DavidGamba opened 1 month ago

DavidGamba commented 1 month ago

Is your feature request related to a problem? Please describe.

I am trying to create queries to calculate node utilization by a given set of pods.

First problem, the cpu and memory fields are part of a jsonb object. For example:

node.allocatable->>'cpu'
node.allocatable->>'memory'
jsonb_path_query_array(pod.containers,'$[*].resources.requests.cpu')
jsonb_path_query_array(pod.containers,'$[*].resources.requests.memory')

Second problem, the cpu and memory fields are presented in different units, so doing math with them is hard.

For example, cpu can be in millicores or full cores (500m, 3), and memory can be in Ki, Mi, or Gi (125635460Ki, 200Mi, 10Gi).
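To make the mismatch concrete, here is a minimal Python sketch of the normalization being requested. The helper names and supported suffixes are assumptions for illustration only, not plugin code:

```python
import math

# Minimal sketch (not plugin code) of the two normalizations discussed
# here: CPU quantities to millicores and memory quantities to bytes.
# Only the Ki/Mi/Gi suffixes from the examples above are handled.
MEMORY_FACTORS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}

def cpu_to_millicores(quantity: str) -> int:
    """'500m' -> 500, '3' -> 3000 (rounding up to a whole millicore)."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return math.ceil(float(quantity) * 1000)

def memory_to_bytes(quantity: str) -> int:
    """'200Mi' -> 209715200, '125635460Ki' -> 128650711040."""
    for suffix, factor in MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return math.ceil(float(quantity[: -len(suffix)]) * factor)
    return math.ceil(float(quantity))  # plain byte count, e.g. "93478772582"
```

Once every value is a plain integer, comparisons and arithmetic across nodes and pods become straightforward.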

Describe the solution you'd like

Most of all, I would like CPU and memory to be presented in a normalized/standardized unit (without the unit embedded in the string) so that we can do math easily.

This feature request could extend to deployments, statefulsets and daemonsets too.

Second, I would like cpu and memory to be brought to the top level so they are easier to access, but this is not as big a deal.

Describe alternatives you've considered

The current workaround is to write very complicated queries.

e-gineer commented 1 month ago

Having a way to query standardized data makes sense.

If the original data was in columns, then I'd suggest adding extra columns with the standard unit - e.g. cpu and cpu_std or similar.

In this case it looks like the original data is buried down in the JSON received from the API? If so, adding a new column for the standardized value is heavier. Adding an extra field to the JSON also feels weird, because we'd be messing with the original / raw JSON result.

Suggestions?

DavidGamba commented 1 week ago

I wrote a blog post outlining the two functions I am using, one for memory and one for CPU: https://www.davids-blog.gamba.ca/posts/steampipe-kubernetes.html

CPU, I believe, can be standardized to the smallest unit of millicores (rounding up to the next millicore). Memory makes sense to standardize to bytes (rounding up to a whole byte). Memory also needs the reverse function, so that values can be shown back in the compact form.
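The reverse direction for memory could look like this rough sketch (the function name is illustrative; it only compacts values that divide evenly by a binary suffix, whereas a real implementation would also need a rounding policy):

```python
# Illustrative sketch of the reverse memory function mentioned above:
# render a byte count back in the compact Ki/Mi/Gi form. Values that do
# not divide evenly by any suffix fall back to the plain byte count.
def bytes_to_compact(n: int) -> str:
    for suffix, factor in (("Gi", 1024**3), ("Mi", 1024**2), ("Ki", 1024)):
        if n >= factor and n % factor == 0:
            return f"{n // factor}{suffix}"
    return str(n)
```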

Since it seems that Steampipe will not add support for functions, I propose exposing new columns.

For nodes, it is fairly easy to add new columns. The YAML for resources comes from the status field:

status:
  allocatable:
    attachable-volumes-aws-ebs: "25"
    cpu: 31750m
    ephemeral-storage: "93478772582"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 126208904Ki
    pods: "110"
  capacity:
    attachable-volumes-aws-ebs: "25"
    cpu: "32"
    ephemeral-storage: 101430960Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 130403208Ki
    pods: "110"

Steampipe already surfaces capacity and allocatable as top-level columns. Should we add extra ones, or just normalize the existing values?

For pods/deployments/statefulsets, it is a bit more challenging, since requests and limits are set per container in the pod. We should expose a list of the normalized values per container; that would be easy to sum if one wants to.
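As a sketch of why that per-container list sums trivially once values are normalized (the container data, column shape, and helper name here are all made up for illustration):

```python
import math

# Illustrative data shaped like the pod.containers jsonb column; the
# names and values are invented for this example.
containers = [
    {"name": "app", "resources": {"requests": {"cpu": "250m", "memory": "200Mi"}}},
    {"name": "sidecar", "resources": {"requests": {"cpu": "0.5", "memory": "64Mi"}}},
]

def cpu_to_millicores(quantity: str) -> int:
    # Same normalization as proposed above: '250m' -> 250, '0.5' -> 500.
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return math.ceil(float(quantity) * 1000)

# With every container's request reduced to a millicore integer,
# the per-pod total is a one-liner.
total_request_millicores = sum(
    cpu_to_millicores(c["resources"]["requests"]["cpu"]) for c in containers
)
print(total_request_millicores)  # 750
```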

The same question applies: should we expose high-level containers_resources_requests and containers_resources_limits columns containing an array of normalized values per container, or columns like containers_resources_requests_cpu with an array of cpu values per container?

This is the kubernetes documentation for resources: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/