kubecost / cost-analyzer-helm-chart

Kubecost helm chart
http://kubecost.com/install
Apache License 2.0

Missing "Data ingress" and "Data egress" B/s on `/abandoned-workloads` #1917

Closed mclean0328 closed 1 year ago

mclean0328 commented 1 year ago

Describe the bug

On the /abandoned-workloads view, all StatefulSet workloads always report ‘Data ingress’ and ‘Data egress’ as ‘0 B/s’, despite there being significant network traffic.

To Reproduce

Steps to reproduce the behavior:

  1. Savings → ‘Remedy abandoned workloads’ report
  2. Select Statefulset workload with known network activity
  3. View ‘Data ingress’ and ‘Data egress’ in workload information

Compare this against the network costs of the same workload found on the ‘Allocations’ page and you will see there are network costs.

Expected behavior

For workloads that are known to have network activity, ‘Data ingress’ and ‘Data egress’ should report the actual rate (X B/s).

Screenshots


AjayTripathy commented 1 year ago

Hi @mclean0328 could you share the kubecost version you're on? I have seen this before but I believe this got fixed in a later version.

mclean0328 commented 1 year ago

@AjayTripathy just checked and version is v1.99.0

teevans commented 1 year ago

Tagging @nealormsbee - I know we just updated this page, would the new API calls fix this by chance?

nealormsbee commented 1 year ago

Unfortunately no -- the API update was for orphaned resources. We'll look at how we present data for this page -- maybe there's an obvious issue. I may need to come back and ask for reproduction details.

teevans commented 1 year ago

I'm a moron - My fault.

dwbrown2 commented 1 year ago

Propose this is P1, but let me know if others are feeling differently

wolfeaustin commented 1 year ago

Checked this out and wasn't able to reproduce the issue. From a code perspective, we're just taking this query, http://nightly.kubecost.io:9090/model/savings/abandonedWorkloads?days=2&threshold=500, and rendering the respective `egressBytesPerSecond` and `ingressBytesPerSecond` values directly from the payload. Maybe there's a larger backend situation afoot?
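To illustrate what "rendering directly from the payload" implies, here is a minimal Python sketch. Only the field names `egressBytesPerSecond` and `ingressBytesPerSecond` come from the comment above; the row shape, the `format_rate` helper, and the unit formatting are assumptions for illustration, not the actual frontend code.

```python
def format_rate(bytes_per_second):
    """Format a byte rate using binary units (hypothetical helper)."""
    value = float(bytes_per_second)
    for unit in ("B/s", "KiB/s", "MiB/s"):
        if value < 1024:
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} GiB/s"


def render_row(workload):
    """Pass the two rate fields straight through to display strings.

    No other data source is consulted here, so if the backend payload
    carries 0 for these fields, the view shows 0 as well.
    """
    return {
        "ingress": format_rate(workload.get("ingressBytesPerSecond", 0)),
        "egress": format_rate(workload.get("egressBytesPerSecond", 0)),
    }
```

If the UI really is this thin, a zero on the page points at the backend response rather than at the rendering step.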

[Screenshot: 2023-02-01 at 1:06:28 PM]

Adam-Stack-PM commented 1 year ago

@teevans, @wolfeaustin, Should I route to core team to research?

teevans commented 1 year ago

Yessir!

nikovacevic commented 1 year ago

At a glance, no issues here. So I'll need an environment with an active repro.

@mclean0328 can you provide that? And ideally the values.yaml for it? (Is it kc-demo-stage as the screenshot indicates?)

AjayTripathy commented 1 year ago

I have a theory here...is it possible that we're getting network costs via reconciliation and peanut butter spreading them to a workload with no actual traffic? We should normalize reconciliation by number of bytes consumed by the pod ... @Sean-Holcomb would be the expert here.

nikovacevic commented 1 year ago

Yeah, I think that might be right, Ajay. I was barking up the same tree, wanting to see the values.yaml. But if network daemonset is enabled, and is working... I have no thoughts. Would have to just dig in and investigate where along the chain things are dropping out.

nikovacevic commented 1 year ago

Although, wait... this is not network daemonset related.

queryFmtNetReceiveBytes = `sum(increase(container_network_receive_bytes_total{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)`
queryFmtNetTransferBytes = `sum(increase(container_network_transmit_bytes_total{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)`
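
For anyone who wants to run these by hand, the two `%s` placeholders need filling in first. The Python sketch below assumes the first placeholder is the range window and the second is the cluster label name; `cluster_id` and the pod name are hypothetical values, and `for_pod` is an illustrative helper, not something from the Kubecost code base.

```python
QUERY_FMT_NET_RECEIVE_BYTES = (
    'sum(increase(container_network_receive_bytes_total'
    '{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)'
)
QUERY_FMT_NET_TRANSFER_BYTES = (
    'sum(increase(container_network_transmit_bytes_total'
    '{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)'
)


def build_query(template, window, cluster_label):
    """Fill the window and cluster-label placeholders (assumed meaning)."""
    return template % (window, cluster_label)


def for_pod(query, pod):
    """Narrow a filled-in query to one pod by replacing the pod!="" matcher."""
    return query.replace('pod!=""', 'pod="%s"' % pod)


if __name__ == "__main__":
    q = build_query(QUERY_FMT_NET_RECEIVE_BYTES, "7d", "cluster_id")
    print(for_pod(q, "cms-worker-0"))
```

The resulting string can be pasted into the Prometheus expression browser to see whether a given pod has any byte counts at all.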
AjayTripathy commented 1 year ago

Right. But I think we should check the code: reconciliation will apply cost regardless of whether or not there is traffic, whereas it probably should spread costs proportionally based on those numbers.

Sean-Holcomb commented 1 year ago

It's been a while, but network costs are distributed by weight if there are traffic numbers, and evenly if the network pod is not set up.

AjayTripathy commented 1 year ago

So yeah, that sounds like exactly the issue^

Can we do one better here and go:

  1. weight on traffic numbers from network costs pod if they exist
  2. weight on the raw byte numbers, instead of evenly, if there is no network costs pod?
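
The proposed fallback order could be sketched roughly as follows. This is a Python illustration of the idea only, not the reconciliation code; the function name, argument names, and the final even-split fallback for the case where no byte counts exist are assumptions.

```python
def split_network_cost(total_cost, pods, network_pod_bytes=None, raw_bytes=None):
    """Split a node's network cost across its pods.

    Preference order (as proposed above):
      1. traffic numbers from the network costs pod, if any are non-zero
      2. raw byte counters (e.g. cAdvisor), if any are non-zero
      3. an even split as the last resort (assumed third fallback)
    """
    if not pods:
        return {}
    for source in (network_pod_bytes, raw_bytes):
        if source:
            weights = {p: float(source.get(p, 0)) for p in pods}
            total = sum(weights.values())
            if total > 0:
                return {p: total_cost * w / total for p, w in weights.items()}
    even = total_cost / len(pods)
    return {p: even for p in pods}
```

With raw byte counts of 300 and 100 for two pods and a total cost of 100, the split is 75 and 25; with no byte data at all, each pod gets 50, which is the behavior that can assign cost to a pod with no traffic.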
nikovacevic commented 1 year ago

Yeah, agree with Ajay. This would be great to revisit w/r/t reconciliation-from-Cloud-Costs work. (We can also fix in the current implementation, but might be nice to not duplicate work.)

nikovacevic commented 1 year ago

And to clarify, "raw byte numbers" come from cAdvisor's container_network_... metrics, which we should always have. If those don't exist, then maybe we need a third fallback, but that should be rare.

nikovacevic commented 1 year ago

One confounding issue here is that the network daemonset is, apparently, enabled on this cluster. So that cuts against this hypothesis and line of reasoning. But I've not been able to connect to the cluster to find out more. All I know is that we think these are the values.yaml used: https://github.com/kubecost/se-demo/blob/main/kc-demo/aws-kc-demo-prod-primary/kc-demo-prod-helm-values.yaml

teevans commented 1 year ago

@nikovacevic - This feels like we need to punt. Tentatively marking for 1.101.

rossfisherkc commented 1 year ago

Customer responded (Enterprise customer)

"Thanks for getting back. We are facing serious issue due to incorrect forecasting values. Our business needs to understand how much we could saving in in Q2 on our kubernetes workload. We cannot provide that as Kubecost dashboard figures may incorrect. This will have an impact to our financial goals. Appreciate an ETA on the fix"

cc @teevans If we have any update we can provide to customers

teevans commented 1 year ago

@nikovacevic / @AjayTripathy - What can I do to help us figure this one out?

rossfisherkc commented 1 year ago

@teevans I spoke with Ajay on this one; how about I broker a call with the customer and we can get someone from engineering on the call? Who would be a good pick?

AjayTripathy commented 1 year ago

This needs to be broken into two parts.

  1. Confirm the hypothesis: those pods really are ingressing/egressing 0 bytes, and network reconciliation allocates them a cost share anyway because we spread the cost of a node evenly across all pods on that node. Confirm that this is what's causing the discrepancy for the user, then explain the issue clearly and document it. We shouldn't need someone from engineering for this: just find a few pods with 0 B ingress/egress, run the Prometheus queries above (https://github.com/kubecost/cost-analyzer-helm-chart/issues/1917#issuecomment-1414971515) with the correct pod names templated in, and double-check that they are actually zero or close to zero.

  2. Migrate network reconciliation from a peanut-butter spread per pod on a node to a proportional split based on byte egress. That's for engineering to do, but we should confirm with step 1 that this is actually the issue before doing that.
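
For the "zero or close to zero" check in step 1, a small helper like the one below may be enough once the query results are in hand. It is a Python sketch under assumptions: the inputs are plain pod-to-bytes mappings taken from the two Prometheus queries, and the 1 KiB threshold is an arbitrary illustrative cutoff.

```python
def near_zero_pods(receive_bytes, transmit_bytes, threshold_bytes=1024):
    """Return pods whose combined receive + transmit bytes are at or below
    the threshold over the queried window.

    receive_bytes / transmit_bytes: dicts of pod name -> bytes, as read
    from the query results. A pod missing from one mapping counts as 0.
    """
    pods = set(receive_bytes) | set(transmit_bytes)
    return sorted(
        pod
        for pod in pods
        if receive_bytes.get(pod, 0) + transmit_bytes.get(pod, 0) <= threshold_bytes
    )
```

Pods returned here that still carry a network cost on the Allocations page would support the hypothesis; pods with real traffic that show 0 B/s on the abandoned workloads page would point elsewhere.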

AjayTripathy commented 1 year ago

@rossfisherkc were we able to get these steps out to the user to confirm the hypothesis?

ivankube commented 1 year ago

It seems like pods with traffic are showing 0 for ingress and egress in the Abandoned Workloads:

az aks get-credentials --resource-group khandkcost --name khand-dev-1

[Screenshots: 2023-03-03 at 2:51:08 PM, 2:55:20 PM, and 2:58:55 PM]

Customer confirmed the same in ZD #3524. I'm waiting for them to run Prometheus queries.

rossfisherkc commented 1 year ago

I'm working with the customer on writing some API queries. Here is what they were able to share:


`container_network_receive_bytes_total`

```
{
beta_kubernetes_io_arch: "amd64",
beta_kubernetes_io_instance_type: "custom-16-32768",
beta_kubernetes_io_os: "linux",
business: "REDACTED",
cloud_google_com_gke_boot_disk: "pd-standard",
cloud_google_com_gke_container_runtime: "containerd",
cloud_google_com_gke_cpu_scaling_level: "16",
cloud_google_com_gke_max_pods_per_node: "110",
cloud_google_com_gke_nodepool: "REDACTED",
cloud_google_com_gke_os_distribution: "cos",
cloud_google_com_machine_family: "custom-16",
cluster_name: "REDACTED",
env: "production",
failure_domain_beta_kubernetes_io_region: "REDACTED",
failure_domain_beta_kubernetes_io_zone: "REDACTED",
id: "/kubepods/REDACTED",
image: "< REDACTED >",
instance: "REDACTED",
interface: "eth0",
job: "kubernetes-nodes-cadvisor",
kubernetes_io_arch: "amd64",
kubernetes_io_hostname: "REDACTED",
kubernetes_io_os: "linux",
name: "REDACTED",
namespace: "production",
node_kubernetes_io_instance_type: "custom-16-32768",
pod: "cms-worker-REDACTED",
pod_name: "cms-worker-REDACTED",
team: "platform",
topology_kubernetes_io_region: "REDACTED",
topology_kubernetes_io_zone: "REDACTED",
}
```

`container_network_transmit_bytes_total`

```
{
beta_kubernetes_io_arch: "amd64",
beta_kubernetes_io_instance_type: "custom-16-32768",
beta_kubernetes_io_os: "linux",
business: "REDACTED",
cloud_google_com_gke_boot_disk: "pd-standard",
cloud_google_com_gke_container_runtime: "containerd",
cloud_google_com_gke_cpu_scaling_level: "16",
cloud_google_com_gke_max_pods_per_node: "110",
cloud_google_com_gke_nodepool: "REDACTED",
cloud_google_com_gke_os_distribution: "cos",
cloud_google_com_machine_family: "custom-16",
cluster_name: "REDACTED",
env: "production",
failure_domain_beta_kubernetes_io_region: "us-REDACTED",
failure_domain_beta_kubernetes_io_zone: "us-REDACTED",
id: "/kubepods/burstable/REDACTED",
image: "<http://k8s.gcr.io/REDACTED >",
instance: "REDACTED",
interface: "eth0",
job: "kubernetes-nodes-cadvisor",
kubernetes_io_arch: "amd64",
kubernetes_io_hostname: "REDACTED",
kubernetes_io_os: "linux",
name: "REDACTED",
namespace: "production",
node_kubernetes_io_instance_type: "custom-16-32768",
pod: "REDACTED",
pod_name: "REDACTED",
team: "platform",
topology_kubernetes_io_region: "us-REDACTED",
topology_kubernetes_io_zone: "us-REDACTED-a",
}
```

AjayTripathy commented 1 year ago

Ok-- @ivankube @rossfisherkc the next step would be to see what the allocations API says for the same time period.

/model/allocation?aggregate=node&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437

Something like this.
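
If it helps to script the request rather than hand-edit the URL, a minimal Python sketch follows. The `/model/allocation` path and the parameter names are taken from the URL above; the base URL `http://localhost:9090` is a placeholder assumption, and only a subset of the parameters is included.

```python
from urllib.parse import urlencode


def allocation_url(base_url, aggregate="node", window="7d", **extra):
    """Build an Allocation API URL with a subset of the parameters shown above.

    Extra keyword arguments are appended as additional query parameters.
    """
    params = {
        "aggregate": aggregate,
        "window": window,
        "accumulate": "true",
        "shareIdle": "false",
        "shareSplit": "weighted",
    }
    params.update(extra)
    return f"{base_url}/model/allocation?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host; substitute the address of your own deployment.
    print(allocation_url("http://localhost:9090"))
```

Changing `aggregate` or passing extra filters through `**extra` keeps the rest of the query identical, which makes it easier to compare responses for the same window.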

ivankube commented 1 year ago

http://ivan.kubecost.xyz/model/allocation?aggregate=node&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437:


{"code":200,"data":[{"":{"name":"","properties":{"cluster":"cluster-one","container":"__unmounted__"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T15:00:00Z","minutes":9540,"cpuCores":0,"cpuCoreRequestAverage":0,"cpuCoreUsageAverage":0,"cpuCoreHours":0,"cpuCost":0,"cpuCostAdjustment":0,"cpuEfficiency":0,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":36521247671.17561,"pvByteHours":5806878379716.92,"pvCost":0.29633,"pvs":{"cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48":{"byteHours":106548227150.76918,"cost":0.005437302369230767},"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50":{"byteHours":426192908603.0767,"cost":0.02174920947692307}},"pvCostAdjustment":0.20522,"ramBytes":0,"ramByteRequestAverage":0,"ramByteUsageAverage":0,"ramByteHours":0,"ramCost":0,"ramCostAdjustment":0,"ramEfficiency":0,"externalCost":0,"sharedCost":0,"totalCost":0.50155,"totalEfficiency":0,"rawA
llocationOnly":null},"__idle__":{"name":"__idle__","properties":{"cluster":"cluster-one"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T12:00:00Z","minutes":9360,"cpuCores":0,"cpuCoreRequestAverage":0,"cpuCoreUsageAverage":0,"cpuCoreHours":0,"cpuCost":26.6091,"cpuCostAdjustment":0,"cpuEfficiency":0,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":0,"pvByteHours":0,"pvCost":0,"pvs":null,"pvCostAdjustment":0,"ramBytes":0,"ramByteRequestAverage":0,"ramByteUsageAverage":0,"ramByteHours":0,"ramCost":12.27403,"ramCostAdjustment":0,"ramEfficiency":0,"externalCost":0,"sharedCost":0,"totalCost":38.88312,"totalEfficiency":0,"rawAllocationOnly":null},"aks-agentpool-41384482-vmss000000":{"name":"aks-agentpool-41384482-vmss000000","properties":{"cluster":"cluster-one","node":"aks-agentpool-41384482-vmss000000","providerID":"azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/0"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T14:15:00Z","minutes":9495,"cpuCores":1.20213,"cpuCoreRequestAverage":1.11413,"cpuCoreUsageAverage":0.16433,"cpuCoreHours":190.23736,"cpuCost":5.18073,"cpuCostAdjustment":0.00011,"cpuEfficiency":0.14749,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0.80541,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0.80541,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":103048682161.13744,"pvByteHours"
:16307453952000,"pvCost":0.83219,"pvs":{"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48":{"byteHours":1254419534769.231,"cost":0.06401475173076923},"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0":{"byteHours":5017678139076.924,"cost":0.2560590069230769},"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397":{"byteHours":5017678139076.922,"cost":0.25605900692307687},"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50":{"byteHours":5017678139076.924,"cost":0.2560590069230769}},"pvCostAdjustment":0.61239,"ramBytes":5096269863.01574,"ramByteRequestAverage":3696243652.14534,"ramByteUsageAverage":2052567556.36228,"ramByteHours":806484705822.2408,"ramCost":2.74151,"ramCostAdjustment":0.00006,"ramEfficiency":0.55531,"externalCost":0,"sharedCost":0.0106,"totalCost":10.18301,"totalEfficiency":0.28862,"rawAllocationOnly":null},"aks-agentpool-41384482-vmss000001":{"name":"aks-agentpool-41384482-vmss000001","properties":{"cluster":"cluster-one","node":"aks-agentpool-41384482-vmss000001","providerID":"azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T14:15:00Z","minutes":9495,"cpuCores":0.50991,"cpuCoreRequestAverage":0.48024,"cpuCoreUsageAverage":0.08298,"cpuCoreHours":80.69293,"cpuCost":2.19751,"cpuCostAdjustment":0.00004,"cpuEfficiency":0.1728,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":3060.37968,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":3060.37968,"networkCostAdjustment":0,"loadBalancerCost":0.8575,"loadBalancerCostAdjustment":-0.05,"pvBytes":328963100745.16956,"pvByteHours":52058410692923.08,"pvCost":2.65661,"pvs":{"cluster=cl
uster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8":{"byteHours":5017678139076.924,"cost":0.2560590069230769},"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d":{"byteHours":15680244184615.385,"cost":0.800184396634615},"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569":{"byteHours":15680244184615.385,"cost":0.800184396634615},"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76":{"byteHours":15680244184615.385,"cost":0.800184396634615}},"pvCostAdjustment":1.85024,"ramBytes":5183689897.47478,"ramByteRequestAverage":4735645302.36124,"ramByteUsageAverage":966905161.63903,"ramByteHours":820318926275.3838,"ramCost":2.78853,"ramCostAdjustment":0.00006,"ramEfficiency":0.20418,"externalCost":0,"sharedCost":3.19477,"totalCost":3073.87495,"totalEfficiency":0.19035,"rawAllocationOnly":null}}]}
jessegoodier commented 1 year ago

making this readable

{
    "code": 200,
    "data": [
        {
            "": {
                "name": "",
                "properties": {
                    "cluster": "cluster-one",
                    "container": "__unmounted__"
                },
                "window": {
                    "start": "2023-03-08T00:00:00Z",
                    "end": "2023-03-15T00:00:00Z"
                },
                "start": "2023-03-08T00:00:00Z",
                "end": "2023-03-14T15:00:00Z",
                "minutes": 9540,
                "cpuCores": 0,
                "cpuCoreRequestAverage": 0,
                "cpuCoreUsageAverage": 0,
                "cpuCoreHours": 0,
                "cpuCost": 0,
                "cpuCostAdjustment": 0,
                "cpuEfficiency": 0,
                "gpuCount": 0,
                "gpuHours": 0,
                "gpuCost": 0,
                "gpuCostAdjustment": 0,
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 0,
                "networkCrossZoneCost": 0,
                "networkCrossRegionCost": 0,
                "networkInternetCost": 0,
                "networkCostAdjustment": 0,
                "loadBalancerCost": 0,
                "loadBalancerCostAdjustment": 0,
                "pvBytes": 36521247671.17561,
                "pvByteHours": 5806878379716.92,
                "pvCost": 0.29633,
                "pvs": {
                    "cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8": {
                        "byteHours": 426192908603.0767,
                        "cost": 0.02174920947692307
                    },
                    "cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48": {
                        "byteHours": 106548227150.76918,
                        "cost": 0.005437302369230767
                    },
                    "cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d": {
                        "byteHours": 1331852839384.6147,
                        "cost": 0.06796627961538462
                    },
                    "cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0": {
                        "byteHours": 426192908603.0767,
                        "cost": 0.02174920947692307
                    },
                    "cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397": {
                        "byteHours": 426192908603.0767,
                        "cost": 0.02174920947692307
                    },
                    "cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569": {
                        "byteHours": 1331852839384.6147,
                        "cost": 0.06796627961538462
                    },
                    "cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76": {
                        "byteHours": 1331852839384.6147,
                        "cost": 0.06796627961538462
                    },
                    "cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50": {
                        "byteHours": 426192908603.0767,
                        "cost": 0.02174920947692307
                    }
                },
                "pvCostAdjustment": 0.20522,
                "ramBytes": 0,
                "ramByteRequestAverage": 0,
                "ramByteUsageAverage": 0,
                "ramByteHours": 0,
                "ramCost": 0,
                "ramCostAdjustment": 0,
                "ramEfficiency": 0,
                "externalCost": 0,
                "sharedCost": 0,
                "totalCost": 0.50155,
                "totalEfficiency": 0,
                "rawAllocationOnly": null
            },
            "__idle__": {
                "name": "__idle__",
                "properties": {
                    "cluster": "cluster-one"
                },
                "window": {
                    "start": "2023-03-08T00:00:00Z",
                    "end": "2023-03-15T00:00:00Z"
                },
                "start": "2023-03-08T00:00:00Z",
                "end": "2023-03-14T12:00:00Z",
                "minutes": 9360,
                "cpuCores": 0,
                "cpuCoreRequestAverage": 0,
                "cpuCoreUsageAverage": 0,
                "cpuCoreHours": 0,
                "cpuCost": 26.6091,
                "cpuCostAdjustment": 0,
                "cpuEfficiency": 0,
                "gpuCount": 0,
                "gpuHours": 0,
                "gpuCost": 0,
                "gpuCostAdjustment": 0,
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 0,
                "networkCrossZoneCost": 0,
                "networkCrossRegionCost": 0,
                "networkInternetCost": 0,
                "networkCostAdjustment": 0,
                "loadBalancerCost": 0,
                "loadBalancerCostAdjustment": 0,
                "pvBytes": 0,
                "pvByteHours": 0,
                "pvCost": 0,
                "pvs": null,
                "pvCostAdjustment": 0,
                "ramBytes": 0,
                "ramByteRequestAverage": 0,
                "ramByteUsageAverage": 0,
                "ramByteHours": 0,
                "ramCost": 12.27403,
                "ramCostAdjustment": 0,
                "ramEfficiency": 0,
                "externalCost": 0,
                "sharedCost": 0,
                "totalCost": 38.88312,
                "totalEfficiency": 0,
                "rawAllocationOnly": null
            },
            "aks-agentpool-41384482-vmss000000": {
                "name": "aks-agentpool-41384482-vmss000000",
                "properties": {
                    "cluster": "cluster-one",
                    "node": "aks-agentpool-41384482-vmss000000",
                    "providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/0"
                },
                "window": {
                    "start": "2023-03-08T00:00:00Z",
                    "end": "2023-03-15T00:00:00Z"
                },
                "start": "2023-03-08T00:00:00Z",
                "end": "2023-03-14T14:15:00Z",
                "minutes": 9495,
                "cpuCores": 1.20213,
                "cpuCoreRequestAverage": 1.11413,
                "cpuCoreUsageAverage": 0.16433,
                "cpuCoreHours": 190.23736,
                "cpuCost": 5.18073,
                "cpuCostAdjustment": 0.00011,
                "cpuEfficiency": 0.14749,
                "gpuCount": 0,
                "gpuHours": 0,
                "gpuCost": 0,
                "gpuCostAdjustment": 0,
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 0.80541,
                "networkCrossZoneCost": 0,
                "networkCrossRegionCost": 0,
                "networkInternetCost": 0.80541,
                "networkCostAdjustment": 0,
                "loadBalancerCost": 0,
                "loadBalancerCostAdjustment": 0,
                "pvBytes": 103048682161.13744,
                "pvByteHours": 16307453952000,
                "pvCost": 0.83219,
                "pvs": {
                    "cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48": {
                        "byteHours": 1254419534769.231,
                        "cost": 0.06401475173076923
                    },
                    "cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0": {
                        "byteHours": 5017678139076.924,
                        "cost": 0.2560590069230769
                    },
                    "cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397": {
                        "byteHours": 5017678139076.922,
                        "cost": 0.25605900692307687
                    },
                    "cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50": {
                        "byteHours": 5017678139076.924,
                        "cost": 0.2560590069230769
                    }
                },
                "pvCostAdjustment": 0.61239,
                "ramBytes": 5096269863.01574,
                "ramByteRequestAverage": 3696243652.14534,
                "ramByteUsageAverage": 2052567556.36228,
                "ramByteHours": 806484705822.2408,
                "ramCost": 2.74151,
                "ramCostAdjustment": 0.00006,
                "ramEfficiency": 0.55531,
                "externalCost": 0,
                "sharedCost": 0.0106,
                "totalCost": 10.18301,
                "totalEfficiency": 0.28862,
                "rawAllocationOnly": null
            },
            "aks-agentpool-41384482-vmss000001": {
                "name": "aks-agentpool-41384482-vmss000001",
                "properties": {
                    "cluster": "cluster-one",
                    "node": "aks-agentpool-41384482-vmss000001",
                    "providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
                },
                "window": {
                    "start": "2023-03-08T00:00:00Z",
                    "end": "2023-03-15T00:00:00Z"
                },
                "start": "2023-03-08T00:00:00Z",
                "end": "2023-03-14T14:15:00Z",
                "minutes": 9495,
                "cpuCores": 0.50991,
                "cpuCoreRequestAverage": 0.48024,
                "cpuCoreUsageAverage": 0.08298,
                "cpuCoreHours": 80.69293,
                "cpuCost": 2.19751,
                "cpuCostAdjustment": 0.00004,
                "cpuEfficiency": 0.1728,
                "gpuCount": 0,
                "gpuHours": 0,
                "gpuCost": 0,
                "gpuCostAdjustment": 0,
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 3060.37968,
                "networkCrossZoneCost": 0,
                "networkCrossRegionCost": 0,
                "networkInternetCost": 3060.37968,
                "networkCostAdjustment": 0,
                "loadBalancerCost": 0.8575,
                "loadBalancerCostAdjustment": -0.05,
                "pvBytes": 328963100745.16956,
                "pvByteHours": 52058410692923.08,
                "pvCost": 2.65661,
                "pvs": {
                    "cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8": {
                        "byteHours": 5017678139076.924,
                        "cost": 0.2560590069230769
                    },
                    "cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d": {
                        "byteHours": 15680244184615.385,
                        "cost": 0.800184396634615
                    },
                    "cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569": {
                        "byteHours": 15680244184615.385,
                        "cost": 0.800184396634615
                    },
                    "cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76": {
                        "byteHours": 15680244184615.385,
                        "cost": 0.800184396634615
                    }
                },
                "pvCostAdjustment": 1.85024,
                "ramBytes": 5183689897.47478,
                "ramByteRequestAverage": 4735645302.36124,
                "ramByteUsageAverage": 966905161.63903,
                "ramByteHours": 820318926275.3838,
                "ramCost": 2.78853,
                "ramCostAdjustment": 0.00006,
                "ramEfficiency": 0.20418,
                "externalCost": 0,
                "sharedCost": 3.19477,
                "totalCost": 3073.87495,
                "totalEfficiency": 0.19035,
                "rawAllocationOnly": null
            }
        }
    ]
}
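
One way to pull the interesting part out of a response like this is to list every entry that carries a network cost while reporting zero transferred and received bytes. The Python sketch below does that; the function name is hypothetical, and the sample is a trimmed copy of the network fields from the response above (all other fields omitted).

```python
import json


def cost_without_traffic(response):
    """Return {name: networkCost} for allocations that have a positive
    network cost but zero networkTransferBytes + networkReceiveBytes."""
    flagged = {}
    for allocation_set in response.get("data", []):
        for name, alloc in allocation_set.items():
            moved = alloc.get("networkTransferBytes", 0) + alloc.get("networkReceiveBytes", 0)
            if alloc.get("networkCost", 0) > 0 and moved == 0:
                flagged[name] = alloc["networkCost"]
    return flagged


# Trimmed copy of the response above: network fields only.
sample = json.loads("""
{"code": 200, "data": [{
  "__idle__": {"networkTransferBytes": 0, "networkReceiveBytes": 0, "networkCost": 0},
  "aks-agentpool-41384482-vmss000000":
    {"networkTransferBytes": 0, "networkReceiveBytes": 0, "networkCost": 0.80541},
  "aks-agentpool-41384482-vmss000001":
    {"networkTransferBytes": 0, "networkReceiveBytes": 0, "networkCost": 3060.37968}
}]}
""")

if __name__ == "__main__":
    for name, cost in cost_without_traffic(sample).items():
        print(name, cost)
```

On this sample both nodes are flagged: each has a non-zero `networkCost` while both byte counters are 0.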
AjayTripathy commented 1 year ago

@ivankube can you remove the aggregate by node and instead do aggregate by pod filtered down to the pod that they're looking for, and make sure we get data for that pod?

ivankube commented 1 year ago

@AjayTripathy This is my test cluster.

http://ivan.kubecost.xyz/model/allocation?aggregate=pod&filterPods=kubecost-network-costs-jwnbt&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437:



  "code": 200,
  "data": [
    {
      "__idle__": {
        "name": "__idle__",
        "properties": {
          "cluster": "cluster-one"
        },
        "window": {
          "start": "2023-03-09T00:00:00Z",
          "end": "2023-03-16T00:00:00Z"
        },
        "start": "2023-03-09T00:00:00Z",
        "end": "2023-03-15T19:40:00Z",
        "minutes": 9820,
        "cpuCores": 0,
        "cpuCoreRequestAverage": 0,
        "cpuCoreUsageAverage": 0,
        "cpuCoreHours": 0,
        "cpuCost": 0.88731,
        "cpuCostAdjustment": 0,
        "cpuEfficiency": 0,
        "gpuCount": 0,
        "gpuHours": 0,
        "gpuCost": 0,
        "gpuCostAdjustment": 0,
        "networkTransferBytes": 0,
        "networkReceiveBytes": 0,
        "networkCost": 0,
        "networkCrossZoneCost": 0,
        "networkCrossRegionCost": 0,
        "networkInternetCost": 0,
        "networkCostAdjustment": 0,
        "loadBalancerCost": 0,
        "loadBalancerCostAdjustment": 0,
        "pvBytes": 0,
        "pvByteHours": 0,
        "pvCost": 0,
        "pvs": null,
        "pvCostAdjustment": 0,
        "ramBytes": 0,
        "ramByteRequestAverage": 0,
        "ramByteUsageAverage": 0,
        "ramByteHours": 0,
        "ramCost": 0.0592,
        "ramCostAdjustment": 0,
        "ramEfficiency": 0,
        "externalCost": 0,
        "sharedCost": 0,
        "totalCost": 0.94651,
        "totalEfficiency": 0,
        "rawAllocationOnly": null
      },
      "kubecost-network-costs-jwnbt": {
        "name": "kubecost-network-costs-jwnbt",
        "properties": {
          "cluster": "cluster-one",
          "node": "aks-agentpool-41384482-vmss000001",
          "container": "kubecost-network-costs",
          "controller": "kubecost-network-costs",
          "controllerKind": "daemonset",
          "namespace": "kubecost",
          "pod": "kubecost-network-costs-jwnbt",
          "providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
        },
        "window": {
          "start": "2023-03-09T00:00:00Z",
          "end": "2023-03-16T00:00:00Z"
        },
        "start": "2023-03-09T00:00:00Z",
        "end": "2023-03-14T20:20:00Z",
        "minutes": 8420,
        "cpuCores": 0.0517,
        "cpuCoreRequestAverage": 0.05,
        "cpuCoreUsageAverage": 0.04053,
        "cpuCoreHours": 7.25514,
        "cpuCost": 0.19758,
        "cpuCostAdjustment": 0,
        "cpuEfficiency": 0.81058,
        "gpuCount": 0,
        "gpuHours": 0,
        "gpuCost": 0,
        "gpuCostAdjustment": 0,
        "networkTransferBytes": 0,
        "networkReceiveBytes": 0,
        "networkCost": 339.43089,
        "networkCrossZoneCost": 0,
        "networkCrossRegionCost": 0,
        "networkInternetCost": 339.43089,
        "networkCostAdjustment": 0,
        "loadBalancerCost": 0,
        "loadBalancerCostAdjustment": 0,
        "pvBytes": 0,
        "pvByteHours": 0,
        "pvCost": 0,
        "pvs": null,
        "pvCostAdjustment": 0,
        "ramBytes": 41132970.55903,
        "ramByteRequestAverage": 20971520,
        "ramByteUsageAverage": 44044950.76353,
        "ramByteHours": 5772326868.45058,
        "ramCost": 0.01962,
        "ramCostAdjustment": 0,
        "ramEfficiency": 2.10023,
        "externalCost": 0,
        "sharedCost": 0.33305,
        "totalCost": 339.98114,
        "totalEfficiency": 0.92708,
        "rawAllocationOnly": null
      }
    }
  ]
}
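The thing to notice in the response above is that `networkCost` is non-zero while `networkTransferBytes` and `networkReceiveBytes` are both 0. A rough way to scan an allocation response for that pattern is sketched below; the function name `find_network_mismatches` is made up for illustration, and the sample is a trimmed copy of the payload above with only the network fields kept.

```python
def find_network_mismatches(response):
    """Return names of allocations that have network cost but no byte counts."""
    mismatched = []
    for allocation_set in response.get("data", []):
        for name, alloc in allocation_set.items():
            has_cost = alloc.get("networkCost", 0) > 0
            no_bytes = (
                alloc.get("networkTransferBytes", 0) == 0
                and alloc.get("networkReceiveBytes", 0) == 0
            )
            if has_cost and no_bytes:
                mismatched.append(name)
    return mismatched


# Trimmed copy of the response above, keeping only the network fields.
sample = {
    "code": 200,
    "data": [
        {
            "__idle__": {
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 0,
            },
            "kubecost-network-costs-jwnbt": {
                "networkTransferBytes": 0,
                "networkReceiveBytes": 0,
                "networkCost": 339.43089,
            },
        }
    ],
}

mismatches = find_network_mismatches(sample)
print(mismatches)
```

The `__idle__` entry is not flagged because it carries no network cost at all; only the pod entry shows the cost-without-bytes mismatch.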

http://ivan.kubecost.xyz/model/allocation?aggregate=pod&filterPods=kubecost-prometheus-node-exporter-qdpd5&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437
{
  "code": 200,
  "data": [
    {
      "__idle__": {
        "name": "__idle__",
        "properties": {
          "cluster": "cluster-one"
        },
        "window": {
          "start": "2023-03-09T00:00:00Z",
          "end": "2023-03-16T00:00:00Z"
        },
        "start": "2023-03-09T00:00:00Z",
        "end": "2023-03-15T19:50:00Z",
        "minutes": 9830,
        "cpuCores": 0,
        "cpuCoreRequestAverage": 0,
        "cpuCoreUsageAverage": 0,
        "cpuCoreHours": 0,
        "cpuCost": 0.05388,
        "cpuCostAdjustment": 0,
        "cpuEfficiency": 0,
        "gpuCount": 0,
        "gpuHours": 0,
        "gpuCost": 0,
        "gpuCostAdjustment": 0,
        "networkTransferBytes": 0,
        "networkReceiveBytes": 0,
        "networkCost": 0,
        "networkCrossZoneCost": 0,
        "networkCrossRegionCost": 0,
        "networkInternetCost": 0,
        "networkCostAdjustment": 0,
        "loadBalancerCost": 0,
        "loadBalancerCostAdjustment": 0,
        "pvBytes": 0,
        "pvByteHours": 0,
        "pvCost": 0,
        "pvs": null,
        "pvCostAdjustment": 0,
        "ramBytes": 0,
        "ramByteRequestAverage": 0,
        "ramByteUsageAverage": 0,
        "ramByteHours": 0,
        "ramCost": 0.08435,
        "ramCostAdjustment": 0,
        "ramEfficiency": 0,
        "externalCost": 0,
        "sharedCost": 0,
        "totalCost": 0.13824,
        "totalEfficiency": 0,
        "rawAllocationOnly": null
      },
      "kubecost-prometheus-node-exporter-qdpd5": {
        "name": "kubecost-prometheus-node-exporter-qdpd5",
        "properties": {
          "cluster": "cluster-one",
          "node": "aks-agentpool-41384482-vmss000001",
          "container": "prometheus-node-exporter",
          "controller": "kubecost-prometheus-node-exporter",
          "controllerKind": "daemonset",
          "namespace": "kubecost",
          "pod": "kubecost-prometheus-node-exporter-qdpd5",
          "providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
        },
        "window": {
          "start": "2023-03-09T00:00:00Z",
          "end": "2023-03-16T00:00:00Z"
        },
        "start": "2023-03-09T00:00:00Z",
        "end": "2023-03-15T19:50:00Z",
        "minutes": 9830,
        "cpuCores": 0.00182,
        "cpuCoreRequestAverage": 0,
        "cpuCoreUsageAverage": 0.00238,
        "cpuCoreHours": 0.29871,
        "cpuCost": 0.00813,
        "cpuCostAdjustment": 0,
        "cpuEfficiency": 1,
        "gpuCount": 0,
        "gpuHours": 0,
        "gpuCost": 0,
        "gpuCostAdjustment": 0,
        "networkTransferBytes": 0,
        "networkReceiveBytes": 0,
        "networkCost": 381.00012,
        "networkCrossZoneCost": 0,
        "networkCrossRegionCost": 0,
        "networkInternetCost": 381.00012,
        "networkCostAdjustment": 0,
        "loadBalancerCost": 0,
        "loadBalancerCostAdjustment": 0,
        "pvBytes": 0,
        "pvByteHours": 0,
        "pvCost": 0,
        "pvs": null,
        "pvCostAdjustment": 0,
        "ramBytes": 25054822.5365,
        "ramByteRequestAverage": 0,
        "ramByteUsageAverage": 27554335.87209,
        "ramByteHours": 4104815092.22989,
        "ramCost": 0.01395,
        "ramCostAdjustment": 0,
        "ramEfficiency": 1,
        "externalCost": 0,
        "sharedCost": 0.38676,
        "totalCost": 381.40897,
        "totalEfficiency": 1,
        "rawAllocationOnly": null
      }
    }
  ]
}
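As a sanity check on the two pod entries above, the cost side is internally consistent: in both payloads `cpuCost + ramCost + networkCost + sharedCost` matches `totalCost` to within rounding, and network cost makes up nearly all of it. The sketch below only reproduces that arithmetic from the pasted values. It is an observation about these two responses (where the GPU, PV, load balancer and external costs are all 0), not a statement of how Kubecost computes `totalCost`.

```python
import math

# Cost fields copied from the two pod entries in the responses above.
pods = {
    "kubecost-network-costs-jwnbt": {
        "cpuCost": 0.19758,
        "ramCost": 0.01962,
        "networkCost": 339.43089,
        "sharedCost": 0.33305,
        "totalCost": 339.98114,
    },
    "kubecost-prometheus-node-exporter-qdpd5": {
        "cpuCost": 0.00813,
        "ramCost": 0.01395,
        "networkCost": 381.00012,
        "sharedCost": 0.38676,
        "totalCost": 381.40897,
    },
}


def component_sum(alloc):
    """Sum the non-zero cost components present in these payloads."""
    return (
        alloc["cpuCost"]
        + alloc["ramCost"]
        + alloc["networkCost"]
        + alloc["sharedCost"]
    )


for name, alloc in pods.items():
    consistent = math.isclose(component_sum(alloc), alloc["totalCost"], abs_tol=1e-4)
    network_share = alloc["networkCost"] / alloc["totalCost"]
    print(f"{name}: consistent={consistent}, network share={network_share:.4f}")
```

So the cost is being attributed to these pods while the byte counters that would back it stay at 0.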

AjayTripathy commented 1 year ago

It looks like the underlying query was broken. See https://github.com/opencost/opencost/pull/1774/files for a fix.

mclean0328 commented 1 year ago

@AjayTripathy catching up here. Is your proposed change a fix for this issue?

AjayTripathy commented 1 year ago

Yes.