Closed mclean0328 closed 1 year ago
Hi @mclean0328 could you share the kubecost version you're on? I have seen this before but I believe this got fixed in a later version.
@AjayTripathy just checked and version is v1.99.0
Tagging @nealormsbee - I know we just updated this page, would the new API calls fix this by chance?
Unfortunately no -- the API update was for orphaned resources. We'll look at how we present data for this page -- maybe there's an obvious issue. I may need to come back and ask for reproduction details.
I'm a moron - My fault.
Propose this is P1, but let me know if others are feeling differently
Checked this out and wasn't able to reproduce an issue. From a code perspective, we're just taking this query http://nightly.kubecost.io:9090/model/savings/abandonedWorkloads?days=2&threshold=500 and rendering the respective `egressBytesPerSecond` and `ingressBytesPerSecond` directly from the payload. Maybe there's a larger backend situation afoot?
@teevans, @wolfeaustin, Should I route to core team to research?
Yessir!
At a glance, no issues here. So I'll need an environment with an active repro.
@mclean0328 can you provide that? And ideally the `values.yaml` for it? (Is it `kc-demo-stage` as the screenshot indicates?)
I have a theory here...is it possible that we're getting network costs via reconciliation and peanut butter spreading them to a workload with no actual traffic? We should normalize reconciliation by number of bytes consumed by the pod ... @Sean-Holcomb would be the expert here.
Yeah, I think that might be right, Ajay. I was barking up the same tree, wanting to see the `values.yaml`. But if network daemonset is enabled, and is working... I have no thoughts. Would have to just dig in and investigate where along the chain things are dropping out.
Although, wait... this is not network daemonset related.
queryFmtNetReceiveBytes = `sum(increase(container_network_receive_bytes_total{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)`
queryFmtNetTransferBytes = `sum(increase(container_network_transmit_bytes_total{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)`
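For anyone who wants to run these by hand: the two `%s` placeholders have to be filled in before the queries can be pasted into Prometheus. A minimal sketch, assuming the first placeholder is the range window and the second is the cluster label. The `build_query` helper, the `cluster_id` default, and the optional pod filter are illustrative assumptions, not something taken from the Kubecost code:

```python
from typing import Optional

# Format strings copied from the comment above; the two %s slots are
# assumed to be (range window, cluster label), in that order.
QUERY_FMT_NET_RECEIVE_BYTES = (
    'sum(increase(container_network_receive_bytes_total'
    '{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)'
)
QUERY_FMT_NET_TRANSFER_BYTES = (
    'sum(increase(container_network_transmit_bytes_total'
    '{pod!="", container="POD"}[%s])) by (pod_name, pod, namespace, %s)'
)


def build_query(
    fmt: str,
    window: str = "7d",
    cluster_label: str = "cluster_id",  # assumed label name
    pod: Optional[str] = None,
) -> str:
    """Fill in the placeholders and optionally narrow the query to one pod."""
    query = fmt % (window, cluster_label)
    if pod is not None:
        # Swap the "any pod" matcher for an exact match on the given pod name.
        query = query.replace('pod!=""', 'pod="%s"' % pod)
    return query


if __name__ == "__main__":
    print(build_query(QUERY_FMT_NET_RECEIVE_BYTES))
    print(build_query(QUERY_FMT_NET_TRANSFER_BYTES, window="2d", pod="my-pod"))
```

A result that comes back empty for a pod known to have traffic would point at the label matchers rather than at the pod itself.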
Right. But I think if you check the code... reconciliation will apply cost regardless of whether or not there is traffic, whereas it probably should spread costs proportionally based on those numbers.
It's been a while, but network costs are distributed by weight if there are traffic numbers, and evenly if the network pod is not set up.
So yeah, that sounds like exactly the issue^
Can we do one better here and go:
Yeah, agree with Ajay. This would be great to revisit w/r/t reconciliation-from-Cloud-Costs work. (We can also fix in the current implementation, but might be nice to not duplicate work.)
And to clarify, "raw byte numbers" come from cAdvisor's `container_network_...` metrics, which we should always have. If those don't exist, then maybe we need a third fallback, but that should be rare.
One confounding issue here is that the network daemonset is, apparently, enabled on this cluster. So that cuts against this hypothesis and line of reasoning. But I've not been able to connect to the cluster to find out more. All I know is that we think these are the `values.yaml` used: https://github.com/kubecost/se-demo/blob/main/kc-demo/aws-kc-demo-prod-primary/kc-demo-prod-helm-values.yaml
@nikovacevic - This feels like we need to punt. Tentatively marking for 1.101.
Customer responded (Enterprise customer)
"Thanks for getting back. We are facing serious issue due to incorrect forecasting values. Our business needs to understand how much we could saving in in Q2 on our kubernetes workload. We cannot provide that as Kubecost dashboard figures may incorrect. This will have an impact to our financial goals. Appreciate an ETA on the fix"
cc @teevans If we have any update we can provide to customers
@nikovacevic / @AjayTripathy - What can I do to help us figure this one out?
@teevans I spoke with Ajay on this one; how about I broker a call with the customer and we can get someone from engineering on the call? Who would be a good pick?
This needs to be broken into two parts.
1. Confirm that the hypothesis is what's causing the discrepancy for the user: those pods actually are egressing/ingressing 0 bytes, and network reconciliation is allocating them a cost share anyway, since we spread the cost of a node across all pods on that node evenly. Explain that issue clearly and document it. We shouldn't need someone on engineering to do this; just find a few pods with 0 B ingress/egress, run the above Prometheus queries (https://github.com/kubecost/cost-analyzer-helm-chart/issues/1917#issuecomment-1414971515) with the correct pod names templated in, and double-check that they are actually zero or close to zero.
2. Migrate network reconciliation from a peanut-butter spread per pod on a node to a proportional split based on byte egress. That's for engineering to do, but we should confirm that's actually the issue with #1 before doing that.
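To make the difference between the two strategies concrete, here is a rough sketch of an even ("peanut butter") split next to a proportional split by bytes. The function names, pod names, and numbers are made up for illustration; this is not the reconciliation code itself:

```python
from typing import Dict


def split_evenly(node_cost: float, pod_bytes: Dict[str, float]) -> Dict[str, float]:
    """'Peanut butter' spread: every pod gets the same share, traffic is ignored."""
    if not pod_bytes:
        return {}
    share = node_cost / len(pod_bytes)
    return {pod: share for pod in pod_bytes}


def split_by_bytes(node_cost: float, pod_bytes: Dict[str, float]) -> Dict[str, float]:
    """Proportional split by bytes; falls back to an even split if no bytes were seen."""
    total = sum(pod_bytes.values())
    if total <= 0:
        return split_evenly(node_cost, pod_bytes)
    return {pod: node_cost * b / total for pod, b in pod_bytes.items()}


if __name__ == "__main__":
    # Hypothetical per-pod byte counts on one node.
    traffic = {"busy-pod": 9_000_000.0, "quiet-pod": 1_000_000.0, "idle-pod": 0.0}
    print(split_evenly(30.0, traffic))    # idle-pod is still charged a third
    print(split_by_bytes(30.0, traffic))  # idle-pod is charged nothing
```

With the even split, a pod that moved no bytes still carries a full share of the node's network cost, which matches the symptom described in the hypothesis.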
@rossfisherkc were we able to get these steps out to the user to confirm the hypothesis?
It seems like pods with traffic are showing 0 for ingress and egress in the Abandoned Workloads:
az aks get-credentials --resource-group khandkcost --name khand-dev-1
Customer confirmed the same in ZD #3524. I'm waiting for them to run Prometheus queries.
I'm working with the customer on writing some API queries. Here is what they were able to share:
`container_network_receive_bytes_total`
```
{
beta_kubernetes_io_arch: "amd64",
beta_kubernetes_io_instance_type: "custom-16-32768",
beta_kubernetes_io_os: "linux",
business: "REDACTED",
cloud_google_com_gke_boot_disk: "pd-standard",
cloud_google_com_gke_container_runtime: "containerd",
cloud_google_com_gke_cpu_scaling_level: "16",
cloud_google_com_gke_max_pods_per_node: "110",
cloud_google_com_gke_nodepool: "REDACTED",
cloud_google_com_gke_os_distribution: "cos",
cloud_google_com_machine_family: "custom-16",
cluster_name: "REDACTED",
env: "production",
failure_domain_beta_kubernetes_io_region: "REDACTED",
failure_domain_beta_kubernetes_io_zone: "REDACTED",
id: "/kubepods/REDACTED",
image: "< REDACTED >",
instance: "REDACTED",
interface: "eth0",
job: "kubernetes-nodes-cadvisor",
kubernetes_io_arch: "amd64",
kubernetes_io_hostname: "REDACTED",
kubernetes_io_os: "linux",
name: "REDACTED",
namespace: "production",
node_kubernetes_io_instance_type: "custom-16-32768",
pod: "cms-worker-REDACTED",
pod_name: "cms-worker-REDACTED",
team: "platform",
topology_kubernetes_io_region: "REDACTED",
topology_kubernetes_io_zone: "REDACTED",
}
```
`container_network_transmit_bytes_total`
```
{
beta_kubernetes_io_arch: "amd64",
beta_kubernetes_io_instance_type: "custom-16-32768",
beta_kubernetes_io_os: "linux",
business: "REDACTED",
cloud_google_com_gke_boot_disk: "pd-standard",
cloud_google_com_gke_container_runtime: "containerd",
cloud_google_com_gke_cpu_scaling_level: "16",
cloud_google_com_gke_max_pods_per_node: "110",
cloud_google_com_gke_nodepool: "REDACTED",
cloud_google_com_gke_os_distribution: "cos",
cloud_google_com_machine_family: "custom-16",
cluster_name: "REDACTED",
env: "production",
failure_domain_beta_kubernetes_io_region: "us-REDACTED",
failure_domain_beta_kubernetes_io_zone: "us-REDACTED",
id: "/kubepods/burstable/REDACTED",
image: "k8s.gcr.io/REDACTED",
instance: "REDACTED",
interface: "eth0",
job: "kubernetes-nodes-cadvisor",
kubernetes_io_arch: "amd64",
kubernetes_io_hostname: "REDACTED",
kubernetes_io_os: "linux",
name: "REDACTED",
namespace: "production",
node_kubernetes_io_instance_type: "custom-16-32768",
pod: "REDACTED",
pod_name: "REDACTED",
team: "platform",
topology_kubernetes_io_region: "us-REDACTED",
topology_kubernetes_io_zone: "us-REDACTED-a",
}
```
Ok-- @ivankube @rossfisherkc the next step would be to see what the allocations API says for the same time period.
/model/allocation?aggregate=node&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437
Something like this.
{"code":200,"data":[{"":{"name":"","properties":{"cluster":"cluster-one","container":"__unmounted__"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T15:00:00Z","minutes":9540,"cpuCores":0,"cpuCoreRequestAverage":0,"cpuCoreUsageAverage":0,"cpuCoreHours":0,"cpuCost":0,"cpuCostAdjustment":0,"cpuEfficiency":0,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":36521247671.17561,"pvByteHours":5806878379716.92,"pvCost":0.29633,"pvs":{"cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48":{"byteHours":106548227150.76918,"cost":0.005437302369230767},"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397":{"byteHours":426192908603.0767,"cost":0.02174920947692307},"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76":{"byteHours":1331852839384.6147,"cost":0.06796627961538462},"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50":{"byteHours":426192908603.0767,"cost":0.02174920947692307}},"pvCostAdjustment":0.20522,"ramBytes":0,"ramByteRequestAverage":0,"ramByteUsageAverage":0,"ramByteHours":0,"ramCost":0,"ramCostAdjustment":0,"ramEfficiency":0,"externalCost":0,"sharedCost":0,"totalCost":0.50155,"totalEfficiency":0,"rawA
llocationOnly":null},"__idle__":{"name":"__idle__","properties":{"cluster":"cluster-one"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T12:00:00Z","minutes":9360,"cpuCores":0,"cpuCoreRequestAverage":0,"cpuCoreUsageAverage":0,"cpuCoreHours":0,"cpuCost":26.6091,"cpuCostAdjustment":0,"cpuEfficiency":0,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":0,"pvByteHours":0,"pvCost":0,"pvs":null,"pvCostAdjustment":0,"ramBytes":0,"ramByteRequestAverage":0,"ramByteUsageAverage":0,"ramByteHours":0,"ramCost":12.27403,"ramCostAdjustment":0,"ramEfficiency":0,"externalCost":0,"sharedCost":0,"totalCost":38.88312,"totalEfficiency":0,"rawAllocationOnly":null},"aks-agentpool-41384482-vmss000000":{"name":"aks-agentpool-41384482-vmss000000","properties":{"cluster":"cluster-one","node":"aks-agentpool-41384482-vmss000000","providerID":"azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/0"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T14:15:00Z","minutes":9495,"cpuCores":1.20213,"cpuCoreRequestAverage":1.11413,"cpuCoreUsageAverage":0.16433,"cpuCoreHours":190.23736,"cpuCost":5.18073,"cpuCostAdjustment":0.00011,"cpuEfficiency":0.14749,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":0.80541,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":0.80541,"networkCostAdjustment":0,"loadBalancerCost":0,"loadBalancerCostAdjustment":0,"pvBytes":103048682161.13744,"pvByteHours"
:16307453952000,"pvCost":0.83219,"pvs":{"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48":{"byteHours":1254419534769.231,"cost":0.06401475173076923},"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0":{"byteHours":5017678139076.924,"cost":0.2560590069230769},"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397":{"byteHours":5017678139076.922,"cost":0.25605900692307687},"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50":{"byteHours":5017678139076.924,"cost":0.2560590069230769}},"pvCostAdjustment":0.61239,"ramBytes":5096269863.01574,"ramByteRequestAverage":3696243652.14534,"ramByteUsageAverage":2052567556.36228,"ramByteHours":806484705822.2408,"ramCost":2.74151,"ramCostAdjustment":0.00006,"ramEfficiency":0.55531,"externalCost":0,"sharedCost":0.0106,"totalCost":10.18301,"totalEfficiency":0.28862,"rawAllocationOnly":null},"aks-agentpool-41384482-vmss000001":{"name":"aks-agentpool-41384482-vmss000001","properties":{"cluster":"cluster-one","node":"aks-agentpool-41384482-vmss000001","providerID":"azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"},"window":{"start":"2023-03-08T00:00:00Z","end":"2023-03-15T00:00:00Z"},"start":"2023-03-08T00:00:00Z","end":"2023-03-14T14:15:00Z","minutes":9495,"cpuCores":0.50991,"cpuCoreRequestAverage":0.48024,"cpuCoreUsageAverage":0.08298,"cpuCoreHours":80.69293,"cpuCost":2.19751,"cpuCostAdjustment":0.00004,"cpuEfficiency":0.1728,"gpuCount":0,"gpuHours":0,"gpuCost":0,"gpuCostAdjustment":0,"networkTransferBytes":0,"networkReceiveBytes":0,"networkCost":3060.37968,"networkCrossZoneCost":0,"networkCrossRegionCost":0,"networkInternetCost":3060.37968,"networkCostAdjustment":0,"loadBalancerCost":0.8575,"loadBalancerCostAdjustment":-0.05,"pvBytes":328963100745.16956,"pvByteHours":52058410692923.08,"pvCost":2.65661,"pvs":{"cluster=cl
uster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8":{"byteHours":5017678139076.924,"cost":0.2560590069230769},"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d":{"byteHours":15680244184615.385,"cost":0.800184396634615},"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569":{"byteHours":15680244184615.385,"cost":0.800184396634615},"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76":{"byteHours":15680244184615.385,"cost":0.800184396634615}},"pvCostAdjustment":1.85024,"ramBytes":5183689897.47478,"ramByteRequestAverage":4735645302.36124,"ramByteUsageAverage":966905161.63903,"ramByteHours":820318926275.3838,"ramCost":2.78853,"ramCostAdjustment":0.00006,"ramEfficiency":0.20418,"externalCost":0,"sharedCost":3.19477,"totalCost":3073.87495,"totalEfficiency":0.19035,"rawAllocationOnly":null}}]}
making this readable
{
"code": 200,
"data": [
{
"": {
"name": "",
"properties": {
"cluster": "cluster-one",
"container": "__unmounted__"
},
"window": {
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-15T00:00:00Z"
},
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-14T15:00:00Z",
"minutes": 9540,
"cpuCores": 0,
"cpuCoreRequestAverage": 0,
"cpuCoreUsageAverage": 0,
"cpuCoreHours": 0,
"cpuCost": 0,
"cpuCostAdjustment": 0,
"cpuEfficiency": 0,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 0,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 0,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 36521247671.17561,
"pvByteHours": 5806878379716.92,
"pvCost": 0.29633,
"pvs": {
"cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8": {
"byteHours": 426192908603.0767,
"cost": 0.02174920947692307
},
"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48": {
"byteHours": 106548227150.76918,
"cost": 0.005437302369230767
},
"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d": {
"byteHours": 1331852839384.6147,
"cost": 0.06796627961538462
},
"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0": {
"byteHours": 426192908603.0767,
"cost": 0.02174920947692307
},
"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397": {
"byteHours": 426192908603.0767,
"cost": 0.02174920947692307
},
"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569": {
"byteHours": 1331852839384.6147,
"cost": 0.06796627961538462
},
"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76": {
"byteHours": 1331852839384.6147,
"cost": 0.06796627961538462
},
"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50": {
"byteHours": 426192908603.0767,
"cost": 0.02174920947692307
}
},
"pvCostAdjustment": 0.20522,
"ramBytes": 0,
"ramByteRequestAverage": 0,
"ramByteUsageAverage": 0,
"ramByteHours": 0,
"ramCost": 0,
"ramCostAdjustment": 0,
"ramEfficiency": 0,
"externalCost": 0,
"sharedCost": 0,
"totalCost": 0.50155,
"totalEfficiency": 0,
"rawAllocationOnly": null
},
"__idle__": {
"name": "__idle__",
"properties": {
"cluster": "cluster-one"
},
"window": {
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-15T00:00:00Z"
},
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-14T12:00:00Z",
"minutes": 9360,
"cpuCores": 0,
"cpuCoreRequestAverage": 0,
"cpuCoreUsageAverage": 0,
"cpuCoreHours": 0,
"cpuCost": 26.6091,
"cpuCostAdjustment": 0,
"cpuEfficiency": 0,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 0,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 0,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 0,
"pvByteHours": 0,
"pvCost": 0,
"pvs": null,
"pvCostAdjustment": 0,
"ramBytes": 0,
"ramByteRequestAverage": 0,
"ramByteUsageAverage": 0,
"ramByteHours": 0,
"ramCost": 12.27403,
"ramCostAdjustment": 0,
"ramEfficiency": 0,
"externalCost": 0,
"sharedCost": 0,
"totalCost": 38.88312,
"totalEfficiency": 0,
"rawAllocationOnly": null
},
"aks-agentpool-41384482-vmss000000": {
"name": "aks-agentpool-41384482-vmss000000",
"properties": {
"cluster": "cluster-one",
"node": "aks-agentpool-41384482-vmss000000",
"providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/0"
},
"window": {
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-15T00:00:00Z"
},
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-14T14:15:00Z",
"minutes": 9495,
"cpuCores": 1.20213,
"cpuCoreRequestAverage": 1.11413,
"cpuCoreUsageAverage": 0.16433,
"cpuCoreHours": 190.23736,
"cpuCost": 5.18073,
"cpuCostAdjustment": 0.00011,
"cpuEfficiency": 0.14749,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 0.80541,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 0.80541,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 103048682161.13744,
"pvByteHours": 16307453952000,
"pvCost": 0.83219,
"pvs": {
"cluster=cluster-one:name=pvc-3230089c-d9ad-4944-91e7-3da6cf9caf48": {
"byteHours": 1254419534769.231,
"cost": 0.06401475173076923
},
"cluster=cluster-one:name=pvc-732044fd-2f39-4335-846a-5bfe40a2d4c0": {
"byteHours": 5017678139076.924,
"cost": 0.2560590069230769
},
"cluster=cluster-one:name=pvc-af6e32a9-fe29-476e-8bfa-992b4d4bc397": {
"byteHours": 5017678139076.922,
"cost": 0.25605900692307687
},
"cluster=cluster-one:name=pvc-fd096da6-5a94-48e0-8147-434872d73e50": {
"byteHours": 5017678139076.924,
"cost": 0.2560590069230769
}
},
"pvCostAdjustment": 0.61239,
"ramBytes": 5096269863.01574,
"ramByteRequestAverage": 3696243652.14534,
"ramByteUsageAverage": 2052567556.36228,
"ramByteHours": 806484705822.2408,
"ramCost": 2.74151,
"ramCostAdjustment": 0.00006,
"ramEfficiency": 0.55531,
"externalCost": 0,
"sharedCost": 0.0106,
"totalCost": 10.18301,
"totalEfficiency": 0.28862,
"rawAllocationOnly": null
},
"aks-agentpool-41384482-vmss000001": {
"name": "aks-agentpool-41384482-vmss000001",
"properties": {
"cluster": "cluster-one",
"node": "aks-agentpool-41384482-vmss000001",
"providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
},
"window": {
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-15T00:00:00Z"
},
"start": "2023-03-08T00:00:00Z",
"end": "2023-03-14T14:15:00Z",
"minutes": 9495,
"cpuCores": 0.50991,
"cpuCoreRequestAverage": 0.48024,
"cpuCoreUsageAverage": 0.08298,
"cpuCoreHours": 80.69293,
"cpuCost": 2.19751,
"cpuCostAdjustment": 0.00004,
"cpuEfficiency": 0.1728,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 3060.37968,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 3060.37968,
"networkCostAdjustment": 0,
"loadBalancerCost": 0.8575,
"loadBalancerCostAdjustment": -0.05,
"pvBytes": 328963100745.16956,
"pvByteHours": 52058410692923.08,
"pvCost": 2.65661,
"pvs": {
"cluster=cluster-one:name=pvc-2bd8a964-1bd7-4ed9-a7b7-c61874e954a8": {
"byteHours": 5017678139076.924,
"cost": 0.2560590069230769
},
"cluster=cluster-one:name=pvc-702d44f2-d686-44c0-a71c-e7fe6896b94d": {
"byteHours": 15680244184615.385,
"cost": 0.800184396634615
},
"cluster=cluster-one:name=pvc-d9236d99-0d44-4095-acce-c20dd6b6f569": {
"byteHours": 15680244184615.385,
"cost": 0.800184396634615
},
"cluster=cluster-one:name=pvc-d957a2f4-b3d4-49f0-b971-40f368d6ab76": {
"byteHours": 15680244184615.385,
"cost": 0.800184396634615
}
},
"pvCostAdjustment": 1.85024,
"ramBytes": 5183689897.47478,
"ramByteRequestAverage": 4735645302.36124,
"ramByteUsageAverage": 966905161.63903,
"ramByteHours": 820318926275.3838,
"ramCost": 2.78853,
"ramCostAdjustment": 0.00006,
"ramEfficiency": 0.20418,
"externalCost": 0,
"sharedCost": 3.19477,
"totalCost": 3073.87495,
"totalEfficiency": 0.19035,
"rawAllocationOnly": null
}
}
]
}
@ivankube can you remove the aggregate by node and instead do aggregate by pod filtered down to the pod that they're looking for, and make sure we get data for that pod?
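A per-pod request along those lines could be assembled as below. This is only a sketch: the `allocation_url` helper and the base URL are placeholders, and the parameters are limited to ones that already appear in this thread (`aggregate`, `filterPods`, `window`, `accumulate`):

```python
from urllib.parse import urlencode


def allocation_url(base_url: str, pod: str, window: str = "7d") -> str:
    """Build an allocation query aggregated by pod and filtered to a single pod."""
    params = {
        "aggregate": "pod",
        "filterPods": pod,
        "window": window,
        "accumulate": "true",
    }
    return base_url.rstrip("/") + "/model/allocation?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder host and pod name; substitute the real frontend address and pod.
    print(allocation_url("http://localhost:9090", "my-pod"))
```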
@AjayTripathy This is my test cluster.
{
"code": 200,
"data": [
{
"__idle__": {
"name": "__idle__",
"properties": {
"cluster": "cluster-one"
},
"window": {
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-16T00:00:00Z"
},
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-15T19:40:00Z",
"minutes": 9820,
"cpuCores": 0,
"cpuCoreRequestAverage": 0,
"cpuCoreUsageAverage": 0,
"cpuCoreHours": 0,
"cpuCost": 0.88731,
"cpuCostAdjustment": 0,
"cpuEfficiency": 0,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 0,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 0,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 0,
"pvByteHours": 0,
"pvCost": 0,
"pvs": null,
"pvCostAdjustment": 0,
"ramBytes": 0,
"ramByteRequestAverage": 0,
"ramByteUsageAverage": 0,
"ramByteHours": 0,
"ramCost": 0.0592,
"ramCostAdjustment": 0,
"ramEfficiency": 0,
"externalCost": 0,
"sharedCost": 0,
"totalCost": 0.94651,
"totalEfficiency": 0,
"rawAllocationOnly": null
},
"kubecost-network-costs-jwnbt": {
"name": "kubecost-network-costs-jwnbt",
"properties": {
"cluster": "cluster-one",
"node": "aks-agentpool-41384482-vmss000001",
"container": "kubecost-network-costs",
"controller": "kubecost-network-costs",
"controllerKind": "daemonset",
"namespace": "kubecost",
"pod": "kubecost-network-costs-jwnbt",
"providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
},
"window": {
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-16T00:00:00Z"
},
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-14T20:20:00Z",
"minutes": 8420,
"cpuCores": 0.0517,
"cpuCoreRequestAverage": 0.05,
"cpuCoreUsageAverage": 0.04053,
"cpuCoreHours": 7.25514,
"cpuCost": 0.19758,
"cpuCostAdjustment": 0,
"cpuEfficiency": 0.81058,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 339.43089,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 339.43089,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 0,
"pvByteHours": 0,
"pvCost": 0,
"pvs": null,
"pvCostAdjustment": 0,
"ramBytes": 41132970.55903,
"ramByteRequestAverage": 20971520,
"ramByteUsageAverage": 44044950.76353,
"ramByteHours": 5772326868.45058,
"ramCost": 0.01962,
"ramCostAdjustment": 0,
"ramEfficiency": 2.10023,
"externalCost": 0,
"sharedCost": 0.33305,
"totalCost": 339.98114,
"totalEfficiency": 0.92708,
"rawAllocationOnly": null
}
}
]
}
http://ivan.kubecost.xyz/model/allocation?aggregate=pod&filterPods=kubecost-prometheus-node-exporter-qdpd5&window=7d&accumulate=true&shareIdle=false&splitIdle=false&idleByNode=false&shareTenancyCosts=true&shareNamespaces=&shareCost=NaN&shareSplit=weighted&chartType=1&costMetric=1&startIndex=0&maxResults=0&req=1677803551437
{
"code": 200,
"data": [
{
"__idle__": {
"name": "__idle__",
"properties": {
"cluster": "cluster-one"
},
"window": {
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-16T00:00:00Z"
},
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-15T19:50:00Z",
"minutes": 9830,
"cpuCores": 0,
"cpuCoreRequestAverage": 0,
"cpuCoreUsageAverage": 0,
"cpuCoreHours": 0,
"cpuCost": 0.05388,
"cpuCostAdjustment": 0,
"cpuEfficiency": 0,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 0,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 0,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 0,
"pvByteHours": 0,
"pvCost": 0,
"pvs": null,
"pvCostAdjustment": 0,
"ramBytes": 0,
"ramByteRequestAverage": 0,
"ramByteUsageAverage": 0,
"ramByteHours": 0,
"ramCost": 0.08435,
"ramCostAdjustment": 0,
"ramEfficiency": 0,
"externalCost": 0,
"sharedCost": 0,
"totalCost": 0.13824,
"totalEfficiency": 0,
"rawAllocationOnly": null
},
"kubecost-prometheus-node-exporter-qdpd5": {
"name": "kubecost-prometheus-node-exporter-qdpd5",
"properties": {
"cluster": "cluster-one",
"node": "aks-agentpool-41384482-vmss000001",
"container": "prometheus-node-exporter",
"controller": "kubecost-prometheus-node-exporter",
"controllerKind": "daemonset",
"namespace": "kubecost",
"pod": "kubecost-prometheus-node-exporter-qdpd5",
"providerID": "azure:///subscriptions/0bd50fdf-c923-4e1e-850c-196dd3dcc5d3/resourceGroups/mc_khandkcost_khand-dev-1_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-41384482-vmss/virtualMachines/1"
},
"window": {
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-16T00:00:00Z"
},
"start": "2023-03-09T00:00:00Z",
"end": "2023-03-15T19:50:00Z",
"minutes": 9830,
"cpuCores": 0.00182,
"cpuCoreRequestAverage": 0,
"cpuCoreUsageAverage": 0.00238,
"cpuCoreHours": 0.29871,
"cpuCost": 0.00813,
"cpuCostAdjustment": 0,
"cpuEfficiency": 1,
"gpuCount": 0,
"gpuHours": 0,
"gpuCost": 0,
"gpuCostAdjustment": 0,
"networkTransferBytes": 0,
"networkReceiveBytes": 0,
"networkCost": 381.00012,
"networkCrossZoneCost": 0,
"networkCrossRegionCost": 0,
"networkInternetCost": 381.00012,
"networkCostAdjustment": 0,
"loadBalancerCost": 0,
"loadBalancerCostAdjustment": 0,
"pvBytes": 0,
"pvByteHours": 0,
"pvCost": 0,
"pvs": null,
"pvCostAdjustment": 0,
"ramBytes": 25054822.5365,
"ramByteRequestAverage": 0,
"ramByteUsageAverage": 27554335.87209,
"ramByteHours": 4104815092.22989,
"ramCost": 0.01395,
"ramCostAdjustment": 0,
"ramEfficiency": 1,
"externalCost": 0,
"sharedCost": 0.38676,
"totalCost": 381.40897,
"totalEfficiency": 1,
"rawAllocationOnly": null
}
}
]
}
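The pattern in these responses (a nonzero `networkCost` alongside zero `networkTransferBytes` and `networkReceiveBytes`) can be checked mechanically across a whole response. A small sketch, assuming the response shape shown above; the `zero_byte_network_costs` helper is hypothetical and the sample is trimmed down from the response above to the relevant fields:

```python
import json
from typing import Any, Dict, List


def zero_byte_network_costs(response: Dict[str, Any]) -> List[str]:
    """Return names of allocations that carry network cost but record no bytes."""
    suspicious = []
    for allocation_set in response.get("data", []):
        for name, alloc in allocation_set.items():
            total_bytes = alloc.get("networkTransferBytes", 0) + alloc.get(
                "networkReceiveBytes", 0
            )
            if alloc.get("networkCost", 0) > 0 and total_bytes == 0:
                suspicious.append(name)
    return suspicious


if __name__ == "__main__":
    # Trimmed copy of the per-pod response above, keeping only the network fields.
    sample = json.loads(
        """
        {"code": 200, "data": [{
          "__idle__": {"networkTransferBytes": 0, "networkReceiveBytes": 0,
                       "networkCost": 0},
          "kubecost-prometheus-node-exporter-qdpd5": {
              "networkTransferBytes": 0, "networkReceiveBytes": 0,
              "networkCost": 381.00012}
        }]}
        """
    )
    print(zero_byte_network_costs(sample))
```

Any pod this flags is a candidate for the Prometheus cross-check: if the raw byte counters are nonzero there, the zeros are coming from the query rather than from the workload.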
Seems to me like the underlying query was broken. See https://github.com/opencost/opencost/pull/1774/files for a fix.
@AjayTripathy catching up here. Is your proposed change a fix for this issue?
Yes.
Describe the bug
When on the `/abandoned-workloads` view, all statefulset workloads always report ‘Data ingress’ and ‘Data egress’ to have ‘0 B/s’ despite there being significant network traffic.
To Reproduce
Steps to reproduce the behavior:
Compare this against the network costs of the same workload found on the ‘Allocations’ page and you will see there are network costs.
Expected behavior
For workloads that are known to have network activity, there should be reported X B/s for ‘Data ingress’ and ‘Data egress’.
Screenshots