kubecost / features-bugs

A public repository for filing of Kubecost feature requests and bugs. Please read the issue guidelines before filing an issue here.

AWS fargate pricing does not match #93

Open avinavgit opened 1 year ago

avinavgit commented 1 year ago

Describe the bug
We are running a POC on Kubecost version 103.3. Once the dashboard is up, when I navigate to Monitor -> select a namespace -> select a deployment and expand the container, the prices and rates shown do not match the actual AWS rates.

To Reproduce
Navigate to - Monitor->select a namespace -> select a deployment and expand the container

  1. Open the Kubecost (103.3) UI
  2. Select Monitor from the left menu
  3. Select a namespace which is fargate enabled
  4. Select a deployment
  5. Expand the container.
  6. Check the CPU and RAM pricing rate - CPU US$0.032/CoreHR RAM US$0.004/GiBHR

Expected behavior
As per the AWS documentation https://aws.amazon.com/fargate/pricing/ Frankfurt region

per vCPU per hour | $0.04656
per GB per hour | $0.00511
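
To put the gap in numbers, here is a minimal Python sketch that compares the rates shown in Kubecost with the Frankfurt Fargate rates quoted above. The pod size in the usage note (0.25 vCPU, 0.5 GB) is a made-up example, and the sketch ignores the GiB-versus-GB unit difference between the two sources.

```python
# Rates shown in the Kubecost UI (step 6 of the reproduction steps).
KUBECOST_CPU = 0.032   # USD per core-hour
KUBECOST_RAM = 0.004   # USD per GiB-hour

# AWS Fargate rates for Frankfurt, as quoted from the AWS pricing page.
AWS_CPU = 0.04656      # USD per vCPU-hour
AWS_RAM = 0.00511      # USD per GB-hour


def hourly_cost(vcpu, mem_gb, cpu_rate, ram_rate):
    """Hourly cost of a pod with the given CPU and memory at the given rates."""
    return vcpu * cpu_rate + mem_gb * ram_rate


def percent_below(shown, expected):
    """How far (in percent) the shown rate falls below the expected rate."""
    return (expected - shown) / expected * 100.0


cpu_gap = percent_below(KUBECOST_CPU, AWS_CPU)
ram_gap = percent_below(KUBECOST_RAM, AWS_RAM)
print(f"CPU rate is {cpu_gap:.1f}% below the AWS rate")
print(f"RAM rate is {ram_gap:.1f}% below the AWS rate")
```

With these numbers the CPU rate comes out roughly 31% below the AWS rate and the RAM rate roughly 22% below. Calling hourly_cost(0.25, 0.5, AWS_CPU, AWS_RAM) and the same with the Kubecost rates shows the per-pod effect.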

Screenshots
Screen shot showing kubecost pricing considered.

image

Screenshot in AWS documentation https://aws.amazon.com/fargate/pricing/

Screenshot 2023-06-13 at 4 37 25 PM

kwombach12 commented 1 year ago

@nikovacevic Could you take a look at this? Would this be resolved by the fargate improvement proposal you created?

nikovacevic commented 1 year ago

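@kwombach12 Not sure, from this report. It could be a few things:

  • perhaps yes, and we're simply whiffing on Fargate pricing and falling back on default pricing;
  • or perhaps we're failing to detect a region, and so we're using pricing from a different region;
  • or it could be that there is a discount getting applied, erroneously;
  • or something not on this list, of course.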

@avinavgit -- in order to help address the issue, can I ask you to run some diagnostic queries? If you can access your Prometheus (if it's a standard install, something like kubectl port-forward service/kubecost-prometheus-server 9003:80, then go to localhost:9003) and run the following:

avg(avg_over_time(node_cpu_hourly_cost{node="the-node-name"}[1d])) by (node, instance_type, provider_id)
avg(avg_over_time(node_ram_hourly_cost{node="the-node-name"}[1d])) by (node, instance_type, provider_id)

(Substituting the real node name, of course, and perhaps the relevant window.) If you can confirm for me what pricing shows up here, then we can narrow down where the issue lies. Thank you!
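
If the Prometheus UI is inconvenient, the same two queries can be sent to the Prometheus HTTP API (GET /api/v1/query). The following is a rough sketch, assuming the port-forward above is active on localhost:9003. The helper names (build_query, parse_instant_vector, run_query) are made up for illustration and are not part of Kubecost.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable through the port-forward described above.
PROMETHEUS_URL = "http://localhost:9003"


def build_query(metric, node, window="1d"):
    """Return the diagnostic PromQL expression for one cost metric."""
    return (
        f'avg(avg_over_time({metric}{{node="{node}"}}[{window}])) '
        "by (node, instance_type, provider_id)"
    )


def parse_instant_vector(payload):
    """Turn a /api/v1/query JSON response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


def run_query(expr, base_url=PROMETHEUS_URL):
    """Send one instant query to Prometheus and parse the result."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_instant_vector(json.load(resp))
```

For example, run_query(build_query("node_cpu_hourly_cost", "the-node-name")) returns the label set and the averaged hourly CPU price for that node; swap in node_ram_hourly_cost for the RAM price.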

nikovacevic commented 1 year ago

@kwombach12 @avinavgit my expectation is that, due to the state of our support for Fargate, this is falling back on default pricing. (Our default pricing is "CPU": "0.031611", "RAM": "0.004237"--a near perfect match, but tough to say with the rounding on that page.) In which case, yes, the Fargate improvement plan would address this issue.

jcharcalla commented 1 year ago

perhaps yes, and we're simply whiffing on Fargate pricing and falling back on default pricing;

@nikovacevic This may be related... I've finally been able to repro the Error: Invalid Pricing Key "us-west-1,,linux" errors. They only occur when Fargate nodes are part of the cluster. It looks like it may be due to the missing 'instance_type' field.

2023-06-21T18:28:22.195690336Z INF Error getting node pricing. Error: Invalid Pricing Key "us-west-1,,linux"
2023-06-21T18:28:22.195773077Z INF Error getting node pricing. Error: Invalid Pricing Key "us-west-1,,linux"
...
2023-06-21T18:28:49.037681189Z WRN CostModel.ComputeAllocation: Node CPU cost query result missing field: "'instance_type' field does not exist in data result vector" for node "fargate-ip-192-168-115-120.us-west-1.compute.internal"
2023-06-21T18:28:49.037950272Z WRN CostModel.ComputeAllocation: Node CPU cost query result missing field: "'instance_type' field does not exist in data result vector" for node "fargate-ip-192-168-69-30.us-west-1.compute.internal"
...
2023-06-21T18:28:49.039820895Z WRN CostModel.ComputeAllocation: Node spot query result for missing node: cluster-one/fargate-ip-192-168-115-120.us-west-1.compute.internal
2023-06-21T18:28:49.039872776Z WRN CostModel.ComputeAllocation: Node spot query result for missing node: cluster-one/fargate-ip-192-168-69-30.us-west-1.compute.internal

There is also a previous similar Fargate issue here: https://github.com/kubecost/cost-analyzer-helm-chart/issues/2092
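
To illustrate why the key in those log lines looks broken, here is a small sketch that splits a pricing key into its parts. The layout region,instance_type,os is an assumption inferred from the log output and the missing 'instance_type' field, not taken from the Kubecost source. The function names are hypothetical.

```python
# Assumed field order of a pricing key; inferred from the logs, not confirmed.
KEY_FIELDS = ("region", "instance_type", "os")


def split_pricing_key(key):
    """Split a comma-separated pricing key into named fields."""
    parts = key.split(",")
    if len(parts) != len(KEY_FIELDS):
        raise ValueError(f"expected {len(KEY_FIELDS)} fields, got {len(parts)}")
    return dict(zip(KEY_FIELDS, parts))


def missing_fields(key):
    """Return the names of the fields that are empty in the key."""
    return [name for name, value in split_pricing_key(key).items() if not value]


print(missing_fields("us-west-1,,linux"))
```

Under that assumed layout, the key from the logs has a region and an operating system but an empty instance type, which is consistent with the warnings about the missing 'instance_type' field on the Fargate nodes.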

nikovacevic commented 1 year ago

Great find @jcharcalla -- I have linked your comment to the Fargate improvement plan ticket and doc, so that we can keep track of the whole constellation of issues.

avinavgit commented 1 year ago

@kwombach12 Not sure, from this report. It could be a few things:

  • perhaps yes, and we're simply whiffing on Fargate pricing and falling back on default pricing;
  • or perhaps we're failing to detect a region, and so we're using pricing from a different region;
  • or it could be that there is a discount getting applied, erroneously;
  • or something not on this list, of course.

@avinavgit -- in order to help address the issue, can I ask you to run some diagnostic queries? If you can access your Prometheus (if it's a standard install, something like kubectl port-forward service/kubecost-prometheus-server 9003:80, then go to localhost:9003) and run the following:

avg(avg_over_time(node_cpu_hourly_cost{node="the-node-name"}[1d])) by (node, instance_type, provider_id)
avg(avg_over_time(node_ram_hourly_cost{node="the-node-name"}[1d])) by (node, instance_type, provider_id)

(Substituting the real node name, of course, and perhaps the relevant window.) If you can confirm for me what pricing shows up here, then we can narrow down where the issue lies. Thank you!

Thank you for the response. I have run the queries, and the results are below:

{node="fargate-ip-172-18-27-51.eu-central-1.compute.internal", provider_id="aws:///eu-central-1a/6fb257c2bf-b01e72ebee6e456981223c977a585f30/fargate-ip-172-18-27-51.eu-central-1.compute.internal"} 0.031611
{node="fargate-ip-172-18-27-51.eu-central-1.compute.internal", provider_id="aws:///eu-central-1a/6fb257c2bf-b01e72ebee6e456981223c977a585f30/fargate-ip-172-18-27-51.eu-central-1.compute.internal"} 0.004237
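
As a side check on the region hypothesis, the provider_id in these results already carries the availability zone. Below is a minimal sketch of pulling the zone and region out of it. It assumes the ID starts with aws:/// followed by the zone, and that the zone name is the region plus one trailing letter, which holds for eu-central-1a but not necessarily for every zone type. The function name is made up for illustration.

```python
def zone_and_region(provider_id):
    """Extract (zone, region) from an AWS provider ID like the ones above."""
    prefix = "aws:///"
    if not provider_id.startswith(prefix):
        raise ValueError(f"not an AWS provider ID: {provider_id!r}")
    zone = provider_id[len(prefix):].split("/")[0]
    # Assumption: a standard zone name is the region plus one trailing letter.
    region = zone[:-1] if zone and zone[-1].isalpha() else zone
    return zone, region


provider_id = (
    "aws:///eu-central-1a/6fb257c2bf-b01e72ebee6e456981223c977a585f30/"
    "fargate-ip-172-18-27-51.eu-central-1.compute.internal"
)
print(zone_and_region(provider_id))
```

For the result above this gives eu-central-1a and eu-central-1, so the region is at least recoverable from the labels. Note also that neither result carries an instance_type label.
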
nikovacevic commented 1 year ago

Thank you @avinavgit -- I can confirm that those are precisely the default prices, so there is a high probability that Kubecost is simply not detecting node pricing here at all. Executing on the Fargate improvement proposal, as @kwombach12 suggested, would address this.
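
The comparison behind that conclusion can be sketched in a few lines of Python. The default values are the ones quoted earlier in this thread. The function name and tolerance are illustrative choices, not Kubecost code.

```python
import math

# Default prices quoted earlier in this thread (USD per hour).
DEFAULT_CPU_HOURLY = 0.031611
DEFAULT_RAM_HOURLY = 0.004237


def uses_default_pricing(cpu_hourly, ram_hourly, rel_tol=1e-6):
    """True if both observed prices equal the defaults within a tolerance."""
    return math.isclose(
        cpu_hourly, DEFAULT_CPU_HOURLY, rel_tol=rel_tol
    ) and math.isclose(ram_hourly, DEFAULT_RAM_HOURLY, rel_tol=rel_tol)


# Values reported by the diagnostic queries for the Fargate node.
print(uses_default_pricing(0.031611, 0.004237))
# AWS Frankfurt Fargate rates from the original report, for contrast.
print(uses_default_pricing(0.04656, 0.00511))
```

Rounded to three decimals, the defaults also give the 0.032 and 0.004 shown in the UI.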

avinavgit commented 1 year ago

@jcharcalla, in our org we are using serverless EKS, which runs entirely on Fargate. When can we expect a fix for this?

yomofo2s commented 1 year ago

Hi, we are also getting exactly the same error as posted by @nikovacevic. In addition, we also get this warning: WRN CostModel.ComputeAllocation: Node GPU cost query result missing field: "'instance_type' field does not exist in data result vector" for node "fargate-ip-x-x-x-x.eu-central-1.compute.internal"

AjayTripathy commented 12 months ago

Hey folks, our current EKS fargate pricing is an estimate. There is planned work to get higher accuracy. You can find the summary of challenges and planned work here:

https://docs.google.com/document/d/1VHWXJ3rOMMRSpg--TkAHTinjZbL8O4P3u3ukuIxWQnk/edit

chipzoller commented 4 months ago

Note that further Fargate support is tracked internally via KC-49. Move this to features-bugs.

csmith-poppulo commented 3 weeks ago

After reading through all of the following links I am not sure what the current status is for EKS/Fargate support with Kubecost. Can someone please clarify in simple terms for my simple mind? :)

https://github.com/kubecost/cost-analyzer-helm-chart/issues/2092
https://docs.google.com/document/d/1VHWXJ3rOMMRSpg--TkAHTinjZbL8O4P3u3ukuIxWQnk/edit#heading=h.bwsl16egxxgl
https://github.com/opencost/opencost/issues/1622

As an end user I do not have access to Jira to look at the state of any of the tickets that have been mentioned in relation to this issue and the others linked above.

chipzoller commented 3 weeks ago

Kubecost support for Fargate can currently be summarized as basic, best effort, and incomplete support with no current plans on record to significantly extend/enhance this level of support.

nikovacevic commented 3 weeks ago

@chipzoller is correct. I'll just add a bit about why that is the case. AWS Fargate sort of "hijacks" Kubernetes and runs jobs that have different resource utilization and cost, according to AWS, than what the Kubernetes API exposes to Kubecost. It also breaks the identifiers Kubecost uses to relate those containers and nodes back to the AWS Cost and Usage Report. So Kubecost will pick up workloads and assign resources and costs, but they will be more inaccurate than usual and do not get reconciled with CURs. Hope that helps.