hystax / optscale

FinOps, MLOps and cloud cost optimization tool. Supports AWS, Azure, GCP, Alibaba Cloud and Kubernetes.
https://hystax.com
Apache License 2.0
1.2k stars, 168 forks

Different RDS values when comparing AWS Cost Explorer and OptScale Cost Explorer #62

Open HugoDCileiro opened 1 year ago

HugoDCileiro commented 1 year ago

I compared per-service costs from AWS Cost Explorer with those from OptScale's Cost Explorer. The OptScale total was $1200 higher than the AWS total. I looked for possible causes (costs attributed to different services, a bug, a counting error, etc.) but found no explanation. Attached is an Excel file with the daily comparison and the total for each platform. The biggest gap is in RDS, where OptScale's Cost Explorer reports more than $800 above AWS Cost Explorer.

Attachment: analyze cost optscale aws cost explorer.xlsx

tguisep commented 1 year ago

Hello,

Did you check network traffic? Also, OptScale does not take AWS commitment plans into account (should that be a feature?).

Thomas.

HugoDCileiro commented 1 year ago

Hello. Yes, the VPC cost is within the difference threshold ($4.23 difference between AWS Cost Explorer and OptScale's Cost Explorer). As for reservations, we haven't yet looked into RDS instance reservations in depth. However, even if the RDS cost turns out to match on both platforms after analysis, other services still show a fairly large difference, for example CloudFront ($83.33), S3 ($59.18) and ES / OpenSearch ($55.41), among others. Is a difference of this size considered normal when comparing costs between the two platforms?
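A comparison like the one described above can be sketched as a small script that flags every service whose gap exceeds a chosen threshold. This is only a sketch: the function name `cost_differences`, the $5 threshold and the monthly totals are assumptions made up for illustration, and they are not taken from the attached spreadsheet.

```python
def cost_differences(aws_costs, optscale_costs, threshold=5.0):
    """Return {service: optscale - aws} for services whose gap exceeds threshold."""
    services = set(aws_costs) | set(optscale_costs)
    diffs = {}
    for service in services:
        diff = optscale_costs.get(service, 0.0) - aws_costs.get(service, 0.0)
        if abs(diff) > threshold:
            diffs[service] = round(diff, 2)
    # Largest absolute gap first, so the worst offenders are at the top.
    return dict(sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True))


# Illustrative monthly totals only (hypothetical values).
aws = {'RDS': 1000.00, 'CloudFront': 200.00, 'VPC': 50.00}
optscale = {'RDS': 1800.00, 'CloudFront': 283.33, 'VPC': 54.23}

flagged = cost_differences(aws, optscale)
for service, diff in flagged.items():
    print(f'{service}: {diff:+.2f}')
```

Services missing on one side are treated as costing 0.0 there, so a service that only one platform reports is flagged as well.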

tguisep commented 1 year ago

Hello,

I did a deep analysis of the cost difference.

Some points:

Using the OptScale API /raw_expenses, I wrote an external script built around a rewritten version of this parser: https://github.com/hystax/optscale/blob/52ca25d758ce2bc9e0bb4c38f539d79a4d6d5351/diworker/diworker/importers/aws.py#L565

Like this:

from datetime import datetime, timedelta

# EDP rate from your AWS contract, e.g. 0.05 for 5%. Replace with your value.
EDP_RATE = 0.05


def _datetime_from_expense(expense, key):
    # Stand-in for the OptScale helper of the same name (an assumption):
    # accepts a datetime or a string such as '2023-01-01T00:00:00Z'.
    value = expense[key]
    if isinstance(value, datetime):
        return value
    return datetime.strptime(value, '%Y-%m-%dT%H:%M:%SZ')


def clean_expenses_for_resource(resource_id, expenses, edp=EDP_RATE):
    clean_expenses = {}

    for e in expenses:
        start_date = _datetime_from_expense(e, 'lineItem/UsageStartDate')
        end_date = _datetime_from_expense(e, 'lineItem/UsageEndDate')

        # end date may point to the 00:00 on the next day,
        # so to avoid confusion removing one second
        end_date -= timedelta(seconds=1)
        days = (end_date - start_date).days + 1
        if days < 1:
            # zero-length usage interval: nothing to spread over days
            continue

        # Rows carrying an EDP discount use the blended cost and record the
        # discount share; all other rows use the 'cost' field as-is.
        if 'discount/EdpDiscount' in e and 'lineItem/BlendedCost' in e:
            discount = float(e['lineItem/BlendedCost']) * edp / days
            cost = float(e['lineItem/BlendedCost']) / days
        else:
            discount = 0.0
            cost = float(e['cost']) / days

        # spread the line item evenly over every day it covers
        for d in range(days):
            date = start_date + timedelta(days=d)
            day = date.replace(hour=0, minute=0, second=0, microsecond=0).timestamp()
            if day not in clean_expenses:
                clean_expenses[day] = {
                    'discount': 0.0,
                    'date': day,
                    'cost': 0.0,
                    'resource_id': resource_id,
                    'cloud_account_id': e['cloud_account_id']
                }
            clean_expenses[day]['discount'] += discount
            clean_expenses[day]['cost'] += cost

    return clean_expenses

Run on ~10k resources (a bit of everything) across 20+ accounts for an entire month, the result was close to 100% accurate: only a few cents of difference on a few accounts versus Cost Explorer.

@sd-hystax @HugoDCileiro
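The even per-day split that the parser relies on can be checked in isolation. The snippet below is a minimal sketch of just that step; `split_cost_per_day` is a hypothetical helper name and the dates and cost are made-up values, not real billing data.

```python
from datetime import datetime, timedelta


def split_cost_per_day(start, end, cost):
    """Spread `cost` evenly over the calendar days covered by [start, end)."""
    # The end may point to 00:00 on the next day, so step back one second
    # before counting days (same trick as in the parser).
    end -= timedelta(seconds=1)
    days = (end - start).days + 1
    per_day = {}
    for d in range(days):
        day = (start + timedelta(days=d)).date()
        per_day[day] = per_day.get(day, 0.0) + cost / days
    return per_day


# A line item covering 1 to 3 January (end is 00:00 on 4 January), cost 30.0.
shares = split_cost_per_day(datetime(2023, 1, 1), datetime(2023, 1, 4), 30.0)
for day, share in shares.items():
    print(day, share)
```

Without the one-second adjustment the same item would be counted as four days and each day's share would come out too low.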

tguisep commented 1 year ago

Secondly, VAT (TVA) is spread per day in OptScale (total VAT / number of days), whereas AWS Cost Explorer books it on the first day of the month.
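To illustrate why this makes daily figures differ while monthly totals still agree, here is a small sketch of the two allocation schemes. The function names and the tax amount are assumptions for illustration only; neither platform's actual code is shown.

```python
import calendar


def tax_spread_daily(total_tax, year, month):
    """Per-day allocation: the same share of tax on every day of the month."""
    days = calendar.monthrange(year, month)[1]
    return [total_tax / days] * days


def tax_on_first_day(total_tax, year, month):
    """First-day allocation: the whole tax amount on day 1, nothing afterwards."""
    days = calendar.monthrange(year, month)[1]
    return [total_tax] + [0.0] * (days - 1)


# Hypothetical tax of 310.0 for January 2023 (31 days).
daily = tax_spread_daily(310.0, 2023, 1)
first = tax_on_first_day(310.0, 2023, 1)

print(daily[0], first[0])        # day 1 differs a lot between the two schemes
print(sum(daily), sum(first))    # the monthly totals agree
```

So a day-by-day comparison will show a large gap on the first day and small gaps on every other day, even when nothing is actually wrong.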

jakelima18 commented 1 year ago

Hi @tguisep, how do you define your EDP value? Where can I get this information?

tguisep commented 1 year ago

It's linked to the contract of your organization with AWS. 0.05 = 5%

jakelima18 commented 1 year ago

Would I have to add this function to a file such as optscale/diworker/diworker/importers/aws.py, or do I need to make some other modification? @tguisep

maxb-hystax commented 1 year ago

Yes, that is the best place to apply the discount, because diworker aggregates costs from raw expenses and writes them to the expenses table in ClickHouse, which is widely used for other aggregations. You can hardcode it for your private use, but in a proper implementation the discount value should be a property of the cloud account.
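As a rough illustration of "discount as a property of the cloud account", the sketch below stores the rate per account instead of hardcoding it. Everything here is hypothetical: the `CloudAccount` class, the `edp_rate` field and the `discount_for` function are invented for the example and are not OptScale's actual schema or API.

```python
from dataclasses import dataclass


@dataclass
class CloudAccount:
    # Hypothetical model for illustration, not OptScale's actual schema.
    id: str
    name: str
    edp_rate: float = 0.0  # e.g. 0.05 for a 5% EDP discount; 0.0 means none


def discount_for(account, blended_cost):
    """Discount implied by the account's own EDP rate."""
    return blended_cost * account.edp_rate


accounts = {
    'acc-1': CloudAccount('acc-1', 'prod', edp_rate=0.05),
    'acc-2': CloudAccount('acc-2', 'sandbox'),  # no EDP contract
}

for account in accounts.values():
    print(account.name, discount_for(account, 200.0))
```

Keeping the rate on the account means organizations with different contracts, or accounts with no discount at all, can share the same import code.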

tguisep commented 1 year ago

@jakelima18 I didn't fix anything in the current OptScale code. I wrote my own external expense parser based on the OptScale code/API (a kind of BI module).

I'm waiting for an official, proper implementation. As @maxb-hystax noted, implementing this fix in the code is the best option, but I don't know the overall codebase well enough to do it properly or to understand all the potential impacts of the change.

Still, even if the values reported by OptScale do not exactly match the AWS billing, they are accurate enough to give a good idea of the spend.