Closed j0nnyr0berts closed 3 years ago
total_costs includes both DBU and compute costs; is that what you're trying to compare? Look at the data model: total costs == compute + dbu. Also, make sure your configured costs reflect what you actually pay; review the custom costing section in the docs.
Overwatch costs are not exact; they are estimates. They should track actual costs closely but are unlikely to "match", as they are derived rather than looked up from true cost tables.
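To make that relationship concrete, here is a minimal Python sketch. The field names (`compute_cost`, `dbu_cost`) and the dollar figures are made up for illustration and are not taken from the Overwatch schema, so check the data model for the real column names.

```python
from dataclasses import dataclass


@dataclass
class CostRow:
    # Hypothetical field names, not the actual Overwatch columns.
    compute_cost: float  # cloud provider charge for the machines, in $
    dbu_cost: float      # Databricks DBU charge, in $

    @property
    def total_cost(self) -> float:
        # Total cost is the sum of the two components.
        return self.compute_cost + self.dbu_cost


# Made-up figures for a single cluster state.
row = CostRow(compute_cost=1.20, dbu_cost=0.80)
print(f"total: ${row.total_cost:.2f}")
```

If the figure you are comparing against covers only DBUs, the like-for-like comparison would be against the DBU component rather than the total.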
Thanks! Costs aside, do you know if there's any way to reconcile compute hours between the accounts usage stats and Overwatch? I've attached the output for a specific job from both. Trying to understand what 'machineHours' in the accounts page relates to (if anything!) in Overwatch.
account_usage_job_70245.csv overwatch_job_70245.csv
Also, I notice that in Overwatch runs for the current day, automated externally triggered jobs are missing. Is this due to the fact that audit logs are only delivered daily?
What cloud platform do you use?
AWS
You're correct, logs are delivered once a day. As for the machine hours, Overwatch provides core_hours. Dividing core_hours by the number of cores on the machine gives the aggregate machine hours for the cluster across all nodes. All of that information is available in the 'ClusterStateFact' table.
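As a rough sketch of that division in plain Python: the function name and the figures below are hypothetical, and it assumes every node in the cluster has the same core count (a cluster whose driver and workers differ would need to be handled per node type).

```python
def machine_hours(core_hours: float, cores_per_node: int) -> float:
    """Convert aggregate core hours into aggregate machine hours.

    Assumes all nodes have the same number of cores.
    """
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    return core_hours / cores_per_node


# Made-up example: 4 nodes with 8 cores each, running for 2 hours,
# accumulate 4 * 8 * 2 = 64 core hours.
example_core_hours = 4 * 8 * 2
print(machine_hours(example_core_hours, 8))  # 8.0, i.e. 4 nodes x 2 hours
```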
Thank you!
Hi, I'd like to use Overwatch to monitor Databricks spend during the day. I've successfully run the modules required to generate the `clusterstatefact` table; however, when I try to compare any of the aggregated costs against the account usage page, they have a similar shape but differ considerably in $ value.
I am matching the `interactiveDBUPrice`/`automatedDBUPrice` compute cost variables in Overwatch with the `All purpose compute` and `jobs compute` inputs to the account pricing section options in Account usage.
I have tried comparing the output of Overwatch and account usage for the same job id run, but am unable to reconcile any of the Overwatch columns to the `dbus` or `machineHours` in the account output.
It would be great to know if it's possible to generate the same output as the account usage cost estimation using Overwatch so that I can get an hourly view. Failing that, it would be great to understand any differences between the two outputs!
For context, I would naively expect the daily sum of total_cost in Overwatch to be close to the estimated daily $ I see in the Accounts Usage page. However, the Overwatch value is much higher, often over 2x.
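The comparison described here can be sketched in plain Python as below. All timestamps and dollar amounts are made up, and the input shapes (a list of `(timestamp, cost)` pairs and a dict of daily estimates) are assumptions for the example, not the format of either export.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(rows):
    """Sum (ISO timestamp, cost) pairs into a {YYYY-MM-DD: total} dict."""
    totals = defaultdict(float)
    for ts, cost in rows:
        day = datetime.fromisoformat(ts).date().isoformat()
        totals[day] += cost
    return dict(totals)


# Made-up hourly total_cost values standing in for the Overwatch output.
overwatch_hourly = [
    ("2021-06-01T09:00:00", 1.5),
    ("2021-06-01T10:00:00", 2.5),
    ("2021-06-02T09:00:00", 3.0),
]
# Made-up daily estimates standing in for the account usage page.
account_daily = {"2021-06-01": 2.0, "2021-06-02": 1.5}

overwatch_daily = daily_totals(overwatch_hourly)
# Ratio of the Overwatch total to the account usage estimate, per day.
ratios = {
    day: overwatch_daily[day] / account_daily[day]
    for day in overwatch_daily
    if day in account_daily
}
print(ratios)
```

A ratio that stays roughly constant across days would suggest a systematic difference (for example, one side including a cost component the other leaves out) rather than noise.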