FinOps-Open-Cost-and-Usage-Spec / FOCUS_Spec

The Unifying Specification for Cloud Billing Data
https://focus.finops.org

[SPEC CHANGE]: Is the inclusion of custom/native columns in the FOCUS dataset recommended #602

Open ijurica opened 1 week ago

ijurica commented 1 week ago

Type of Issue

Glossary Term

Normalized

Description

I assume we all expect custom/native (prefixed with x) columns, which provide information that can't be resolved from FOCUS columns, to be included in the FOCUS dataset, and not just in separate native provider cost and usage datasets. The FOCUS spec mentions that these columns must be prefixed with x, but doesn't indicate whether their inclusion is preferred.

For instance, in the case of OCI, lineItem/referenceNo, lineItem/backReference, and product/compartmentId provide valuable information, but these columns are not included in the OCI FOCUS dataset; they are only available in the native OCI Cost and Usage reports. Additionally, there is no way to reliably correlate charge records between these two datasets.
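As a rough illustration of the convention discussed above, a consumer could separate custom columns from FOCUS columns purely by their prefix. This is a minimal sketch that assumes the prefix is written as `x_`. The names `x_ReferenceNo` and `x_CompartmentId` are hypothetical stand-ins for the OCI fields mentioned, not columns that OCI actually publishes, and `split_columns` is an illustrative helper.

```python
def split_columns(columns, prefix="x_"):
    """Partition column names into (standard, custom) by prefix."""
    standard = [c for c in columns if not c.startswith(prefix)]
    custom = [c for c in columns if c.startswith(prefix)]
    return standard, custom


# Hypothetical header of a FOCUS export that also carries native columns.
columns = ["BilledCost", "ChargePeriodStart", "x_ReferenceNo", "x_CompartmentId"]

standard, custom = split_columns(columns)
print("FOCUS columns: ", standard)
print("Custom columns:", custom)
```

With the native values carried in the same rows, no cross-dataset correlation step would be needed.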

Definition of Done

Revise the FOCUS dataset glossary term to suggest that it should also include external, custom columns that provide information not covered by the FOCUS columns

Context / Supporting Information

No response

jpradocueva commented 1 week ago

TF-2 call on Oct 15:

Topic: #602 Inclusion of custom/native columns in the FOCUS dataset

Discussion Summary: This issue discussed the need to enrich the FOCUS dataset by including native columns from providers. Their inclusion was assumed to be part of the dataset, but it has not been applied consistently by all providers (e.g., OCI, Azure). The absence of these native columns has made it difficult to correlate certain data and charge records.

Resolution: The issue will be backfilled with a work item that clearly states the problem and use case. Examples of the missing data will also be included to make it easier for stakeholders to understand the need for these columns. Once the work item is complete, it will be added to the 1.2 milestone.

jpradocueva commented 5 days ago

Summary Members' call on Oct 17:

#602 [SPEC CHANGE]: Is the inclusion of custom/native columns in the FOCUS dataset recommended?

Primary Issue: The issue explores whether the specification should formally recommend the inclusion of custom or native columns in the FOCUS dataset, particularly when dealing with provider-specific data.

Core Problem: While the dataset is designed for standardized reporting across providers, there is increasing demand from practitioners to include custom or native columns specific to a cloud provider's offerings. This raises the question of how to integrate provider-specific data without compromising the integrity of the FOCUS dataset's standardization goals.

Divergent Views: Some members advocated for allowing custom columns because they provide flexibility for different provider features. Others were concerned that too much customization could lead to inconsistencies and make it harder to maintain a unified dataset across providers.

Final Agreement: The group agreed to create a framework for including custom columns but emphasized the need for guidelines to prevent over-customization. The specification will recommend custom/native columns where necessary, but their inclusion will need to follow specific rules to maintain overall standardization.

Action Items:

ahullah commented 17 hours ago

One more observation to consider here, as it represents another point where aggregation can get messed up. For example, suppose we have a resource that would previously have been represented by a single row:

Name, Usage Hours, Cost
Resource A, 720, $100

If a provider wants to extend this record with a proprietary field, we would expect them to recalculate the metric columns so that they accurately represent activity, or at the least divide the metric values by the number of rows they need to add to represent the values in the new column. So this should be:

Name, Usage Hours, Cost, Role (Proprietary)
Resource A, 520, $80, Active
Resource A, 200, $20, Standby

and not:

Name, Usage Hours, Cost, Role (Proprietary)
Resource A, 720, $100, Active
Resource A, 720, $100, Standby

(As you can see, in this case both usage hours and cost are artificially doubled.)
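The double-counting concern can be checked mechanically. The following is a minimal sketch, using the figures from the example above, of a check that a row split preserves the aggregated metrics. The field names (`usage_hours`, `cost`, `role`) and the helpers `totals` and `preserves_totals` are hypothetical and are not part of the FOCUS specification.

```python
from decimal import Decimal


def totals(rows):
    """Sum the metric columns across all rows."""
    hours = sum(r["usage_hours"] for r in rows)
    cost = sum(r["cost"] for r in rows)
    return hours, cost


def preserves_totals(before, after):
    """True if splitting rows left the aggregated metrics unchanged."""
    return totals(before) == totals(after)


original = [{"name": "Resource A", "usage_hours": 720, "cost": Decimal("100")}]

# Metrics re-apportioned across the new proprietary dimension.
split_ok = [
    {"name": "Resource A", "usage_hours": 520, "cost": Decimal("80"), "role": "Active"},
    {"name": "Resource A", "usage_hours": 200, "cost": Decimal("20"), "role": "Standby"},
]

# Metrics copied onto every new row: totals are inflated.
duplicated = [
    {"name": "Resource A", "usage_hours": 720, "cost": Decimal("100"), "role": "Active"},
    {"name": "Resource A", "usage_hours": 720, "cost": Decimal("100"), "role": "Standby"},
]

print(preserves_totals(original, split_ok))    # True
print(preserves_totals(original, duplicated))  # False
```

A check of this kind could be run by a provider or a practitioner whenever a custom column increases the row count for a charge.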