FinOps-Open-Cost-and-Usage-Spec / FOCUS_Spec

The Unifying Specification for Cloud Billing Data
https://focus.finops.org

SkuCapacity / SkuCapacityUnit #320

Open marc-perreaut opened 10 months ago

marc-perreaut commented 10 months ago

Description

It's interesting to have a high-level capacity or "footprint" view for discussions with leaders and for cost analysis. The capacity metric would be, for example, vCPUs for VMs, TB for storage, or GB/s for network (split into inbound and outbound where that makes sense), whatever the SKU. Here are example questions:

  • How many CPUs / TBs do we have overall / in this region / in this application? Which regions / applications have the most CPUs / TBs?
  • Can we have the capacity details by VM series/family?
  • What's the cost impact if we switch from this VM series/family to that VM series/family?
  • What percentage of the CPUs (capacity) have we committed?

Proposed approach

One approach is to add SkuCapacity and SkuCapacityUnit columns to provide the capacity information, which can be seen as a high-level attribute of the SKU. SkuCapacityUnit should be standardized (an allowlist of permitted values), so that SKU capacity can be aggregated and reported across SKUs and across providers, by SkuCapacityUnit.
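As a rough illustration of the aggregation this would enable, here is a minimal sketch. It assumes the two proposed columns exist on each row; the allowlist contents, the sample rows, and the helper name `capacity_by_unit` are hypothetical and not part of any spec.

```python
from collections import defaultdict

# Hypothetical allowlist; the proposal only says units should be standardized.
ALLOWED_UNITS = {"vCPU", "TB", "GB/s"}

# Made-up sample rows. SkuCapacity and SkuCapacityUnit are the proposed columns.
rows = [
    {"ProviderName": "A", "RegionId": "eu-west", "SkuCapacity": 8, "SkuCapacityUnit": "vCPU"},
    {"ProviderName": "B", "RegionId": "eu-west", "SkuCapacity": 16, "SkuCapacityUnit": "vCPU"},
    {"ProviderName": "A", "RegionId": "us-east", "SkuCapacity": 2, "SkuCapacityUnit": "TB"},
]


def capacity_by_unit(rows, group_key=None):
    """Sum SkuCapacity per unit, or per (group, unit) when group_key is given."""
    totals = defaultdict(float)
    for row in rows:
        unit = row["SkuCapacityUnit"]
        if unit not in ALLOWED_UNITS:
            # Units outside the allowlist cannot be aggregated safely.
            raise ValueError(f"unit not in allowlist: {unit!r}")
        key = (row[group_key], unit) if group_key else unit
        totals[key] += row["SkuCapacity"]
    return dict(totals)


overall = capacity_by_unit(rows)
per_region = capacity_by_unit(rows, group_key="RegionId")
print(overall)
print(per_region)
```

Because the unit is part of the grouping key, vCPUs and TBs are never summed together, regardless of provider.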

GitHub issue or Reference

I am not aware of any existing GitHub issue related to this topic.

Context

Capacity information can be calculated by parsing SKU information, which is provider-specific. It would bring value to have the capacity information natively available to practitioners. Note: practitioners would need to take SkuCapacityUnit into account to calculate time-averaged capacity.
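One possible reading of "time-averaged capacity" is capacity weighted by how long it was in use during the reporting period, computed separately per unit. The sketch below assumes that reading; the `UsageHours` field and the function name are hypothetical, and in practice the duration would have to be derived from the charge data.

```python
from collections import defaultdict


def time_averaged_capacity(rows, period_hours):
    """Average capacity over the period, per SkuCapacityUnit.

    Each row contributes SkuCapacity * UsageHours; the sum is divided by the
    length of the reporting period in hours.
    """
    capacity_hours = defaultdict(float)
    for row in rows:
        capacity_hours[row["SkuCapacityUnit"]] += row["SkuCapacity"] * row["UsageHours"]
    return {unit: total / period_hours for unit, total in capacity_hours.items()}


# Made-up example: a 30-day month (720 hours).
rows = [
    {"SkuCapacity": 8, "SkuCapacityUnit": "vCPU", "UsageHours": 720},   # ran all month
    {"SkuCapacity": 16, "SkuCapacityUnit": "vCPU", "UsageHours": 360},  # ran half the month
]
avg = time_averaged_capacity(rows, period_hours=720)
print(avg)
```

Here the 16-vCPU machine counts as 8 vCPUs on average because it ran for half of the period.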

sireeshaoram commented 10 months ago

By focusing on capacity metrics like vCPUs, TB, and GB/s, organizations can make informed decisions about scaling resources up or down based on the demands of their applications. This contributes to performance optimization and cost-efficiency.

hrishikeshsardar commented 10 months ago

SkuCapacity/SkuCapacityUnit are definitely important for cross-referencing configurations and costs across cloud service providers. This will help practitioners or consultants recommend the best-suited solution for a project/application.

Happy to provide more context in upcoming meetings.

ahullah commented 9 months ago

Would you think of this as a core part of the data set, or as a reference table that sits alongside the main data set for dynamic enrichment?

hrishikeshsardar commented 9 months ago

@ahullah, adding columns SkuCapacity and SkuCapacityUnit to the core data set (detailed spec) makes sense to me as a practitioner. It helps in many scenarios, like analyzing on-demand vs commitment to derive scope further.

These two columns alone might not be enough for the analysis; it also requires other columns that identify the charge type, alongside the quantity billed on-demand vs under commitment.
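To make the on-demand vs commitment split concrete, here is a minimal sketch of a "committed share of capacity" calculation. It assumes, for illustration only, that a row counts as committed when its CommitmentDiscountId is populated; the sample rows and the function name are hypothetical.

```python
def committed_share(rows, unit):
    """Fraction of capacity (for one unit) covered by a commitment, or None."""
    total = 0.0
    committed = 0.0
    for row in rows:
        if row["SkuCapacityUnit"] != unit:
            continue
        total += row["SkuCapacity"]
        # Assumption: a non-empty CommitmentDiscountId marks committed capacity.
        if row.get("CommitmentDiscountId"):
            committed += row["SkuCapacity"]
    return committed / total if total else None


# Made-up sample rows.
rows = [
    {"SkuCapacity": 8, "SkuCapacityUnit": "vCPU", "CommitmentDiscountId": "cd-1"},
    {"SkuCapacity": 24, "SkuCapacityUnit": "vCPU", "CommitmentDiscountId": None},
    {"SkuCapacity": 2, "SkuCapacityUnit": "TB", "CommitmentDiscountId": None},
]
share = committed_share(rows, "vCPU")
print(share)
```

A real analysis would likely also weight by billed quantity or usage time, as noted above; this sketch only counts capacity.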

flanakin commented 9 months ago

Aren't there scenarios where there would be multiple types of capacity? We've previously talked about adding a SkuDetails column that would be JSON and could have whatever SKU-specific attributes are needed.

marc-perreaut commented 8 months ago

> Aren't there scenarios where there would be multiple types of capacity? We've previously talked about adding a SkuDetails column that would be JSON and could have whatever SKU-specific attributes are needed.

Probably, so SkuCapacity would be the main capacity type. For example, a VM has a capacity in CPU and in memory: the main capacity would be the CPU. One can argue that the main capacity is memory for memory-bound workloads, for example databases. If that happens, both CPU and memory would be needed as capacity: new columns SkuCapacity2 and SkuCapacity2Unit could be added, so that both CPU capacity and memory capacity are present in the dataset. Or maybe something smarter?

A SkuDetails JSON column could do the job if the keys SkuCapacity and SkuCapacityUnit are always present, whatever the SKU (as the goal is to have an aggregated, high-level capacity view). However, I find it less convenient than dedicated columns from a practitioner perspective, as it requires decoding the JSON.
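For comparison, this is roughly the decoding step a practitioner would need with the JSON variant. It is a sketch that assumes a SkuDetails string column carrying SkuCapacity and SkuCapacityUnit keys; the `MemoryGiB` key and the function name are hypothetical.

```python
import json


def capacity_from_sku_details(raw):
    """Return (capacity, unit) from a SkuDetails JSON string, or None if absent."""
    if not raw:
        return None
    details = json.loads(raw)
    if "SkuCapacity" not in details or "SkuCapacityUnit" not in details:
        return None
    return details["SkuCapacity"], details["SkuCapacityUnit"]


# Made-up row; only the two capacity keys matter for the high-level view.
row = {"SkuDetails": '{"SkuCapacity": 4, "SkuCapacityUnit": "vCPU", "MemoryGiB": 16}'}
result = capacity_from_sku_details(row["SkuDetails"])
print(result)
```

With dedicated columns, this step disappears and the values can be grouped and summed directly.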

sireeshaoram commented 8 months ago

A hybrid approach might be smarter here: use separate columns for primary capacity (CPU) and secondary capacity (memory). If both capacities are needed, populate both sets of columns. If only one capacity is relevant (e.g., a memory-bound workload), leave the other column empty. This way, we can maintain clarity while accommodating different scenarios.
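A minimal sketch of how the primary/secondary layout could look in practice, assuming the SkuCapacity2 and SkuCapacity2Unit column names floated earlier in this thread. The class name, the `capacities` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkuCapacityColumns:
    # Primary capacity (e.g. CPU for a VM).
    SkuCapacity: Optional[float] = None
    SkuCapacityUnit: Optional[str] = None
    # Secondary capacity (e.g. memory); left empty when not relevant.
    SkuCapacity2: Optional[float] = None
    SkuCapacity2Unit: Optional[str] = None

    def capacities(self):
        """Return the populated capacities as a {unit: value} mapping."""
        pairs = [
            (self.SkuCapacity, self.SkuCapacityUnit),
            (self.SkuCapacity2, self.SkuCapacity2Unit),
        ]
        return {unit: value for value, unit in pairs if value is not None and unit is not None}


vm = SkuCapacityColumns(4, "vCPU", 16, "GiB")  # both capacities populated
disk = SkuCapacityColumns(2, "TB")             # secondary columns left empty
print(vm.capacities())
print(disk.capacities())
```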

shawnalpay commented 1 month ago

@marc-perreaut @ijurica @kk09v Would it be fair to argue that this issue was addressed by the creation of a SkuPriceDetails JSON column in 1.1? Or is there still utility in carrying this issue forward and advocating for a dedicated set of columns?

marc-perreaut commented 1 month ago

> @marc-perreaut @ijurica @kk09v Would it be fair to argue that this issue was addressed by the creation of a SkuPriceDetails JSON column in 1.1? Or is there still utility in carrying this issue forward and advocating for a dedicated set of columns?

SKU Price Details relates to a single SKU Price ID, whereas the intent here is to address all SKU Price IDs that share a common high-level capacity unit like CPU, so that practitioners can easily answer high-level questions like "how many CPUs do we have in this region?". Such questions can be answered by translating SKU Price Details into these high-level units, but this requires effort on the practitioner's side to cover all SKU Price IDs and all providers.

Just brainstorming: I wonder whether SKU Capacity Units could be attached conceptually to SKU Categories, i.e. 1 SKU Category = 1 SKU Capacity Unit. That could feed into the discussion about SKU categories.

ijurica commented 1 month ago

> @marc-perreaut @ijurica @kk09v Would it be fair to argue that this issue was addressed by the creation of a SkuPriceDetails JSON column in 1.1? Or is there still utility in carrying this issue forward and advocating for a dedicated set of columns?

The SkuPriceDetails JSON column should provide all that information, but since it is a provider-specific column, practitioners still need to parse it and identify and map those 'keys' across different providers. I assume we'll continue discussing, prioritizing, and introducing additional prominent SKU and SKU Price columns, and that the issues/requirements discussed in this thread will also be taken into consideration.
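To illustrate the mapping effort described here, the sketch below normalizes provider-specific keys into common capacity units. Everything in it is an assumption for illustration: the provider names, the JSON keys (`vcpu`, `storageTb`, `CoreCount`), and the `KEY_MAP` table are hypothetical and do not reflect what any provider actually emits.

```python
import json

# Hypothetical mapping that a practitioner would have to build and maintain
# for every provider: provider-specific key -> standardized capacity unit.
KEY_MAP = {
    "ProviderA": {"vcpu": "vCPU", "storageTb": "TB"},
    "ProviderB": {"CoreCount": "vCPU"},
}


def normalized_capacity(provider, sku_price_details_json):
    """Translate one SkuPriceDetails JSON string into {unit: capacity}."""
    details = json.loads(sku_price_details_json)
    result = {}
    for key, unit in KEY_MAP.get(provider, {}).items():
        if key in details:
            # Values may arrive as numbers or numeric strings.
            result[unit] = float(details[key])
    return result


a = normalized_capacity("ProviderA", '{"vcpu": 8}')
b = normalized_capacity("ProviderB", '{"CoreCount": "4", "Tier": "standard"}')
print(a, b)
```

Dedicated, standardized columns would move the maintenance of a table like `KEY_MAP` from each practitioner to the data provider.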