Open cnharris10 opened 1 month ago
#540 [DISCUSSION]: Tags Column Definition and User-Defined Tags
Key Discussion Items: The discussion focused on the requirement that user-defined tags cannot be altered, which could lead to issues with normalization and denormalization.
Problem Identification: Some cloud providers (AWS, specifically) have multiple tag schemes, leading to complications in enforcing a strict tag policy.
Divergent Views: The group debated whether changing the definition would introduce a breaking change.
Final Agreement: Chris will create a work item to formalize the issue, advocating for a potential change in the 1.2 release.
Action Items: [TF-2-#540] Chris, @cnharris10 will handle creating the work item for this issue.
@cnharris10 Spent some time with this one. I get it now! But I have some feedback. :)
Work Item instead of a Discussion Topic: I recommend you change the title to be an action on a concept rather than a problem or solution statement. Something like Support provision of multiple user-defined Tag systems.
Analyze cost and usage by multiple tag structures.
Providers MUST NOT alter user-defined Tag keys or values.
) would be good to encapsulate with quotes or quote markdown or similar. It's currently not clear where the quote ends and your commentary begins.
If you feel this level of detail is unnecessary and/or I'm being pedantic, I can appreciate that -- but our audience for these issues is expanding beyond the FOCUS project team, and any/all context will be helpful for someone getting up to speed on this (even a dense Maintainer such as myself!).
#540 [DISCUSSION]: Tags Column Definition Mandates that User-Defined Tags are Not Altered, Which Can Lead to Various Scenarios
Primary Issue: This discussion revolves around the current requirement in the specification that user-defined tags must not be altered. The concern is that this rule could lead to complications when practitioners deal with multiple user-defined tag schemes from different providers.
Core Problem: AWS, for instance, allows both user-defined resource tags and user-defined cost categories, which could result in conflicts when both types of tags share the same key names. The current specification does not adequately address how to differentiate these multiple tag schemes without altering the user-defined tags.
Divergent Views: Some members felt that allowing providers to prepend a prefix to user-defined tags could resolve the issue without altering the tags themselves, while others expressed concern that introducing prefixes would increase complexity and make tag management harder for practitioners. There was also debate about whether this could be considered a breaking change.
Final Agreement: The group agreed to explore solutions that would allow providers to prepend a prefix for certain user-defined tag schemes (e.g., cost categories) without altering other user-defined tags. However, this should be carefully reviewed to ensure that it doesn’t introduce complexity or conflicts for practitioners. This Issue #540 represents the first “work item” to be prepared by the group.
Action Items:
I use GCP and Azure. In GCP we have labels and tags, and some keys match across both. In our FOCUS dataset we don't have any issues showing the key values from both labels and tags. This one might need some more investigation.
@thecloudman Interesting; thanks for sharing.
@cnharris10 @AWS-ZachErdman Do we have real-world examples of this happening, and if so, could you share? It may be difficult to get the stakeholders to prioritize this one if it's not perceived to be a problem.
@shawnalpay
AWS_CostCategories
)@thecloudman
A couple questions:
For GCP exports, are you saying that when you have a tag, foo:bar, and a label, foo:baz, you are fine with (non-deterministically) the Tags column manifesting as either {"foo": "bar"} or {"foo": "baz"} and losing the other entry?
To mitigate this clobbering issue, AWS supplies user-defined resource tags within the Tags column and also creates a provider-defined column, AWS_CostCategories, that encapsulates their other user-defined (Cost Category) tags. This ensures that the example from the previous question doesn't occur.
If providers follow this approach, then providers will encapsulate some user-defined tags under the standard Tags column and the rest under 1 or more provider-based columns (ex: x_MyOtherTags). In this case, with 3 hypothetical providers going this route (Provider1, Provider2, Provider3), 4 columns will be produced causing practitioners to look/query across various, non-normalized columns for user-defined tags.
Example:
Tags: { "foo": "bar" }
x_Provider1_OtherUserDefinedTags: { "foo": "bar2" }
x_Provider2_OtherUserDefinedTags: { "foo": "bar3" }
x_Provider3_OtherUserDefinedTags: { "foo": "bar4" }
The intent of the Tags column for 1.0 was to encapsulate all tags under one column to allow an easy querying experience regardless of provider.
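To make the querying burden concrete, here is a minimal Python sketch of what a practitioner would have to do in the four-column scenario above. The row layout, the JSON-string encoding of each column, and the `find_tag_values` helper are assumptions for illustration only; the `x_ProviderN_OtherUserDefinedTags` names are the hypothetical ones from the example, not real provider columns.

```python
import json

# Hypothetical row using the illustrative column names from the example above.
# Assumption: each tag-bearing column holds a JSON-encoded key/value map.
row = {
    "Tags": '{"foo": "bar"}',
    "x_Provider1_OtherUserDefinedTags": '{"foo": "bar2"}',
    "x_Provider2_OtherUserDefinedTags": '{"foo": "bar3"}',
    "x_Provider3_OtherUserDefinedTags": '{"foo": "bar4"}',
}

# The practitioner has to know every tag-bearing column up front.
TAG_COLUMNS = [
    "Tags",
    "x_Provider1_OtherUserDefinedTags",
    "x_Provider2_OtherUserDefinedTags",
    "x_Provider3_OtherUserDefinedTags",
]


def find_tag_values(row, key, tag_columns=TAG_COLUMNS):
    """Return {column_name: value} for every listed column that contains `key`."""
    found = {}
    for column in tag_columns:
        tags = json.loads(row.get(column) or "{}")
        if key in tags:
            found[column] = tags[key]
    return found


result = find_tag_values(row, "foo")
print(result)
```

A single lookup for `foo` touches four columns here, and the column list grows with every provider that adds its own tag column.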
I just tested this with our own data, and there is a potential collision if there are multiple mechanisms that result in the same keys. In the event that the provider has multiple systems that provide keys and values in the tags column, they either need to:
I also see the need for this.
The spec allowing for namespacing to avoid these collisions seems like the preferable approach here.
An oversight in the specification, and we need to resolve it.
@cnharris10 this is mainly a problem with respect to cost categories having their own column and should not be related to the gap that we listed in our user guide for our preview specification.
The most compelling problem explanation and argument for me about why we should reconsider this definition is the argument you gave here:
If providers follow this approach, then providers will encapsulate some user-defined tags under the standard Tags column and the rest under 1 or more provider-based columns (ex: x_MyOtherTags). In this case, with 3 hypothetical providers going this route (Provider1, Provider2, Provider3), 4 columns will be produced causing practitioners to look/query across various, non-normalized columns for user-defined tags.
Example:
Tags: { "foo": "bar" }
x_Provider1_OtherUserDefinedTags: { "foo": "bar2" }
x_Provider2_OtherUserDefinedTags: { "foo": "bar3" }
x_Provider3_OtherUserDefinedTags: { "foo": "bar4" }
The intent of the Tags column for 1.0 was to encapsulate all tags under one column to allow an easy querying experience regardless of provider.
1. Problem Statement *
The Tags column currently says: "Providers MUST NOT alter user-defined Tag keys or values." In cases where a provider has N user-defined tagging features that allow the same user-defined tag keys to be created, but partitioned by feature, at least N-1 of those features will need some prefix in order to prevent clobbering.
For example, AWS has both user-defined resource tags and user-defined cost categories. If a customer defines a user-defined resource tag as foo:bar and a cost category as foo:baz, then persisting both in the Tags column key/value map will cause clobbering (i.e. either "bar" or "baz" will persist, not both). The same case can occur between GCP tags and labels.
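The clobbering described above can be reproduced with two plain key/value maps. This is only a sketch: the variable names and the `colliding_keys` helper are hypothetical, and it models the Tags column as a Python dict rather than any provider's actual export logic.

```python
# Two user-defined tagging schemes that happen to share the key "foo".
resource_tags = {"foo": "bar"}     # e.g. a user-defined resource tag
cost_categories = {"foo": "baz"}   # e.g. a user-defined cost category

# Merging into one key/value map keeps only one value per key.
# Which value survives depends purely on merge order.
merged_resource_first = {**resource_tags, **cost_categories}
merged_category_first = {**cost_categories, **resource_tags}


def colliding_keys(*schemes):
    """Return the sorted keys that appear in more than one scheme."""
    seen, collisions = set(), set()
    for scheme in schemes:
        for key in scheme:
            if key in seen:
                collisions.add(key)
            seen.add(key)
    return sorted(collisions)


print(merged_resource_first)   # the cost category value overwrote the resource tag
print(merged_category_first)   # the resource tag value overwrote the cost category
print(colliding_keys(resource_tags, cost_categories))
```

Either merge order silently drops one of the two entries, which is why some form of namespacing is needed when both schemes land in the same map.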
2. Objective *
All user-based or provider-based tags are encapsulated within the Tags column with predefined prefixes preventing clobbering for at least N-1 tagging schemes.
3. Supporting Documentation *
Original Tags column definition for FOCUS 1.0: https://github.com/FinOps-Open-Cost-and-Usage-Spec/FOCUS_Spec/pull/227
Use Case: Analyze cost and usage by multiple tag structures without guessing which columns contain various tags
4. Proposed Solution / Approach
In the proposed approach, using the AWS CUR as an example, the following tags are considered:
User-defined Tags:
foo:bar (i.e. resourceTags/user:foo with value bar)
foo:bar3 (i.e. costCategories/foo with value bar3)
Provider-defined Tag:
foo:bar2 (i.e. resourceTags/aws:foo with value bar2)
The proposal is to amend the Tags column to allow a user-defined prefix to be concatenated with a finalized user-defined tag key for N-1 user-defined tagging schemes. This allows for 1 tagging scheme to remain without a user-defined prefix, so practitioners can reference a user-defined tagging schema without a prefix.
With the tags supplied above, all Tags can be co-located as either:
Option 1: Predefined prefix declared for N-1 user-defined and all provider tags
Provider declares prefix: costCategories for user-defined cost category tags and aws for provider-defined system tags.
Tags: { "foo": "bar", "aws:foo": "bar2", "costCategories:foo": "bar3" }
Option 2: Prefix declared for all user-defined and all provider tags
Provider declares prefix: user for user-defined resource tags, costCategories for user-defined cost category tags, and aws for provider-defined system tags.
Tags: { "user:foo": "bar", "aws:foo": "bar2", "costCategories:foo": "bar3" }
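As a rough illustration of the two options, the following Python sketch builds a single Tags map from the three schemes above. The `build_tags` function, its parameters, and the `:` separator are assumptions made for this example and are not part of the FOCUS specification or any provider's export.

```python
def build_tags(schemes, unprefixed_scheme=None, separator=":"):
    """Combine several tagging schemes into one Tags map.

    `schemes` maps a declared prefix to that scheme's key/value tags.
    The scheme named by `unprefixed_scheme` (if any) keeps its keys as-is;
    every other scheme gets "<prefix><separator><key>".
    """
    tags = {}
    for prefix, scheme_tags in schemes.items():
        for key, value in scheme_tags.items():
            if prefix == unprefixed_scheme:
                full_key = key
            else:
                full_key = f"{prefix}{separator}{key}"
            if full_key in tags:
                # An unprefixed key could still collide with a prefixed one.
                raise ValueError(f"tag key collision on {full_key!r}")
            tags[full_key] = value
    return tags


# The three tags from the AWS CUR example in this proposal.
schemes = {
    "user": {"foo": "bar"},             # user-defined resource tags
    "aws": {"foo": "bar2"},             # provider-defined system tags
    "costCategories": {"foo": "bar3"},  # user-defined cost category tags
}

option_1 = build_tags(schemes, unprefixed_scheme="user")  # N-1 schemes prefixed
option_2 = build_tags(schemes)                            # every scheme prefixed

print(option_1)
print(option_2)
```

Under Option 1 a practitioner can still query `foo` directly for resource tags, but an unprefixed user key that already looks like `aws:foo` could collide with a prefixed one, which the sketch surfaces as an error. Option 2 avoids that ambiguity at the cost of changing every key.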
5. Epic or Theme Association
TBD
6. Stakeholders *
TBD