open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Decide how to organize "sub namespaces" on registry YAML model files #887

Open joaopgrassi opened 6 months ago

joaopgrassi commented 6 months ago

Context

In some cases, a YAML model file in the attributes registry contains multiple "levels" of attributes. One example is the database model: https://github.com/open-telemetry/semantic-conventions/blob/main/model/registry/db.yaml.

The top-level id is registry.db, and all attributes go under it. Because there are multiple database systems, the attributes for each system have the system name appended to the id, like cassandra.* or mongodb.*.

When generating the markdown for these attributes in the registry, we rely on tags to render the individual db system attribute tables, like <!-- semconv registry.db(omit_requirement_level,tag=db-generic) -->.
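As a rough illustration of what tag-based selection amounts to, the sketch below filters the attributes of a single group by tag. The dictionary layout, the attribute names, and every tag other than db-generic are simplified assumptions for illustration; this is not the actual content of db.yaml or how the markdown generator is implemented.

```python
# Hypothetical, simplified stand-in for one registry group. The real model
# lives in YAML and carries more fields per attribute.
registry_db = {
    "id": "registry.db",
    "attributes": [
        {"id": "db.system", "tag": "db-generic"},
        {"id": "db.name", "tag": "db-generic"},
        # Tag names below are assumed for illustration.
        {"id": "db.cassandra.table", "tag": "tech-specific-cassandra"},
        {"id": "db.mongodb.collection", "tag": "tech-specific-mongodb"},
    ],
}


def attributes_with_tag(group, tag):
    """Return the ids of the attributes in `group` that carry `tag`."""
    return [a["id"] for a in group["attributes"] if a.get("tag") == tag]


generic = attributes_with_tag(registry_db, "db-generic")
print(generic)
```

With this layout, every rendered table depends on each attribute carrying the right tag, since the group id alone does not distinguish the systems.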

Problems with this approach:

An alternative to this

Instead of relying on tags, in the model for the registry we can simply organize each individual group under its own id. For example:

Pros of this option

An example of this approach can be found in this PR: https://github.com/open-telemetry/semantic-conventions/pull/848/files#diff-3efbd7bfaa9b1122d4421e83e19833ead514f4c41ef2c72450bb8abc725f35e1

What to do

We need to decide how we want to go forward and make it consistent across the repo.

AlexanderWert commented 6 months ago

I like the proposal!

We need to have meaningful guidelines on the following though:

trisch-me commented 6 months ago

Maybe we should add it to the guidelines? So new contributions will follow the process and semantic meaning of splitting the groups?

trisch-me commented 6 months ago

With the second option there will be no single registry.db group that holds everything, so we will not be able to generate a list of all db attributes, without the sub-grouping, if the need arises. With tags this is possible, but I'm not sure whether this case is relevant.

joaopgrassi commented 6 months ago

@trisch-me registry.db is already defined today, and it contains the general attributes :).

> Maybe we should add it to the guidelines? So new contributions will follow the process and semantic meaning of splitting the groups?

Yeah, once we agree I will add it to the guidelines.

trisch-me commented 6 months ago

Yes, it is defined and has all sub-attributes under it, with the grouping done through tags, so the generic attributes have the tag db-generic. If we change this to separate ids, we will no longer have all attributes under the main category. I'm not against the second option. I just want to bring to our attention that in that case generating all sub-attributes for a given main category (db, aws, process, etc.) will not be possible (or I'm not aware of how to do so).
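For illustration, one conceivable way to recover the full list for a main category when the groups are split is to match on the group id prefix. The sketch below assumes hypothetical group ids such as registry.db.cassandra and a simplified in-memory layout; whether the generation tooling supports a query like this is not established in this thread.

```python
# Hypothetical split layout: one group per table to be rendered.
groups = [
    {"id": "registry.db", "attributes": ["db.system", "db.name"]},
    {"id": "registry.db.cassandra", "attributes": ["db.cassandra.table"]},
    {"id": "registry.db.mongodb", "attributes": ["db.mongodb.collection"]},
    {"id": "registry.messaging", "attributes": ["messaging.system"]},
]


def attributes_under(groups, root_id):
    """Collect attributes from `root_id` and every group nested below it."""
    # Matching on "root." rather than "root" avoids picking up unrelated
    # ids that merely share leading characters (e.g. "registry.dbx").
    prefix = root_id + "."
    found = []
    for group in groups:
        if group["id"] == root_id or group["id"].startswith(prefix):
            found.extend(group["attributes"])
    return found


all_db = attributes_under(groups, "registry.db")
print(all_db)
```

This relies on the sub-group ids consistently extending the main category id, which is exactly the kind of convention the guidelines would need to state.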

lmolkova commented 6 months ago

I'd prefer to focus on the markdown and the final representation of the attributes. So far the YAML organization has not been important.

Authors can split attributes into subgroups, or use one group with tags, whichever helps them produce better markdown.

If we see that some groups have become too big and we'd like to change that, let's do it, but I don't understand the benefit of having any rigid guidelines on YAML organization unless we need it for something very specific (like auto-generating the registry).

lmolkova commented 5 months ago

I think we can provide soft guidance (e.g. in contrib.md?) to use one YAML group per table to be rendered in the markdown.

E.g.:

Usually this would mean that system-specific attributes are defined in individual groups. Since the registry will be auto-generated, tags will no longer be needed, and splitting this way will keep registry groups from growing too large.

See #952 for the implementation on db/messaging.

joaopgrassi commented 5 months ago

So I think the initial idea of using groups of general + specific attributes is the way to go. I will try to add some guidance to the docs for this. Assigning to me.