open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

How to reflect experimental features to spec compliance matrix #3976

Open jack-berg opened 8 months ago

jack-berg commented 8 months ago

As discussed in the 4/2/24 Spec SIG, the spec compliance matrix is ambiguous in terms of how to represent experimental features:

By representing this information, we can use the spec compliance matrix to track implementations of experimental features, which are a prerequisite to stabilization. Additionally, we can convey to users the stability of a particular feature in a given language.

jack-berg commented 8 months ago

Related comment from @carlosalberto: https://github.com/open-telemetry/opentelemetry-specification/pull/3984#issuecomment-2037422772

A good question is whether this should be updated once a feature has been released as well.

What I'd like to see:

codefromthecrypt commented 4 months ago

I would like to see a description section covering notable language patterns for handling specs ahead of stable support (e.g. features behind experimental or other gates, but with code available). These could be one- or two-word summaries that link out to a language SDK repo.

After that description, I would like to see a chart with one line per spec per language summarizing status. This would include a representation of WIP (even if only a PR), and a link to where to track further progress (README, PR, tracking issue, etc.).

The combination of both would allow people to quickly gauge how finished a spec they rely on in a SIG is, even if they don't know whether completion is six months, a year, or longer away. Importantly, the links can help them contribute towards completion.


Currently, the compatibility matrix has no summary table, which means you can't distinguish work in progress from no progress at all. Nor can you tell where to look for status, for the same reason.

We also lack a description of language-specific norms, such as some SIGs having an experimental or incubator module, whereas others implement a spec via integrating PRs.

The above combine into a situation where new contributors, or those unfamiliar with the process, have a hard time assessing the landscape, especially the overall status concerns that lead to promotion (such as three languages implementing X). A case in point is the LLM SIG, which recently started relying on the Event/s API, but has no clear way to indicate to users how much of it is still WIP. To that end, I've attempted to replicate a status including where to participate in a tracking issue, but I think it should be top-level and part of our norms instead.

To me, this is not solely about knowing, for example, that there is work at all on certain APIs like Events, but also about how to help. That's why I'm not just asking for a new status character, but for one that allows a link. Ideally, the completed status would also have links, as it is not always intuitive where to look, but I want to start with this as it has come up twice for me (event API, file-based config).

reyang commented 2 months ago

We had a discussion in the 9/25/2024 TC meeting. The direction we would like to see: use YAML to capture things in structure and generate the matrix and docs using tools + automation.

Breaking it down, these are what we need:

  1. A YAML file in the specification repo. This file should capture all the spec requirements, and each requirement should have a status (e.g. Stable, Development, etc.).
  2. Each language implementation repo should have a YAML file, which captures the features and the implementation status (e.g. the feature is not implemented, is implemented as an experimental feature, is a stable feature that has been released, etc.).
  3. A CLI tool which can extract the info from these YAML files and generate the compliance matrix, the docs on opentelemetry.io, and the status page for each language implementation.
  4. We should retire the current project status page for each language implementation (e.g. https://github.com/open-telemetry/opentelemetry-rust?tab=readme-ov-file#project-status) and replace it with the better version that we generate from the tools.
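To make items 1) and 2) concrete, the two YAML files might look something like the following sketch. The schema, ids, and status values here are purely illustrative assumptions, not a proposal:

```yaml
# Hypothetical spec-repo file: one requirement per entry,
# with a status per the spec's stability levels.
requirements:
  - id: metrics/exponential-histogram-aggregation
    status: Stable
  - id: logs/event-api
    status: Development
---
# Hypothetical per-language file (e.g. in an SDK repo),
# keyed by the same requirement ids.
implementation:
  - id: metrics/exponential-histogram-aggregation
    status: Implemented    # released as stable
  - id: logs/event-api
    status: Experimental   # behind a feature gate
```

Keying both files by a shared requirement id is what would let a tool join them into a matrix without scraping markdown.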

I'm willing to tackle item 1). We can use this issue to socialize the idea and find out who is willing to cover 2), 3), and 4).
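A minimal sketch of what the CLI step in item 3) could boil down to, assuming the spec YAML has already been parsed into a mapping of feature id to spec status, and each language's YAML into a mapping of feature id to implementation status. All names here are illustrative, not an actual schema:

```python
# Hypothetical sketch: join a spec-requirements mapping with per-language
# implementation statuses into a markdown compliance matrix, one row per
# feature. YAML parsing is omitted; inputs are already-parsed dicts.

def generate_matrix(spec, implementations):
    """spec: {feature_id: spec_status}
    implementations: {language: {feature_id: impl_status}}
    Returns a markdown table with one line per feature."""
    languages = sorted(implementations)
    lines = ["| Feature | Spec status | " + " | ".join(languages) + " |"]
    lines.append("|---" * (2 + len(languages)) + "|")
    for feature, status in spec.items():
        # "-" marks features the language has not reported at all.
        cells = [implementations[lang].get(feature, "-") for lang in languages]
        lines.append(f"| {feature} | {status} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Because the matrix is regenerated from the YAML sources, the markdown itself never needs hand-editing, which is where the current matrix tends to drift.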

svrnm commented 2 months ago

@reyang I made you the sponsor of this because you said you will tackle item 1). Re-assign sponsorship/assignee based on who is going to tackle the other items.

kaylareopelle commented 1 month ago

This has been on my mind! Ruby has experimental support for metrics and logs; however, we don't have a great way to communicate which features have been implemented. In addition, some features have been released, while others are in progress or under review.

As a first step, I transformed a table I've been using in a Google Sheet to track metrics compliance into a markdown table. The rows in the table are copied from the rows in the spec compliance matrix. The GitHub column represents either an issue or a PR related to that line item.

I don't think this is quite the right solution, but perhaps it gives us something to build from.

https://github.com/open-telemetry/opentelemetry-ruby/pull/1746

I really like the idea of using YAML to track this; it seems much more maintainable than a markdown table.

svrnm commented 1 month ago

Happy to help with creating the YAML files. I did some minor experiments with that a while back; here is a YAML file that represents the current spec matrix (I used an LLM to write scripts for a bi-directional conversion, which I can share as well if needed):

https://gist.github.com/svrnm/eca2bd00a5c438940cc979481867ba86

If I understand @reyang's comment correctly, this is not exactly what we are looking for, right? But we could use the same approach to decompose the current matrix as outlined above, or would it be more reasonable to start from scratch?

marcalff commented 1 month ago

I very much support the idea of a YAML file.

On top of all the benefits already mentioned, there is another one which is critical in my opinion:

The status for a given SIG and a given feature is represented by its own line.

This makes git blame usable for finding when a status changed when looking at history. Currently, with one line representing the statuses for every SIG at once, git blame is useless.
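To illustrate the git blame point with a hypothetical example: in the current matrix, a single markdown row carries every language's status at once, so any change rewrites the shared line. A one-status-per-line layout confines each SIG's change to its own line:

```yaml
# Current matrix style, where one markdown row holds all languages:
#   | Exponential Histogram | + | + | - | ... |
# Any single status change touches the whole row.
#
# One-line-per-status layout (names illustrative): each SIG's change
# touches only its own line, so `git blame` shows who changed what, and when.
exponential-histogram:
  java: Implemented
  ruby: Experimental
  rust: Not implemented
```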