open-telemetry / build-tools

Building tools provided by OpenTelemetry
https://opentelemetry.io
Apache License 2.0
37 stars 54 forks source link

Improve markdown table generation for metrics #129

Open arminru opened 1 year ago

arminru commented 1 year ago

(split from #79)

See https://github.com/open-telemetry/build-tools/pull/79/files#r996803040 for the previous discussion and proposals.

Two possible options would be:

  1. A table of metric names with a (then shared) table of attribute names that apply.
  2. A table of the metric where "empty" rows are created while attributes are expanded.

Both of them are currently manually applied in the hand-crafted metrics tables in the spec repo.

cc @jamesmoessis @Oberon00 @joaopgrassi @jsuereth

jamesmoessis commented 1 year ago

Hey @arminru - is anyone working on this? If not, you can assign to me and I can try get to it in the next few weeks

arminru commented 1 year ago

Not yet! Thank you, @jamesmoessis 🙂

jamesmoessis commented 1 year ago

I've written 3 options down below of how we might render the tables. I encourage everyone to post their thoughts and preferred option. Once we have a general consensus, I will go ahead and implement the chosen option.

Option 1

Option 1 consists of metrics tables containing multiple metrics, and one attributes table for each individual metric. Option 1 was proposed by @joaopgrassi in previous discussions.

## foo metrics
<!-- semconv metric.foo(metric_table) -->
| Name           | Instrument Type | Unit (UCUM) | Description                   |
|----------------|-----------------|-------------|-------------------------------|
| `foo.size`     | Histogram       | `{bars}`    | Measures the size of foo.     |
| `foo.duration` | Histogram       | `{bazs}`    | Measures the duration of foo. |
<!-- endsemconv -->

### Attributes for `metric.foo.size`
<!-- semconv metric.foo.size -->
| Attribute          | Type   | Description                    | Examples              | Requirement Level                                             |
|--------------------|--------|--------------------------------|-----------------------|---------------------------------------------------------------|
| `http.method`      | string | HTTP request method.           | `GET`; `POST`; `HEAD` | Required                                                      |
| `http.status_code` | int    | [HTTP response status code](). | `200`                 | Conditionally Required: if and only if one was received/sent. |
<!-- endsemconv -->

### Attributes for `metric.foo.duration`
<!-- semconv metric.foo.duration -->
| Attribute     | Type   | Description          | Examples              | Requirement Level |
|---------------|--------|----------------------|-----------------------|-------------------|
| `http.method` | string | HTTP request method. | `GET`; `POST`; `HEAD` | Optional          |
<!-- endsemconv -->

Option 2

Option 2 consists of a single-row metric table for each metric, with attributes listed below it. This is what currently exists in build-tools.

Similar to option 1, but each metric has a single-row table with the attributes table for that metric directly after it. Option 2 creates more tables and is less concise, but it provides the advantage that the attributes table is always next to the metric definition, rather than being down the page like in option 1.

This option is similar to what is in the collector's generated metrics tables. Example.

## `foo.size`
<!-- semconv metric.foo.size(metric_table) -->
| Name       | Instrument Type | Unit (UCUM) | Description               |
|------------|-----------------|-------------|---------------------------|
| `foo.size` | Histogram       | `{bars}`    | Measures the size of foo. |
<!-- endsemconv -->

### Attributes for `foo.size`
<!-- semconv metric.foo.size -->
| Attribute          | Type   | Description                    | Examples              | Requirement Level                                             |
|--------------------|--------|--------------------------------|-----------------------|---------------------------------------------------------------|
| `http.method`      | string | HTTP request method.           | `GET`; `POST`; `HEAD` | Required                                                      |
| `http.status_code` | int    | [HTTP response status code](). | `200`                 | Conditionally Required: if and only if one was received/sent. |
<!-- endsemconv -->

## `foo.duration`
<!-- semconv metric.foo.duration(metric_table) -->
| Name           | Instrument Type | Unit (UCUM) | Description                   |
|----------------|-----------------|-------------|-------------------------------|
| `foo.duration` | Histogram       | `{bazs}`    | Measures the duration of foo. |
<!-- endsemconv -->

### Attributes for `foo.duration`
<!-- semconv metric.foo.duration -->
| Attribute     | Type   | Description          | Examples              | Requirement Level |
|---------------|--------|----------------------|-----------------------|-------------------|
| `http.method` | string | HTTP request method. | `GET`; `POST`; `HEAD` | Optional          |
<!-- endsemconv -->

Option 2b

As @Oberon00 mentioned in the previous PR, the single-row table could be expressed as a list of key-value pairs instead of a table, e.g.:

Option 3

Option 3 consists of a metric table where "empty" rows are created while attributes are expanded.

I didn't see this option fully fleshed out, so for those who mentioned it (mainly @jsuereth) I will do my best to provide an example of how I imagine it might look. Feel free to correct me and I can edit this comment.

I've made it so the attribute name and requirement level are specified with the metric table, but then link to a full attribute definition. The standalone attribute definition only needs to be defined once for all metrics, because the requirement level is specified on the metric table already.

This option is the most concise because it avoids repeatedly describing attributes.

<!-- semconv metric.foo.size(metric_table) -->
| Name       | Instrument Type | Unit (UCUM) | Description               | Attribute                                   | Attribute Requirement level                                   |
|------------|-----------------|-------------|---------------------------|---------------------------------------------|---------------------------------------------------------------|
| `foo.size` | Histogram       | `{bars}`    | Measures the size of foo. |                                             |                                                               |
|            |                 |             |                           | [`http.method`](#attribute-definition)      | Required                                                      |
|            |                 |             |                           | [`http.status_code`](#attribute-definition) | Conditionally Required: if and only if one was received/sent. |

...

### Attribute definitions
<!-- semconv metric.foo -->
| Attribute          | Type   | Description                    | Examples              |
|--------------------|--------|--------------------------------|-----------------------|
| `http.method`      | string | HTTP request method.           | `GET`; `POST`; `HEAD` |
| `http.status_code` | int    | [HTTP response status code](). | `200`                 |
<!-- endsemconv -->
jamesmoessis commented 1 year ago

Here are rendered versions of the above options.

Option 1

foo metrics

Name Instrument Type Unit (UCUM) Description
foo.size Histogram {bars} Measures the size of foo.
foo.duration Histogram {bazs} Measures the duration of foo.

Attributes for metric.foo.size

Attribute Type Description Examples Requirement Level
http.method string HTTP request method. GET; POST; HEAD Required
http.status_code int [HTTP response status code](). 200 Conditionally Required: if and only if one was received/sent.

Attributes for metric.foo.duration

Attribute Type Description Examples Requirement Level
http.method string HTTP request method. GET; POST; HEAD Optional

Option 2:

foo.size

Name Instrument Type Unit (UCUM) Description
foo.size Histogram {bars} Measures the size of foo.

Attributes for foo.size

Attribute Type Description Examples Requirement Level
http.method string HTTP request method. GET; POST; HEAD Required
http.status_code int [HTTP response status code](). 200 Conditionally Required: if and only if one was received/sent.

foo.duration

Name Instrument Type Unit (UCUM) Description
foo.duration Histogram {bazs} Measures the duration of foo.

Attributes for foo.duration

Attribute Type Description Examples Requirement Level
http.method string HTTP request method. GET; POST; HEAD Optional

Option 2b

Option 3

foo.size

Name Instrument Type Unit (UCUM) Description Attribute Attribute Requirement level
foo.size Histogram {bars} Measures the size of foo.
http.method Required
http.status_code Conditionally Required: if and only if one was received/sent.

...

Attribute definitions

Attribute Type Description Examples
http.method string HTTP request method. GET; POST; HEAD
http.status_code int [HTTP response status code](). 200
joaopgrassi commented 1 year ago

With Option 1 you could also have a column that has a link to its table of attributes (anchor to the Attributes for foo.size section). That would make it a bit easier to navigate in case the metrics table gets long.

Option 3 is also a good approach but what I dislike is the potential "back-and-forth" that one might need to go in order to read the attributes. A side question to this: Would this mean we could have attributes in a central file and then link several metrics to it? I get how that's good for maintainability but for reading and consuming.. I think it's not good.

I'm probably biased, but I feel Option 2 (status-quo) is clear and good, apart from the downside of having duplication... but given it's all machine generating this I wonder is that even a problem?

Oberon00 commented 1 year ago

Please also consider moving away from tables entirely where possible, see https://github.com/open-telemetry/opentelemetry-specification/issues/1925

jsuereth commented 1 year ago

I think I like option 3 the best, but could we include the attribute descriptions in the original table?

Like @joaopgrassi I'm worried if folks need to jump up/down a lot this will cause frustration using the data, regardless of how we store it in YAML.

Option 2 (if we take less visual separation for attributes or blend 2 + 2b) also appeals to me.

Josh's crazy suggestion

What about allowing for several of these with LINKAGE.

This would give you the best of "let me see all the metrics produced" and "let me see all the details of a metric at once".

jamesmoessis commented 1 year ago

Good points all around, thank you all.

In my opinion, Option 3 creates a very wide table which may not render nicely in a lot of cases, particularly if the attribute description is added too. There's also a lot of unused space due to the blank cells on the table.

I like @jsuereth's idea of using a table of contents to list the metrics, and linking to more detail on the metric. I like combining this with option 2b. I believe this gives the best of everything. An easy, readable overview of all metrics, whilst also being able to quickly drill down into a specific metric where its attributes are listed and described.

joaopgrassi commented 1 year ago

@jamesmoessis how would you list the attributes in option 2b?

jamesmoessis commented 1 year ago

@joaopgrassi the same as option 2. Option 2b the metric table changes to be a list of key-values, and the attribute table would be right below that.

jamesmoessis commented 1 year ago

I do wonder how we might generate the ToC in the correct order of appearance on the page. We don't generate the whole page and there are significant free-form sections, and each metric will be added individually via the <!-- semconv --> tags; so there's not currently a way for one piece of rendered markdown to know the contents and order of the rest of the page.

Generating a table of contents would likely need to be a post-processing step which looks at the markdown headers of the already rendered page.

joaopgrassi commented 1 year ago

the same as option 2. Option 2b the metric table changes to be a list of key-values, and the attribute table would be right below that.

I see. That is probably the best of both worlds then to me :)

Generating a table of contents would likely need to be a post-processing step which looks at the markdown headers of the already rendered page.

Wouldn't you: 1. run the markdown generation, which will add all the headers and then you run the TOC?