Open TylerHelmuth opened 1 year ago
I think https://github.com/open-telemetry/opentelemetry.io/issues/3554 is a great idea, thanks for opening that!
Is the warning in https://github.com/open-telemetry/semantic-conventions/tree/main/docs/http#semantic-conventions-for-http still accurate?
yes (was there something that led you to think it isn't accurate, so we can fix that?)
Is the Oct 1st 2023 deadline still applicable (meaning Instrumentation Libraries are free to drop the env var support with the next stable release?
yes, Instrumentation Libraries are free to drop the env var support in the next stable release now (but still should have implemented support for it in the current major version), I just sent #541 to remove mention of Oct 1 from the warning
As a user I want to see exactly what values changed, what was dropped, what was added, etc.
this was the goal of https://opentelemetry.io/blog/2023/http-conventions-declared-stable/#summary-of-changes, are you looking for something more than that, or just better discoverability of that information (i.e. https://github.com/open-telemetry/opentelemetry.io/issues/3554)?
I'd also want to see which languages support the new values
maybe we should introduce a compliance matrix in the semantic conventions repo similar to https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md?
any important deadlines
can you expand on what you have in mind? I'm not sure of any deadlines (other than Oct 1 which has passed now)
yes (was there something that led you to think it isn't accurate, so we can fix that?)
My general confusion of the doc is that as a reader I'm not sure if it is applicable to me. It gives a warning about using v1.20.0
or prior, but to be able to answer that I have to know what my instrumentation is using. That's all possible to research, but it isn't something I expect the average OTel consumers to know. Ultimately it feels like a message for the SDK maintainers, and not something for an end user to consume (which is fine). But if an end user finds it I think it creates confusion. As an end user all I want to see is how the SDK I'm using is handling the breaking change.
https://opentelemetry.io/blog/2023/http-conventions-declared-stable/#summary-of-changes is what I'm looking for, but in a more permanent/discoverable place. My worry is that years down the road when some enterprise bumps their java/dotnet instrumentation from 1.0 to 2.0 they need to be able to find these changes or they'll feel pain. Of course we'd hope that they'd take advantage of the env var while in 1.0, but who knows.
maybe we should introduce a compliance matrix in the semantic conventions repo similar to https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md?
I think this is a great idea. Listing of the version where the env var was added and then removed, and which sdks have done that, would be very helpful.
I think a combination of the blog content and a matrix is exactly what end users will need to successfully navigate this change. And once we have those docs, updating the warning section to guide end users to those docs.
I think this is a great idea. Listing of the version where the env var was added and then removed, and which sdks have done that, would be very helpful.
I think a combination of the blog content and a matrix is exactly what end users will need to successfully navigate this change. And once we have those docs, updating the warning section to guide end users to those docs.
I think that's a great start, but the real problem I see are instrumentations. SDKs have I think little to do with the support of the env variable. What really affect users are the instrumentation libraries they are using, and those won't be reflected in such matrix. It is also impracticable to list in a matrix all instrumentations in the spec/semconv repo.
Maybe what we can do is request that each SIG/contrib repo that have HTTP instrumentations to highlight such thing? Because as a developer, say working with .NET/ASP.NET I know the OTel .NET repo and if I have an overview of which version is supporting which semconv I think that's a better migration. Maybe such thing can even be part of the OTel website, under each language, like:
Mmm, I don't think this needs to be per-language. I would rather lean on release notes for the library instead, which can link to a more central migration page that explains what changed and how to approach it.
It is also impracticable to list in a matrix all instrumentations in the spec/semconv repo.
If it is impractical to present this information in a centralized matrix then we should come up with a standardized way for each instrumentation package to present this information in a clear way. I'm not sure how each instrumentation package documents themselves so I'm not sure if it is better done in repos, the otel site, or both. And no matter what I think it is still a good idea to clearly document the changes and migration strategy here and reference it.
To reiterate the importance of this problem, I spent several hours today looking through OTel js, go, java, python, ruby, and dotnet trying to answer 2 questions:
OTEL_SEMCONV_STABILITY_OPT_IN
?It was so hard to find answers to question 1. Granted I spend a majority of my time in the Collector and not the OTel languages, but I feel pretty knowledgable on most of those languages and felt I'd be able to track down answers pretty easily - I was wrong. My fear is that when users go looking for this information they will also struggle.
I see that both Java and DotNet are preparing their repos to drop support for the old HTTP semantic convention. I am very worried that if we don't solve this communication problem before that happens that we'll have a lot of upset users.
I see that both Java and DotNet are preparing their repos to drop support for the old HTTP semantic convention. I am very worried that if we don't solve this communication problem before that happens that we'll have a lot of upset users.
Java is preparing to publish a 2.x version which produces the new stable HTTP conventions, but will continue to support the 1.x version some undecided period of time. I think I heard > 6 months today. The 1.x version produces the old HTTP conventions with the ability to opt into the new ones.
cc @open-telemetry/java-instrumentation-maintainers
Java is preparing to publish a 2.x version which produces the new stable HTTP conventions, but will continue to support the 1.x version some undecided period of time.
Perfect. This is the exact type of information I want to make it sooo easy for users to discover no matter if they are using apache-httpclient
, or java-http-client
or reactor
or whatever.
I'm picking on Java because it is a trend-setter for OTel. And I don't mean to imply that the Java SIG wasn't going to share this information in the changelog/releasenotes/somewhere else. But my hope is that we solve this problem as a whole - creating a solution that will be useful for users of any language no matterthe language's current semconv version.
This is top of my mind right now so I've add this as a topic for the next semconv SIG meeting.
It is also impracticable to list in a matrix all instrumentations in the spec/semconv repo.
I think a matrix of all HTTP instrumentations published under the OpenTelemetry GitHub org could be feasible, and could be useful for our own internal tracking too (the third phase of the HTTP semconv stability WG is supposed to include driving these changes across existing implementations).
I'll explore what that might look like.
The question remains for me though where it is most likely that users would look for this information, and I tend to think that the first place they would look is the release notes for the given instrumentation library(ies) they are using.
I'll explore what that might look like.
thought on #571?
The question remains for me though where it is most likely that users would look for this information, and I tend to think that the first place they would look is the release notes for the given instrumentation library(ies) they are using.
I agree that we definitely want something in release notes (and I think how it is phrased should be standardized between languages.) But since it is a lot of information to convey, I support creating content that each changelog could link back to that details the specific changes that occurred.
thought on https://github.com/open-telemetry/semantic-conventions/pull/571?
Love it. I recognize that something like that matrix adds cognitive load to OTel maintainers, but I think it is worth it.
Looks good, but I'm just afraid of how/who will be responsible of keeping this up-to-date? Our compatibility matrix is often outdated (SIG maintainers don't always come to tick the box when they add a feature. We, in the semconv group, can't know everything that goes on in each SIG), imagine the vast amount of instrumentations. I liked the idea of having a template, that maybe is maintained by the semconv repo, but that then each instrumentation is required to add to their README somehow? I feel this way the chance of things being outdated is less.
I liked the idea of having a template, that maybe is maintained by the semconv repo, but that then each instrumentation is required to add to their README somehow?
We should definitely do this. I view this as the bare minimum we (semconv sig) should be doing for users to ensure the release notes make it clear that the breaking changes affect end users.
Possible Release note template:
The following {{library names}} now emit only the stable HTTP semantic conventions:
This breaking change updates these libraries to emit new attribute names, such as http.response.status_code
instead of http.status_code
. If you already depend on the old names of the attributes emitted by these libraries in down downstream locations, such as collector configuration, alerts, or dashboards you will be affected by this change.
See HTTP semantic conventions declared stable for full details.
Change | Comments |
---|---|
http.method → http.request.method |
Now captures only 9 common HTTP methods by default (configurable) plus _OTHER |
http.status_code → http.response.status_code |
|
http.request.header.<key> |
• Dash ("-" ) to underscore ("_" ) normalization in <key> has been removed• On HTTP server spans: now must be provided to sampler |
http.response.header.<key> |
Dash ("-" ) to underscore ("_" ) normalization in <key> has been removed |
http.request_content_length → http.request.body.size |
• Recommended → Opt-In • Not marked stable yet |
http.response_content_length → http.response.body.size |
• Recommended → Opt-In • Not marked stable yet |
user_agent.original |
• On HTTP client spans: Recommended → Opt-In • On HTTP server spans: now must be provided to sampler • See note if migrating from <= v1.18.0 |
net.protocol.name → network.protocol.name |
Recommended → Conditionally required if not http and network.protocol.version is set |
net.protocol.version → network.protocol.version |
• Examples fixed: 2.0 → 2 and 3.0 → 3 • See note if migrating from <= v1.19.0 |
net.sock.family |
Removed |
net.sock.peer.addr → network.peer.address |
On HTTP server spans: if http.client_ip was unknown, then also net.sock.peer.addr → client.address ; client.address must be provided to sampler |
net.sock.peer.port → network.peer.port |
Now captured even if same as server.port |
net.sock.peer.name |
Removed |
New: http.request.method_original |
Only captured when http.request.method is _OTHER |
New: error.type |
References:
Change | Comments |
---|---|
http.url → url.full |
|
http.resend_count → http.request.resend_count |
|
net.peer.name → server.address |
|
net.peer.port → server.port |
Now captured even when same as default port for scheme |
{.td-initial .table .table-responsive .ot-table-first-row-60}
References:
Change | Comments |
---|---|
http.route |
No change |
http.target → url.path and url.query |
Split into two separate attributes |
http.scheme → url.scheme |
Now factors in [X-Forwarded-Proto][], [Forwarded#proto][] headers |
http.client_ip → client.address |
If http.client_ip was unknown (i.e., no [X-Forwarded-For][], [Forwarded#for][] headers), then net.sock.peer.addr → client.address ; now must be provided to sampler |
net.host.name → server.address |
Now based only on Host, :authority, [X-Forwarded-Host][], [Forwarded#host][] headers |
net.host.port → server.port |
Now based only on Host, :authority, [X-Forwarded-Host][X-Forwarded-Host], [Forwarded#host][] headers |
References:
The {http.method}
portion of span names is replace by HTTP
when
{http.method}
is _OTHER
.
See note if migrating from <= v1.17.0
.
References:
Metric changes:
http.client.duration
→ http.client.request.duration
ms
→ s
Measures the duration of inbound HTTP requests.
→
Duration of HTTP server requests.
Attribute change | Comments |
---|---|
http.method → http.request.method |
Now captures only 9 common HTTP methods by default plus _OTHER |
http.status_code → http.response.status_code |
|
net.peer.name → server.address |
|
net.peer.port → server.port |
Now captured even when same as default port for scheme |
net.sock.peer.addr |
Removed |
net.protocol.name → network.protocol.name |
Recommended → Conditionally required if not http and network.protocol.version is set |
net.protocol.version → network.protocol.version |
Examples fixed: 2.0 → 2 and 3.0 → 3 ; see note if migrating from <= v1.19.0 |
New: error.type |
References:
Metric changes:
http.server.duration
→ http.server.request.duration
ms
→ s
Measures the duration of inbound HTTP requests.
→
Duration of HTTP server requests.
Attribute change | Comments |
---|---|
http.route |
No change |
http.method → http.request.method |
Now captures only 9 common HTTP methods by default plus _OTHER |
http.status_code → http.response.status_code |
|
http.scheme → url.scheme |
Now factors in [X-Forwarded-Proto span][X-Forwarded-Proto], [Forwarded#proto span][Forwarded#proto] headers |
net.protocol.name → network.protocol.name |
Recommended → Conditionally required if not http and network.protocol.version is set |
net.protocol.version → network.protocol.version |
Examples fixed: 2.0 → 2 and 3.0 → 3 ; see note if migrating from <= v1.19.0 |
net.host.name → server.address |
• Recommended → Opt-In (due to high-cardinality vulnerability since based on HTTP headers) • Now based only on Host span, :authority span, [X-Forwarded-Host span][X-Forwarded-Host], [Forwarded#host span][Forwarded#host] headers |
net.host.port → server.port |
• Recommended → Opt-In (due to high-cardinality vulnerability since based on HTTP headers) • Now based only on Host span, :authority span, [X-Forwarded-Host span][X-Forwarded-Host], [Forwarded#host span][Forwarded#host] headers |
New: error.type |
References:
[Forwarded#for]: https://developer.mozilla.org/docs/Web/HTTP/Headers/Forwarded#for [Forwarded#proto]: https://developer.mozilla.org/docs/Web/HTTP/Headers/Forwarded#proto [Forwarded#host]: https://developer.mozilla.org/docs/Web/HTTP/Headers/Forwarded#host [X-Forwarded-For]: https://developer.mozilla.org/docs/Web/HTTP/Headers/X-Forwarded-For [X-Forwarded-Proto]: https://developer.mozilla.org/docs/Web/HTTP/Headers/X-Forwarded-Proto [X-Forwarded-Host]: https://developer.mozilla.org/docs/Web/HTTP/Headers/X-Forwarded-Host
Please check out #612
We discussed the impact of the http semantic convention changes to the Operator during our SIG call. The consensus was that it is worrying haha The operator adds an extra layer of obfuscation to all the changes, which makes the release notes of the languages even more important. I'd like to continue to push for a standardized template for all languages to use. I also still push for each language to make it abundantly clear exactly which libraries are affected.
For the operator specifically we've created https://github.com/open-telemetry/opentelemetry-operator/issues/2542 to track our discussion on how we'll help our users understand the impact of these changes.
I also still push for each language to make it abundantly clear exactly which libraries are affected.
On this topic I'd very much like to revisit the idea of a matrix. We created https://docs.honeycomb.io/getting-data-in/semconv/ and would very much like to contribute it somewhere(s) meaningful.
On this topic I'd very much like to revisit the idea of a matrix. We created docs.honeycomb.io/getting-data-in/semconv and would very much like to contribute it somewhere(s) meaningful.
I think that is a great starting point to adopt/follow. But I still think this should not be placed/managed in the semconv repo, but in each SDK/Instrumentation repo.
but in each SDK/Instrumentation repo.
I thinking about this problem more and more I agree. As an end users, I should be able to look at any changelog/release and instantly comprehend that there are breaking changes that affect me. I should also be able to look at any instrumentation's documentation and instantly understand if it is emitting stable http semantic conventions or allows the use of the OTEL_SEMCONV_STABILITY_OPT_IN env variable.
I am also in favor of a centralize location that can provide a quick guide to compatibility like the matrices we published. These could link to the more specific documentation. They could also live at some top level in the repos or on each languages doc page in the otel site. Basically I want users to see this information no matter where they go investigating the language/library.
I recognize that this adds extra maintenance to the instrumentation maintainers, but I think we owe it to our end users. We need to be able to make these important breaking changes, but we also must make it clear how it impacts users.
We discussed this a bit in the TC. Regarding having every language SiG provide consistent documentation on what has been broken, the following is true:
Going forward, we're going to institute a policy where semconv WG's (and semconv itself) directly notifies language SIGs (via opened issues) on breaking changes and expectations. For HTTP, e.g., this bug would include the details on what changed in semconv (from the blog post) and would ask Language SIGs to update their instrumetnation to be in compliance. We can also call out central documentation areas where languages can contribute their migration paths. Then it would be up to language SiGs to fill out information as they perform the migration.
We do not think we can centrally own the migration, instead relying on the current Specification Change -> Language/Collector SiG notification channel, so let's improve how that communication happens.
I was looking through https://github.com/open-telemetry/semantic-conventions/tree/main/docs/http#semantic-conventions-for-http and I am worried that users looking to understand how the breaking changes of the semantic convention will not have a clear path to understanding. I don't spend a lot of time in the spec/semconv, but I spend a lot of time in the project and I feel confused about certain things - I am worried about users who are purely consumers of OTEL.
My current questions:
Ultimately I would love to see a page dedicated to outlining the migration steps. As a user I want to see exactly what values changed, what was dropped, what was added, etc. I'd also want to see which languages support the new values, and any important deadlines.
I feel the blog post was really good and I know a lot of details went into the changelog, but both of those locations aren't great for discoverability. Important details in release notes, blog posts, and changelogs get moved downward as time goes on, making them harder to find. I really want a permanent, dedicated, discoverable space for us to communicate this changes to our end users.
Hopefully any patterns we do now can help with any future semconv breaking changes.