Closed hiddewie closed 8 months ago
CC @cgilmour
A fix for this was made by my colleague @dgoffredo in #29932, and looks like it has been backported as well.
Thank you for the quick answer. I was not aware of that pull request (somehow it evaded my GitHub searches). The PR seems to cover my issue, so I will wait for the release that includes the fix. This issue can be closed.
A release including the fix shipped very recently (literally - within the last few hours). It's included in both the 1.27.x and 1.26.x releases.
I can confirm that Envoy 1.27.2 changes the operation name back from `ingress` to `envoy.proxy`. However, the resource name is not taken from the tag `resource.name` like it was, and is still `ingress`.
The reproduction Nomad job above, but using the image `envoyproxy/envoy:v1.27.2`, produces the following traces:
So my main question remains: how can I influence the generated Datadog resource name from the tags configuration in the Envoy tracing configuration so it is not the fixed string `ingress`?
Thanks @hiddewie, you're right: it corrects only the operation name, not the additional behavior of setting some details via `custom_tags`. @dgoffredo and I will take a look at this.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
Backports in #30892 & #30893
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
not stale
Heads up: I found another way that Datadog spans can have the wrong "operation name": https://github.com/envoyproxy/envoy/pull/31366
If Envoy cuts a release before that pull request is backported, then most of the situations where "operation name" is incorrect will be fixed, except for one:
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
not stale
@hiddewie does this have a corresponding pr/commit on main? Generally any changes should be landed on main and backported; not sure of the context here tho
The fix for this is available in the recently released v1.29.0. That release also contains fixes for other Datadog bugs that were introduced in v1.27.
ok - so iiuc this needs to be backported to ~1.26, and~ 1.28 also ?
if there is anything else that still needs backporting please lmk
All of the currently known Datadog bugs that were introduced with v1.27 have been patched on all of: the main branch, the v1.27 release branch, and the v1.28 release branch.
can we close this?
I'm happy to close this, but let's wait for confirmation from @hiddewie that upgrading to v1.29 is a solution.
I will do a test-deployment with envoy 1.29 and see how the result is in datadog.
I can confirm that 1.29.0 provides the same functionality in Datadog as the 1.26.3 version that we were previously running. This solves the issue from my point of view.
Thanks for all the help!
@hiddewie @phlax should we backport those fixes?
@zirain my understanding from https://github.com/envoyproxy/envoy/issues/30235#issuecomment-1896593037 is that it's not necessary
let me check.
There were three ways that the operation name could be set to an unexpected value. Each fix was part of its own pull request to main, which is now part of the v1.29 release, and each was backported onto both the v1.27 and v1.28 release branches.
PRs:
`Span::setOperation` (as part of an unrelated fix):
`Span::spawnChild`:
@dgoffredo thanks for your reply, makes sense to me.
Title
Envoy 1.27: Datadog operation changed from `envoy.proxy` to `ingress` and resource name unset
Description
Since Envoy version 1.27, the Datadog traces are emitted with a different operation ID: `ingress` instead of `envoy.proxy`. In addition, the resource name set by the tracing tag `resource.name` is no longer propagated to Datadog.

This seems related to the discussion in envoyproxy/envoy#21083 and to PR envoyproxy/envoy#26284, which replaced the Datadog tracing implementation with another library.
Envoy versions 1.26.3 or earlier are fine, and produce the correct traces with the correct operation IDs, resources and tags.
Observed behaviour: The Datadog operation name is `ingress`, and the resource name is `ingress`.

Expected behaviour: The Datadog operation name is `envoy.proxy`, and the resource name is as configured by the tracing tag `resource.name`.
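For readers unfamiliar with the setup, the kind of tracing configuration this refers to looks roughly like the sketch below. It is not the configuration from the Nomad job. It is a minimal illustration of the `tracing` block of an HTTP connection manager, with a Datadog provider and a `resource.name` custom tag. The cluster name `datadog_agent`, the service name `my-service` and the literal tag value are placeholders chosen for this example.

```yaml
# Illustrative fragment only: goes inside the typed_config of an
# envoy.filters.network.http_connection_manager filter.
tracing:
  provider:
    name: envoy.tracers.datadog
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v3.DatadogConfig
      # Placeholder: name of a cluster that points at the Datadog agent.
      collector_cluster: datadog_agent
      # Placeholder service name.
      service_name: my-service
  custom_tags:
    # With 1.26.x (and, per this thread, 1.29.0) this tag is expected to
    # become the Datadog resource name of the span.
    - tag: resource.name
      literal:
        # Placeholder value; a request_header or environment source
        # could be used instead of a literal.
        value: my-resource
```

With a configuration of this shape, the expectation described above is a span whose operation name is `envoy.proxy` and whose resource name is the configured tag value, rather than `ingress` for both.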
The release notes of 1.27 contain no deprecations, nor configuration updates that are needed to preserve the 1.26.x behaviour: https://www.envoyproxy.io/docs/envoy/latest/version_history/v1.27/v1.27.0.
Repro steps
Nomad job (contains the Envoy configuration)
In the container:
With container `envoyproxy/envoy:v1.26-latest`:
With container `envoyproxy/envoy:v1.27-latest`:
Admin and Stats Output
See attached files:
stats.txt server_info.txt routes.txt clusters.txt
Config
See content of Nomad job
Logs
Access logs for 1.27