aws / aws-app-mesh-roadmap

AWS App Mesh is a service mesh that you can use with your microservices to manage service-to-service communication.
Apache License 2.0

Bug: X-Ray traces missing "Remote" segments in Envoy 1.15+ #282

Closed by lavignes 4 years ago

lavignes commented 4 years ago


Summary

Envoy versions prior to 1.15 emitted X-Ray trace segments for "remote" upstreams even when the application Envoy was proxying was not instrumented with an X-Ray SDK. These segments appear in the X-Ray service map as a node called "remote". The "remote" node appears to represent every non-modeled upstream endpoint from Envoy's perspective, including the proxied application itself, though this still needs confirmation.

[Screenshot of the X-Ray service map, 2020-10-27]

It is not entirely clear whether the old behavior was itself a bug, but its removal is definitely a behavioral regression that needs to be addressed.

Steps to Reproduce

The Node.js example app in the App Mesh workshop (https://www.appmeshworkshop.com/x-ray/xray_nodejs/) demonstrates this behavior well, since it is not yet instrumented with the X-Ray SDK.

Are you currently working around this issue?

lavignes commented 4 years ago

This may actually be a bug in older versions of Envoy, where the "remote" node is just a duplicate subsegment of requests that reach the proxy. If that is the case, the service map here only appears to show the proxied application; what it actually shows is redundant traces to the virtual node under the "remote" namespace.

Need to confirm this.

lavignes commented 4 years ago

I've confirmed that this change was intentional and was signed off by the X-Ray team. However, the change in behavior was not well communicated.

The old behavior was misleading, since it emitted duplicate data for the same trace segment. I'm going to resolve this for now.

I do not believe this breaks anything other than the user's expectation of what the service map shows when their application is not instrumented.

Subsegments for the service being proxied by Envoy should be generated as long as the application is using the X-Ray SDK.
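
For anyone wondering what "using the X-Ray SDK" buys you here: as far as I understand it, an instrumented application reads the trace header forwarded by the proxy and emits its own segment linked to that trace, which is what makes it show up in the service map. Below is a rough, language-agnostic sketch of that first step, written in Python with only the standard library. It assumes the standard `X-Amzn-Trace-Id` header format (`Root=...;Parent=...;Sampled=...`); the function names `parse_trace_header` and `trace_context` are hypothetical and are not part of any SDK. In a real app the X-Ray SDK does this for you.

```python
from typing import Dict, Optional


def parse_trace_header(value: str) -> Dict[str, str]:
    """Split an X-Amzn-Trace-Id header value into its key=value fields."""
    fields: Dict[str, str] = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # skip empty or malformed parts that have no "="
            fields[key] = val
    return fields


def trace_context(header_value: str) -> Optional[dict]:
    """Return the pieces an app needs to attach its own segment to a trace.

    Hypothetical helper: returns None when the header carries no Root
    trace ID, in which case there is no upstream trace to continue.
    """
    fields = parse_trace_header(header_value)
    if "Root" not in fields:
        return None
    return {
        "trace_id": fields["Root"],
        "parent_id": fields.get("Parent"),  # ID of the upstream segment, if any
        "sampled": fields.get("Sampled") == "1",
    }


# Example header value in the documented X-Ray format.
header = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
context = trace_context(header)
```

With `context` in hand, an instrumented app would create a segment with the same trace ID and set its parent to `parent_id`, so the application appears as its own node rather than relying on the proxy's data.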

If anyone feels strongly that this behavior should be retained, feel free to re-open.