aws / aws-xray-sdk-go

AWS X-Ray SDK for the Go programming language.
Apache License 2.0

UDP message length too long for segments sent to daemon #467

Open DerkSchooltink opened 1 month ago

DerkSchooltink commented 1 month ago

I'm getting very frequent error messages such as:

[ERROR] write udp 127.0.0.1:45844->127.0.0.1:2000: write: message too long

I'm running a setup in ECS where my backend sends X-Ray traces over UDP to a container running in the same task. Apparently the segments I'm sending are too large. Although I could probably trim what I include in a segment on my end, I'd prefer not to sacrifice any context at all; it's very useful for debugging.

I was looking at the way segments are being sent: the Emitter and the segment packer.

It appears that there is no consideration at all for the maximum size of a UDP payload over IPv4: 65535 - 8 - 20 = 65507 bytes (8 for the UDP header, 20 for the IP header). I think it makes sense to make the SDK aware of this limitation.
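To make the arithmetic concrete, here is a minimal sketch of a size check that could run before a UDP write. It is not SDK code: `MaxUDPPayload`, `ErrSegmentTooLarge`, and `checkPayload` are hypothetical names, and the `header` argument only assumes that some fixed prefix is sent in the same datagram as the encoded segment.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxIPv4Datagram = 65535 // largest value of the 16-bit IPv4 total-length field
	ipv4HeaderLen   = 20    // IPv4 header without options
	udpHeaderLen    = 8     // fixed UDP header size

	// MaxUDPPayload is the largest payload one UDP datagram can carry over IPv4.
	MaxUDPPayload = maxIPv4Datagram - ipv4HeaderLen - udpHeaderLen // 65507
)

// ErrSegmentTooLarge is a hypothetical sentinel error, not part of the SDK.
var ErrSegmentTooLarge = errors.New("segment exceeds maximum UDP payload")

// checkPayload returns an error when header and body together would not fit
// into a single UDP datagram. Both slices are counted because they would be
// written to the socket as one message.
func checkPayload(header, body []byte) error {
	total := len(header) + len(body)
	if total > MaxUDPPayload {
		return fmt.Errorf("%w: %d bytes, limit %d", ErrSegmentTooLarge, total, MaxUDPPayload)
	}
	return nil
}
```

A caller could use `errors.Is(err, ErrSegmentTooLarge)` to decide whether to log, drop, or split the segment instead of letting the socket write fail with `message too long`.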

I'm not exactly sure how I would go about this, that's why I'm not proposing any solution. Should there be an explicit error returned when this limit is exceeded? Or should the SDK trim part of the segment to make it fit within this limit? I don't know. Let's open up a conversation! :)

jj22ee commented 1 month ago

Similar issue in the Java SDK: https://github.com/aws/aws-xray-sdk-java/issues/395 Trimming the segment data intelligently would be a technical challenge. It is better for the X-Ray SDK not to throw an error when the UDP size limit is exceeded, so that the SDK does not affect the runtime behavior of the main codebase.

As a workaround in the Go X-Ray SDK (not a particularly satisfactory one), you can create a new Streaming Strategy to limit the number of subsegments per UDP call: https://github.com/aws/aws-xray-sdk-go/blob/68a970b9b65be137c11ad292729761c8b557e7ab/xray/default_streaming_strategy.go#L37-L43

You can set Streaming Strategy through Config here: https://github.com/aws/aws-xray-sdk-go/blob/master/xray/config.go#L187
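The idea behind such a strategy can be sketched without the SDK. The example below does not use or reproduce the SDK's types; `packBatches` is a hypothetical helper that only illustrates bounding what goes into a single UDP write, by item count and by total bytes.

```go
package main

// packBatches groups already-encoded subsegments into batches so that each
// batch holds at most maxCount items and at most maxBytes bytes in total.
// An item that is larger than maxBytes on its own is placed alone in a batch;
// the caller still has to decide whether to drop or trim it.
func packBatches(items [][]byte, maxCount, maxBytes int) [][][]byte {
	var batches [][][]byte
	var cur [][]byte
	size := 0
	for _, it := range items {
		// Close the current batch if adding this item would break a limit.
		full := len(cur) >= maxCount || size+len(it) > maxBytes
		if len(cur) > 0 && full {
			batches = append(batches, cur)
			cur, size = nil, 0
		}
		cur = append(cur, it)
		size += len(it)
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}
```

Capping only the count, as the workaround does, keeps packets small on average but gives no hard guarantee; adding a byte budget like `maxBytes` is what actually ties the batch to the UDP limit.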

If you are able to look into a non-X-Ray workaround, you can check out OpenTelemetry at https://opentelemetry.io/, where the SDK and Collector use HTTP/gRPC instead of UDP.