vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

`datadog_agent` incorrectly casts `u64` values from the Agent as `i64` in order to fit into the `Value` enum. #14687

Open neuronull opened 2 years ago

neuronull commented 2 years ago

Problem

In the `datadog_agent` source, when decoding incoming trace payloads, we cast `u64` values to `i64`. Any value larger than `i64::MAX` overflows, which makes the data inconsistent.

Luckily, we are not doing math with the affected fields on the datadog_traces sink side.

However, the payloads we send out from the `datadog_traces` sink contain incorrect values for `parent_id`, `trace_id`, and `span_id` whenever the incoming value is greater than the maximum that fits in an `i64`.
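To make the failure mode concrete, here is a minimal sketch of what a plain `as` cast does to a 64-bit ID, alongside a checked conversion that detects the affected values. The function names (`cast_lossy`, `cast_checked`, `recover`) are hypothetical and are not taken from the Vector codebase. This only illustrates the Rust semantics involved and is not a proposed fix.

```rust
use std::convert::TryFrom;

/// Hypothetical helper mimicking the lossy conversion: an `as` cast
/// reinterprets the 64 bits as two's complement and never fails.
fn cast_lossy(id: u64) -> i64 {
    id as i64
}

/// Hypothetical checked alternative: returns `None` when the ID is
/// larger than `i64::MAX` instead of silently producing a negative number.
fn cast_checked(id: u64) -> Option<i64> {
    i64::try_from(id).ok()
}

/// Reinterprets a stored `i64` as a `u64` with the same bit pattern.
/// This recovers the original ID only if no arithmetic or formatting
/// was applied to the signed value in between.
fn recover(stored: i64) -> u64 {
    stored as u64
}
```

With these helpers, `cast_lossy(u64::MAX)` yields `-1` while `cast_checked(u64::MAX)` yields `None`. Because the cast preserves the bit pattern, `recover(cast_lossy(id))` returns the original `id`. That is consistent with the note above that no math is done on these fields.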

This was observed while analyzing trace payloads for an enterprise user.

Internally we discussed a few options:

These are the locations in the code where the incorrect casting is occurring:

https://github.com/vectordotdev/vector/blob/05f57e49e95bf1be6e5c2ade21633bdbe5464898/src/sources/datadog_agent/traces.rs#L273-L275

https://github.com/vectordotdev/vector/blob/05f57e49e95bf1be6e5c2ade21633bdbe5464898/src/sources/datadog_agent/traces.rs#L217

Configuration

[sources.source0]
type = "datadog_agent"
address = "0.0.0.0:8181"
multiple_outputs = true
store_api_key = false

[sinks.sink0]
inputs = [ "source0.traces" ]
type = "datadog_traces"
default_api_key = "${TEST_DATADOG_API_KEY}"
compression = "none"

Version

0.25.0

Debug Output

No response

Example Data

No response

Additional Context

No response

References

No response

jszwedko commented 1 year ago

Noting that we may also need to support 128-bit trace ids to support OTel (both directly and sent through the Datadog Agent).

spencergilbert commented 1 year ago

Should we open a separate issue for the otel side of things?

jszwedko commented 1 year ago

I figured we'd run into it when adding trace support for the `opentelemetry` source, so we don't need to track it separately just yet.
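For context on the 128-bit trace IDs mentioned in this thread: OpenTelemetry trace IDs are 16 bytes wide, so they cannot fit in a single `u64` or `i64` field. The sketch below shows one way such an ID could be split into two 64-bit halves and joined back without loss. The function names are hypothetical, and how Vector would actually store or forward the halves is not decided here.

```rust
/// Hypothetical helper: splits a 128-bit trace ID into its
/// (high, low) 64-bit halves.
fn split_trace_id(id: u128) -> (u64, u64) {
    // The `as u64` cast on a u128 keeps only the low 64 bits.
    ((id >> 64) as u64, id as u64)
}

/// Hypothetical helper: rebuilds the 128-bit trace ID from its halves.
fn join_trace_id(high: u64, low: u64) -> u128 {
    (u128::from(high) << 64) | u128::from(low)
}

/// Interprets a 16-byte trace ID (big-endian, as it appears in hex form)
/// as a `u128`.
fn trace_id_from_bytes(bytes: [u8; 16]) -> u128 {
    u128::from_be_bytes(bytes)
}
```

Keeping both halves unsigned avoids the sign reinterpretation described in this issue. A 64-bit ID is then the special case where the high half is zero.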