vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

[opentelemetry source] Incorrect `message` when `body` is an `ArrayValue` #13926

Open spencergilbert opened 2 years ago

spencergilbert commented 2 years ago

When running otel-col and Vector locally with the following configs:

opentelemetry-collector:

receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:9876

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: localhost:4317
    tls:
      insecure: true
  otlphttp:
    endpoint: http://localhost:4318
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [logging,otlp,otlphttp]

vector:

data_dir = "/var/lib/vector/"

[api]
enabled = true

[sources.otel]
type = "opentelemetry"

[sources.otel.acknowledgements]

[sources.otel.grpc]
address = "0.0.0.0:4317"

[sources.otel.http]
address = "0.0.0.0:4318"

[sources.internal_metrics]
type = "internal_metrics"

[sinks.console]
inputs = ["otel.logs"]
target = "stdout"
type = "console"

[sinks.console.encoding]
codec = "json"

[sinks.console.healthcheck]
enabled = true

[sinks.console.buffer]
type = "memory"
max_events = 500
when_full = "block"

[sinks.metrics]
type = "prometheus_exporter"
inputs = [ "internal_metrics" ]
address = "0.0.0.0:9090"

Curling this input to the collector results in values being dropped from the body. This behavior is seen both in Vector and in the otel-collector's logging exporter.

Input:

{
  "resource_logs": [
    {
      "scope_logs": [
        {
          "log_records": [
            {
              "body": {
                "array_value": {
                  "values": [
                    {
                      "string_value": "string",
                      "int_value": 42,
                      "bool_value": true
                    }
                  ]
                }
              }
            }
          ]
        }
      ]
    }
  ]
}

Output:

{
  "dropped_attributes_count": 0,
  "message": [42],
  "observed_timestamp": "2022-08-10T20:06:01.143133Z",
  "timestamp": "2022-08-10T20:06:01.143133Z"
}

mx-psi commented 2 years ago

Nothing conclusive, but in case it helps, I wrote a small program using pdata (the module the Collector uses):

package main

import (
    "encoding/json"
    "fmt"

    "go.opentelemetry.io/collector/pdata/pcommon"
    "go.opentelemetry.io/collector/pdata/plog/plogotlp"
)

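func main() {
    // Build an array value holding one string, one int, and one bool element.
    arrayVal := pcommon.NewValueSlice()
    slice := pcommon.NewSliceFromRaw([]any{"string", 42, true})
    slice.CopyTo(arrayVal.SliceVal())

    // Create an OTLP logs request and use the array as the body of its only log record.
    request := plogotlp.NewRequest()
    arrayVal.CopyTo(
        request.Logs().ResourceLogs().AppendEmpty().ScopeLogs().AppendEmpty().LogRecords().AppendEmpty().Body())

    // Serialize the request to indented JSON and print it.
    bytes, err := json.MarshalIndent(request, "", "    ")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(bytes))
}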

It prints:

{
    "resourceLogs": [
        {
            "resource": {},
            "scopeLogs": [
                {
                    "scope": {},
                    "logRecords": [
                        {
                            "body": {
                                "arrayValue": {
                                    "values": [
                                        {
                                            "stringValue": "string"
                                        },
                                        {
                                            "intValue": "42"
                                        },
                                        {
                                            "boolValue": true
                                        }
                                    ]
                                }
                            },
                            "traceId": "",
                            "spanId": ""
                        }
                    ]
                }
            ]
        }
    ]
}

Reading the spec, I am not sure whether both camel case and snake case should be supported (the spec just says 'whatever proto3 does'). There are other differences as well (`resource`, `traceId`, `spanId`); I am not sure whether those are relevant either.