signalfx / splunk-otel-java

Splunk Distribution of OpenTelemetry Java
https://docs.splunk.com/Observability/gdi/get-data-in/application/java/get-started.html
Apache License 2.0

ZipkinSpanExporter - Failed to export spans #840

Closed: pkemp-bh closed this issue 1 year ago

pkemp-bh commented 2 years ago

Snippet of k8s config for the collector:

    image:
      otelcol:
        repository: quay.io/signalfx/splunk-otel-collector
        tag: 0.35.0 #version
        pullPolicy: IfNotPresent

    receivers:
      sapm:
        endpoint: 0.0.0.0:7276
      smartagent/docker-container-stats:
        dockerURL: unix:///var/run/docker.sock
        excludedImages:
        - '*pause-amd64*'
        - k8s.gcr.io/pause*
        labelsToDimensions:
          io.kubernetes.container.name: container_spec_name
          io.kubernetes.pod.name: kubernetes_pod_name
          io.kubernetes.pod.namespace: kubernetes_namespace
          io.kubernetes.pod.uid: kubernetes_pod_uid
        type: docker-container-stats
      smartagent/kubelet-metrics:
        kubeletAPI:
          authType: serviceAccount
          url: https://localhost:10250
        type: kubelet-metrics
      smartagent/signalfx-forwarder:
        defaultSpanTags:
          environment: nonprod 
        listenAddress: 0.0.0.0:9080
        type: signalfx-forwarder

    exporters:
      sapm:
        access_token: ...
        access_token_passthrough: true
        endpoint: https://ingest.us1.signalfx.com/v2/trace
        max_connections: 100
        num_workers: 8

    processors:
      batch:
      batch/2:
        #send_batch_size: 10000
        timeout: 10s
        # send_batch_max_size (default = 0): the upper limit of the batch size.
        # 0 means no upper limit. This property ensures that larger batches are
        # split into smaller units. It must be greater than or equal to send_batch_size.
        send_batch_max_size: 10000

Debug logs: otel_logs.txt

mateuszrzeszutek commented 2 years ago

Hey @pkemp-bh,

Have you tried configuring the collector to use a zipkin receiver instead of the signalfx-forwarder? Also, the collector image you're running is 10 months old; have you tried using a more recent version? If so, does the problem still persist?
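As a rough sketch of the first suggestion, a zipkin receiver could be added to the collector config along these lines. The endpoint uses the conventional Zipkin port, and the `service` pipeline wiring is an assumption, since the posted snippet doesn't show that section; the `batch` processor and `sapm` exporter names are taken from the posted config.

```yaml
# Sketch only: replaces smartagent/signalfx-forwarder with the zipkin receiver.
receivers:
  zipkin:
    # Conventional Zipkin port; adjust to whatever the pods can reach.
    endpoint: 0.0.0.0:9411

service:
  pipelines:
    traces:
      # Assumed wiring; the original snippet does not include a service section.
      receivers: [zipkin]
      processors: [batch]
      exporters: [sapm]
```

With something like this in place, the agent's Zipkin exporter would need to point at `http://<collector-host>:9411/api/v2/spans` (where `<collector-host>` is a placeholder) rather than the signalfx-forwarder port 9080.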

If it is possible for you, I'd also strongly recommend using OTLP as the trace protocol. Zipkin as a protocol has lots of downsides: for example, you lose resource data and span links, and span events are severely limited (no attributes).
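A minimal sketch of the OTLP route, under a few assumptions: the collector exposes the OTLP gRPC receiver on port 4317, the pipeline wiring mirrors the names in the posted config, and the application container sets the standard OpenTelemetry environment variables. The `otel-collector` hostname below is hypothetical.

```yaml
# Collector side (sketch): accept OTLP over gRPC.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

service:
  pipelines:
    traces:
      # Assumed wiring; processor and exporter names come from the posted config.
      receivers: [otlp]
      processors: [batch]
      exporters: [sapm]
---
# Application side (sketch): fragment of a Kubernetes container spec.
env:
  - name: OTEL_TRACES_EXPORTER
    value: otlp
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # Hypothetical service name; use the address your pods use to reach the collector.
    value: http://otel-collector:4317
```

Whether the exporter needs to be set explicitly depends on the agent version in use, so treat the `env` entries as illustrative rather than required.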

breedx-splk commented 2 years ago

Hi @pkemp-bh. Have you had a chance to look at the question asked on Jul 11? We will close this issue as stale if we don't get a response. Thanks.

mateuszrzeszutek commented 1 year ago

Hey @pkemp-bh, please reopen this issue or create a new one if you're still experiencing this problem.