opentelemetry-beam / opentelemetry_exporter

Exporter supporting the OpenTelemetry Protocol (OTLP)
Apache License 2.0

Spans missing attribute values #12

Closed denic closed 4 years ago

denic commented 4 years ago

Hi,

I am trying to use opentelemetry_exporter from an Elixir application. I use the opentelemetry-collector to receive the traces and export them to Zipkin and Jaeger (see Attachments). For my test setup I added instrumentation to Plug and create a trace as follows:

...
    OpenTelemetry.Tracer.start_span(span_name, %{})

    OpenTelemetry.Span.set_attribute("xxx.num", 23)
    OpenTelemetry.Span.set_attribute("xxx.string", "XXXXXXXXXXXXXXX")
...

The problem is that only the attribute keys (xxx.num, xxx.string) are visible in Zipkin/Jaeger, not the values.

I added a debug statement right before the protobuf conversion and after it (in https://github.com/opentelemetry-beam/opentelemetry_exporter/blob/master/src/opentelemetry_exporter.erl#L44).

To decode the encoded binary again I use opentelemetry_exporter_trace_service_pb:decode_msg(Proto, export_trace_service_request). The two debug statements print the following.

"_____ ENCODE"
#{resource_spans=>[#{instrumentation_library_spans=>[#{instrumentation_library=>#{name=><<"otel_test">>,version=><<"0.1.0">>},spans=>[#{attributes=>[#{int_value=>200,key=><<"http_status">>,type=>'INT'},#{key=><<"xxx.string">>,string_value=><<"XXXXXXXXXXXXXXX">>,type=>'STRING'},#{int_value=>23,key=><<"xxx.num">>,type=>'INT'}],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517772487994154,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],local_child_span_count=>undefined,name=><<"/">>,parent_span_id=><<>>,span_id=><<82,170,45,138,130,251,5,102>>,start_time_unix_nano=>1598517771495100168,status=>#{},trace_id=><<140,220,44,235,108,35,89,207,135,175,94,14,227,86,43,164>>,trace_state=>[]},#{attributes=>[],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517772487685932,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],local_child_span_count=>undefined,name=><<"page_controller.index">>,parent_span_id=><<82,170,45,138,130,251,5,102>>,span_id=><<159,170,200,167,198,10,242,11>>,start_time_unix_nano=>1598517771495181676,status=>#{},trace_id=><<140,220,44,235,108,35,89,207,135,175,94,14,227,86,43,164>>,trace_state=>[]},#{attributes=>[#{int_value=>200,key=><<"http_status">>,type=>'INT'},#{key=><<"xxx.string">>,string_value=><<"XXXXXXXXXXXXXXX">>,type=>'STRING'},#{int_value=>23,key=><<"xxx.num">>,type=>'INT'}],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517769947794280,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],local_child_span_count=>undefined,name=><<"/">>,parent_span_id=><<>>,span_id=><<213,82,114,218,31,200,178,147>>,start_time_unix_nano=>1598517769880872698,status=>#{},trace_id=><<31,210,236,12,245,185,238,67,66,104,105,112,214,22,146,25>>,trace_state=>[]},#{attributes=>[],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517769916669126,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],local_child_span_count=>undefined,name=><<"page_controller.index">>,parent_span_id=><<213,82,114,218,31,200,178,147>>,span_id=><<26,196,156,12,250,73,32,144>>,start_time_unix_nano=>1598517769886148847,status=>#{},trace_id=><<31,210,236,12,245,185,238,67,66,104,105,112,214,22,146,25>>,trace_state=>[]}]}],resource=>#{attributes=>[],dropped_attributes_count=>0}}]}
"_____ ENCODE END"

"_____ DECODE"
#{resource_spans=>[#{instrumentation_library_spans=>[#{instrumentation_library=>#{name=><<"otel_test">>,version=><<"0.1.0">>},spans=>[#{attributes=>[#{key=><<"http_status">>},#{key=><<"xxx.string">>},#{key=><<"xxx.num">>}],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517772487994154,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],name=><<"/">>,parent_span_id=><<>>,span_id=><<82,170,45,138,130,251,5,102>>,start_time_unix_nano=>1598517771495100168,status=>#{code=>'Ok',message=><<>>},trace_id=><<140,220,44,235,108,35,89,207,135,175,94,14,227,86,43,164>>,trace_state=><<>>},#{attributes=>[],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517772487685932,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],name=><<"page_controller.index">>,parent_span_id=><<82,170,45,138,130,251,5,102>>,span_id=><<159,170,200,167,198,10,242,11>>,start_time_unix_nano=>1598517771495181676,status=>#{code=>'Ok',message=><<>>},trace_id=><<140,220,44,235,108,35,89,207,135,175,94,14,227,86,43,164>>,trace_state=><<>>},#{attributes=>[#{key=><<"http_status">>},#{key=><<"xxx.string">>},#{key=><<"xxx.num">>}],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517769947794280,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],name=><<"/">>,parent_span_id=><<>>,span_id=><<213,82,114,218,31,200,178,147>>,start_time_unix_nano=>1598517769880872698,status=>#{code=>'Ok',message=><<>>},trace_id=><<31,210,236,12,245,185,238,67,66,104,105,112,214,22,146,25>>,trace_state=><<>>},#{attributes=>[],dropped_attributes_count=>0,dropped_events_count=>0,dropped_links_count=>0,end_time_unix_nano=>1598517769916669126,events=>[],kind=>'SPAN_KIND_UNSPECIFIED',links=>[],name=><<"page_controller.index">>,parent_span_id=><<213,82,114,218,31,200,178,147>>,span_id=><<26,196,156,12,250,73,32,144>>,start_time_unix_nano=>1598517769886148847,status=>#{code=>'Ok',message=><<>>},trace_id=><<31,210,236,12,245,185,238,67,66,104,105,112,214,22,146,25>>,trace_state=><<>>}]}],resource=>#{attributes=>[],dropped_attributes_count=>0}}]}
"_____ DECODE PROTO"

To me it looks like the values are somehow lost in the conversion.

Any hint as to what I am doing wrong is much appreciated! Thanks.

PS: Besides that, using the gRPC protocol does not work either, but that may be a story for another issue. :)

Attachments

opentelemetry-collector config

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:55680"
      http:
        endpoint: "localhost:55681"

processors:
  batch:
    send_batch_size: 1024
    timeout: 5s

exporters:
  logging:
    loglevel: debug
  zipkin:
    endpoint: "http://localhost:9411/api/v2/spans"
  # jaeger_thrift_http:
  #   url: "http://localhost:14268/api/traces"
  jaeger:
    endpoint: "localhost:14250"
    insecure: true

extensions:
  zpages: {}

service:
  extensions: [zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      # exporters: [zipkin, jaeger_thrift_http]
      exporters: [zipkin, jaeger]

tsloughter commented 4 years ago

Thanks. I bet the protos changed again :(. They aren't yet considered stable so they keep changing. It should go GA in September and then this issue won't keep coming up.

I'll verify and upgrade the protos later this morning and let you know.

tsloughter commented 4 years ago

Are you using the hex package? Have you tried using the git repo? I think there might be updates not released yet.

The changes I found in the protos are unrelated to the attributes, but I am fixing them up and will have a PR. I think you might see it working if you simply switch to the git repo until we make another release, which can be soon.

bryannaegele commented 4 years ago

I don't believe we've cut a release in a while so that looks like the issue with the old protos. https://github.com/opentelemetry-beam/opentelemetry_exporter/commit/c2e495065620cc6c06c57786b8db724b02367f27 is stable and should correct the issue you're seeing.

denic commented 4 years ago

> Are you using the hex package?

Yep :) Switched to using the git repo. Now everything looks fine. Thanks guys!
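
For anyone hitting the same problem before the next Hex release, the switch to the git repo can be made in mix.exs. This is a minimal sketch rather than something taken from the project docs: it assumes the exporter was previously declared as a plain Hex dependency, and it pins the commit linked earlier in this thread. Other OpenTelemetry packages in your dependency tree may need matching versions.

```elixir
# mix.exs (fragment). Assumption: the exporter was previously pulled from Hex.
defp deps do
  [
    # Fetch the exporter from GitHub instead of Hex, pinned to the commit
    # referenced in this thread.
    {:opentelemetry_exporter,
     github: "opentelemetry-beam/opentelemetry_exporter",
     ref: "c2e495065620cc6c06c57786b8db724b02367f27"}
  ]
end
```

Running `mix deps.get` afterwards fetches the dependency from the repository.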

And thanks for the work you put into this project.

tsloughter commented 4 years ago

@denic ok, good to hear. I'll get a release out after I get a new PR in with the latest updates to the protos, and we'll have to try to update the package more often :). Hopefully GA comes soon and there won't be so many changes.

bryannaegele commented 4 years ago

@tsloughter they haven't cut a new release of the protos yet, so I think we should hold off on that. The current commit we're at is the last release (0.4.0).

tsloughter commented 4 years ago

Eh, if the collector keeps moving, we should keep updating to keep up. Unless, I guess, the collector only tracks specific releases.

bryannaegele commented 4 years ago

The collector is tied to the protos release. It isn't tied to the nightly afaik. There are breaking changes in the unreleased protos that we're going to have to address across the main API and SDK, as well, like this one https://github.com/open-telemetry/opentelemetry-proto/commit/ca6dcbbf390d8b3ce419a931eb351d189a3eb4fa.

Maybe we just set up a branch to work against with the latest protos and be ready to release when they do. We should also probably add a compatibility matrix in the README.