apollographql / router

A configurable, high-performance routing runtime for Apollo Federation 🚀
https://www.apollographql.com/docs/router/
Other
783 stars 254 forks source link

Namespace not set when sending *directly* to Datadog #1162

Open krisztian-sala opened 2 years ago

krisztian-sala commented 2 years ago

I have set the Datadog namespace both with the DD_ENV environment variable (which works for other services) and also from the router configuration like so:

...
      tracing:
        trace_config:
          service_name: "${DD_SERVICE:apollo-router}"
          service_namespace: "${DD_ENV:my-namespace}"

But the namespace is still set as none

How can I set it correctly?

garypen commented 2 years ago

You router configuration is correct. In your various config elements your service_namespace will take the value of ${DD_ENV} or, if not set, "my-namespace".

I modified the source and made a debug build of the router to print out the config used to build the router trace information and confirmed it was setting service_namespace:

2022-05-26T12:46:45.100927Z  INFO apollo_router::plugins::telemetry::config: trace_config: Config { sampler: ParentBased(AlwaysOn), id_generator: IdGenerator { _private: () }, span_limits: SpanLimits { max_events_per_span: 128, max_attributes_per_span: 128, max_links_per_span: 128, max_attributes_per_event: 128, max_attributes_per_link: 128 }, resource: Some(Resource { attrs: {Key("process.executable.name"): String("tester"), Key("service.name"): String("apollo-router"), Key("service.namespace"): String("my-namespace")} }) }

(If you scroll right far enough above, you can see service_namespace is set to "my-namespace".)

I'm curious how you know the service_name is set to none in your case. Are you running your own build?

krisztian-sala commented 2 years ago

I think I explained it wrong. So I don't actually know what the service_namespace variable's value is, I'm using the v0.9.2 version, so no custom build. The router is running in Kubernetes and what I'm trying to achieve is to set the Kubernetes namespace that the router is part of as the env variable in Datadog.

Currently, whatever I do, the env is none image

I thought that the service_namespace sets that value. But I also tried by setting a custom trace attribute like so:

telemetry:
  tracing:
    trace_config:
      service_name: "${DD_SERVICE:apollo-router}"
      service_namespace: "${DD_ENV:my-namespace}"
      sampler: 1
      attributes: 
        env: "${DD_ENV:my-namespace}"
bnjjj commented 2 years ago

Apparently we're using service_namespace and pass it to attributes named with service.namespace (https://github.com/apollographql/router/blob/main/apollo-router/src/plugins/telemetry/config.rs#L224) . Therefore on your traces you should have an attribute named service.namespace with the value you're looking for. Let me know if it helps

krisztian-sala commented 2 years ago

Unfortunately I don't see that attribute or any other that I set in telemetry.tracing.trace_config.attributes appear in my trace. The only attributes that show up in Datadog are these:

Screenshot 2022-05-30 at 11-51-17 APM Traces Datadog

krisztian-sala commented 2 years ago

Any news about this? Not having proper monitoring is a blocker for us to go to production with the router.

It doesn't send any custom attribute to Datadog. But the most important would be the env, which is currently none in all cases.

BrynCooke commented 2 years ago

This is something that we need to implement. The existing datadog opentelemetry implementation is only understands service.name out of the box.

It doesn't help that what is currently called attributes in our config yaml should probably be called resource as it pertains to this: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/semantic_conventions/README.md

To move forward on this I suggest the following:

telemetry:
  tracing:
    common:
      attributes:
        static:
          - name: "version"
            value: "v1.0.0"
        request:
          - header:
               named: "content-type"
               rename: "payload_type"
               default: "application/json"
          - header:
              named: "x-custom-header-to-add"
        response:
          - body:
              path: errors.extensions.status
              name: extended_status
              default: optional_default_value
        context:
          - named: "foo"
            default: "application/json"

@bnjjj If we can come to an agreement on yaml format that sits nicely for both metrics and tracing then we can probably make this happen quickly. Let's discuss.

BrynCooke commented 2 years ago

We've settled on the following which should be consistent with the rest of our config:

telemetry:
  tracing: 
    common:
      resource:      
      attributes:
        router:
          static:
            - name: "version"
              value: "v1.0.0"
          request:
            - header:
                named: "content-type"
                rename: "payload_type"
                default: "application/json"
            - header:
                named: "x-custom-header-to-add"
          response:
            - body:
                path: errors.extensions.status
                name: extended_status
                default: optional_default_value
          context:
            - named: "foo"

        subgraph:
          all:
            static:
              - name: "version"
                value: "v1.0.0"
              request:
                - header:
                    named: "content-type"
                    rename: "payload_type"
                    default: "application/json"
                - header:
                    named: "x-custom-header-to-add"
              response:
                - body:
                    path: errors.extensions.status
                    name: extended_status
                    default: optional_default_value
              context:
                - named: "foo"
          subgraphs:
            products:
              static:
                - name: "version"
                  value: "v1.0.0"
                request:
                  - header:
                      named: "content-type"
                      rename: "payload_type"
                      default: "application/json"
                  - header:
                      named: "x-custom-header-to-add"
                response:
                  - body:
                      path: errors.extensions.status
                      name: extended_status
                      default: optional_default_value
                context:
                  - named: "foo"
bnjjj commented 2 years ago

related to #1270

bnjjj commented 1 year ago

This PR is currently blocked by https://github.com/open-telemetry/opentelemetry-rust/issues/876

abernix commented 1 year ago

@bnjjj Is this expected to be fixed via #1948?

bnjjj commented 1 year ago

Unfortunately no, we need a new release (0.19.0 or 0.18.1)

krisztiansala commented 1 year ago

Any news on this?

BrynCooke commented 1 year ago

@krisztiansala Unfortunately we are still waiting on an Otel release.

Are you using a custom router build? If so then you can use the latest router and use the following patch to have this working:

[patch.crates-io]
 opentelemetry = { git = "https://github.com/open-telemetry/opentelemetry-rust.git", rev = "e5ef3552efab2bdbf2f838023c37461cd799ab2c"}
 opentelemetry-http = { git = "https://github.com/open-telemetry/opentelemetry-rust.git", rev = "e5ef3552efab2bdbf2f838023c37461cd799ab2c"}
 opentelemetry-jaeger = { git = "https://github.com/open-telemetry/opentelemetry-rust.git", rev = "e5ef3552efab2bdbf2f838023c37461cd799ab2c"}

We'll also look at doing something for our binary builds shortly.

krisztiansala commented 1 year ago

I've tried this, but it didn't work for me - everything is still the same. I will wait for the release, maybe it will be different.

abernix commented 1 year ago

This is now waiting on https://github.com/open-telemetry/opentelemetry-rust/pull/965, which we're hoping comes soon. (Suggested to be this weekend!)

abernix commented 1 year ago

The fix for this is purportedly is now in https://github.com/open-telemetry/opentelemetry-rust/releases/tag/v0.19.0. I'll see about getting the integration of that upgrade prioritized on our side.

abernix commented 1 year ago

As an update, we are still waiting on an upstream dependency (one which depends on opentelemetry-rust, itself) to actually move this along — https://github.com/tokio-rs/tracing-opentelemetry/pull/9.

garypen commented 1 year ago

I spent a little time investigating this as part of the update to opentelemetry 0.19.0. There isn't a specific method for capturing service_namespace() in the opentelemetry-datadog library. I tried adding some code to manually copy the setting has an attribute and this didn't work.

The only workaround that I could concoct was to explicitly add configuration for service_namespace as an attribute in trace_config, which is clearly undesirable. We'll have to leave this open for now and return to it after 0.19.0 is merged.

BrynCooke commented 9 months ago

Let's submit a PR to otel.