kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

OpenTelemetry traceparent header is not set when using Grpc backend #10319

Open EraYaN opened 1 year ago

EraYaN commented 1 year ago

What happened:

The OpenTelemetry traceparent header is not set when using a gRPC backend. It's just missing. The same backend does receive it for HTTP/1.1 requests. Below I included a test h2c service that echoes the full request.

EDIT: Also, if I send a traceparent header from the outside, it is passed through unchanged even when opentelemetry-trust-incoming-span is disabled, but maybe that is by design.

What you expected to happen:

It should have been included just like for HTTP/1.1 requests.

versions and env

**NGINX Ingress controller version**:

- Release: v1.8.1
- Build: dc88dce9ea5e700f3301d16f971fa17c6cfe757d
- Repository: https://github.com/kubernetes/ingress-nginx
- nginx version: nginx/1.21.6

**Kubernetes version** (use `kubectl version`): Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"ec73e42cca0cf369574e1cdaaff35401083080d8", GitTreeState:"clean", BuildDate:"2023-06-12T18:43:37Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"}

**Environment**: Azure Kubernetes Service, calico for networking and Azure RBAC

- **Cloud provider or hardware configuration**: Azure
- **OS** (e.g. from /etc/os-release): Alpine Linux v3.18
- **Kernel** (e.g. `uname -a`): Linux ingress-nginx-controller-77f5cfbf84-g8rtf 5.15.0-1041-azure #48-Ubuntu SMP Tue Jun 20 20:34:08 UTC 2023 x86_64 Linux
- **Install tools**: AKS, deployed using terraform
- **Basic cluster related info**:
nodes

```
NAME                                STATUS   ROLES   AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
aks-agentpool-20658013-vmss000000   Ready    agent   22d    v1.27.1   10.8.1.246                  Ubuntu 22.04.2 LTS               5.15.0-1041-azure   containerd://1.7.1+azure-1
aks-agentpool-20658013-vmss000001   Ready    agent   22d    v1.27.1   10.8.0.253                  Ubuntu 22.04.2 LTS               5.15.0-1041-azure   containerd://1.7.1+azure-1
aks-agentpool-20658013-vmss000002   Ready    agent   22d    v1.27.1   10.8.0.4                    Ubuntu 22.04.2 LTS               5.15.0-1041-azure   containerd://1.7.1+azure-1
aks-agentpool-20658013-vmss000003   Ready    agent   7d1h   v1.27.1   10.8.3.85                   Ubuntu 22.04.2 LTS               5.15.0-1041-azure   containerd://1.7.1+azure-1
akswin1000000                       Ready    agent   22d    v1.27.1   10.8.2.239                  Windows Server 2022 Datacenter   10.0.20348.1850     containerd://1.6.21+azure
akswin1000001                       Ready    agent   22d    v1.27.1   10.8.3.34                   Windows Server 2022 Datacenter   10.0.20348.1850     containerd://1.6.21+azure
```
helm values

```yaml
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    enabled: true
    patch:
      enabled: true
  config:
    enable-brotli: true
    enable-ocsp: true
    enable-opentelemetry: "true"
    limit-conn-status-code: 503
    limit-req-status-code: 429
    opentelemetry-operation-name: HTTP $request_method $service_name $uri
    opentelemetry-trust-incoming-span: "false"
    otel-sampler: AlwaysOn
    otel-sampler-parent-based: "false"
    otel-sampler-ratio: "1.0"
    otel-service-name: ingress-nginx
    otlp-collector-host: tempo.monitoring.svc
    otlp-collector-port: "4317"
    ssl-dh-param: ingress-nginx/ingress-dh-params
    use-forwarded-headers: "false"
    use-gzip: true
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
  nodeSelector:
    kubernetes.io/os: linux
  opentelemetry:
    enabled: true
  podLabels:
    log-by-promtail: "true"
  replicaCount: 3
  service:
    externalTrafficPolicy: Local
    sessionAffinity: ClientIP
```
state

- **Current State of the controller**:

```
Name:         nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.8.1
              helm.sh/chart=ingress-nginx-4.7.1
Annotations:  meta.helm.sh/release-name: ingress-nginx
              meta.helm.sh/release-namespace: ingress-nginx
Controller:   k8s.io/ingress-nginx
Events:
```
ingress spec

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC # THIS IS THE PROBLEM
    nginx.ingress.kubernetes.io/server-snippet: |2
      client_body_timeout "1200s";
      grpc_send_timeout "1200s";
      grpc_read_timeout "1200s";
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  creationTimestamp: "2023-08-15T12:41:28Z"
  generation: 1
  labels:
    app: echo-test
    app.kubernetes.io/instance: echo-test
    app.kubernetes.io/managed-by: calcasa-operator
    app.kubernetes.io/name: echo-test
    app.kubernetes.io/version: latest
  name: echo-test
  namespace: default
  ownerReferences:
  - apiVersion: cal.casa.eu/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CalcasaService
    name: echo-test
    uid: 6bd1d902-a589-4c65-b34b-3d9a1735b60f
  resourceVersion: "18049478"
  uid: c191ae29-c0ec-47d0-8c72-00a3dae00100
spec:
  ingressClassName: nginx
  rules:
  - host: echo-test.01.c.calcasa.nl
    http:
      paths:
      - backend:
          service:
            name: echo-test
            port:
              name: http
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - echo-test.01.c.calcasa.nl
    secretName: echo-test-ingress-crt
status:
  loadBalancer:
    ingress:
    - ip: {snip,public ip}
```
curl requests and response

```
curl https://{snip,host}/
Request echo!

GET / HTTP/2.0
Host: {snip,host}
Accept: */*
User-Agent: curl/7.81.0
X-Forwarded-For: {snip,source ip}
X-Forwarded-Host: {snip,host}
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Scheme: https
X-Real-Ip: {snip,source ip}
X-Request-Id: 22221af42e08fbd6ca7c87add38c82c7
X-Scheme: https
```

That output should contain a traceparent header. When you set nginx.ingress.kubernetes.io/backend-protocol to http and make the server a simple HTTP echo, the header is present. This is the h2c echo server I used:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
    "net/http/httputil"

    "golang.org/x/net/http2"
    "golang.org/x/net/http2/h2c"
)

func main() {
    h2s := &http2.Server{}

    // Echo the full incoming request (request line, headers, body) back to the client.
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        log.Printf("%s [%s] %s %s", r.RemoteAddr, r.Method, r.Host, r.RequestURI)
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("Request echo!\n\n"))
        req, err := httputil.DumpRequest(r, true)
        if err != nil {
            fmt.Fprint(w, err)
            return
        }
        w.Write(req)
    })
    addr := ":8080"
    // h2c.NewHandler accepts cleartext HTTP/2 (what grpc_pass speaks) as well as HTTP/1.1.
    s := &http.Server{
        Addr:    addr,
        Handler: h2c.NewHandler(handler, h2s),
    }

    log.Printf("Starting server on %s\n", addr)
    if err := s.ListenAndServe(); err != http.ErrServerClosed {
        log.Fatal(err)
    }
}
```

How to reproduce this issue:

Install some cluster, I doubt it matters (genuinely)

Install the ingress controller

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml

Add the ConfigMap settings listed above (the `config` section of the helm values) to enable OpenTelemetry

Install an application that will act as default backend (is just an echo app)

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml

Create an ingress (please add any additional annotation required)

Create an ingress like the one above, with the GRPC backend protocol

make a request

curl https://{domain}/
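The check on the echoed response can also be scripted. This is a rough sketch that scans the text returned by the echo service for a given header; `findHeader` is a hypothetical helper, and the sample dump in `main` is abbreviated from the curl output in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// findHeader scans an echoed request dump line by line and returns the
// value of the first header whose name matches, ignoring case.
func findHeader(dump, name string) (string, bool) {
	for _, line := range strings.Split(dump, "\n") {
		line = strings.TrimRight(line, "\r")
		key, value, ok := strings.Cut(line, ":")
		if ok && strings.EqualFold(strings.TrimSpace(key), name) {
			return strings.TrimSpace(value), true
		}
	}
	return "", false
}

func main() {
	// Abbreviated copy of the response seen with backend-protocol: GRPC.
	dump := "Request echo!\n\nGET / HTTP/2.0\r\nHost: example\r\nX-Scheme: https\r\n"
	if v, ok := findHeader(dump, "traceparent"); ok {
		fmt.Println("traceparent present:", v)
	} else {
		fmt.Println("traceparent missing")
	}
}
```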

Anything else we need to know:

This probably means that grpc_pass does not respect opentelemetry_propagate (or vice versa). The generated config looks fine otherwise, and the same endpoint works for normal HTTP/1.1 requests to the same vhost.

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

esigo commented 1 year ago

/assign

deejgregor commented 1 year ago

Thanks for opening up this issue--I forgot to open one when I noticed the same thing.

FWIW, here is the workaround we are using at the moment in Lokahi on our gRPC ingress:

nginx.ingress.kubernetes.io/configuration-snippet: |
  grpc_set_header 'traceparent' $opentelemetry_context_traceparent;
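To confirm that a workaround like this delivers a well-formed value to the backend, the received header can be checked against the W3C Trace Context layout for version 00 (`version-traceid-parentid-flags`, lowercase hex). The sketch below is an illustration only; `validTraceparent` is a hypothetical helper, and it deliberately rejects anything that does not match the four-field layout.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// traceparentRe matches the four dash-separated lowercase hex fields:
// 2-char version, 32-char trace-id, 16-char parent-id, 2-char flags.
var traceparentRe = regexp.MustCompile(`^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$`)

// validTraceparent reports whether v has the four-field layout and
// avoids the values the specification marks invalid: version "ff",
// an all-zero trace-id, and an all-zero parent-id.
func validTraceparent(v string) bool {
	m := traceparentRe.FindStringSubmatch(v)
	if m == nil {
		return false
	}
	version, traceID, parentID := m[1], m[2], m[3]
	if version == "ff" {
		return false
	}
	if traceID == strings.Repeat("0", 32) || parentID == strings.Repeat("0", 16) {
		return false
	}
	return true
}

func main() {
	// Example value taken from the W3C Trace Context specification.
	fmt.Println(validTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
}
```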


deejgregor commented 1 year ago

I'll also note that there's a similar issue with propagating to auth endpoints in #9811.

github-actions[bot] commented 1 year ago

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or a request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

tronda commented 5 months ago

We are currently struggling with the same issue with our deployment.