kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

OpenTelemetry reported "unknown variable" before first Ingress #11044

Open mtslzr opened 8 months ago

mtslzr commented 8 months ago

What happened:

While doing some Disaster Recovery testing, we re-deployed ingress-nginx with the following http-snippet to log trace IDs (pulled from https://github.com/opentracing-contrib/nginx-opentracing/issues/33):

http-snippet: |
        map $opentelemetry_context_traceparent $ot_trace_id {
          default "00000000000000000000000000000000";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$trace_id";
        }
        map $opentelemetry_context_traceparent $ot_span_id {
          default "0000000000000000";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$parent_id";
        }
        map $opentelemetry_context_traceparent $ot_trace_flags {
          default "00";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$trace_flags";
        }
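
To make the intent of the three maps concrete, here is a minimal Python sketch of what they compute once `$opentelemetry_context_traceparent` holds a value. It only illustrates the regex and the defaults; it does not address the startup error described in this issue. The helper name `map_traceparent` and the sample header value are assumptions made for illustration, and Python's `re` module spells named groups as `(?P<name>...)` where the nginx snippet uses `(?<name>...)`.

```python
import re

# Python spelling of the regex shared by the three map blocks.
# nginx (PCRE) accepts (?<name>...); Python's re module needs (?P<name>...).
TRACEPARENT_RE = re.compile(
    r"^00-(?P<trace_id>[^-]+)-(?P<parent_id>[^-]+)-(?P<trace_flags>[0-9]+)$"
)

# Defaults copied from the "default" lines of the three maps.
DEFAULTS = {
    "ot_trace_id": "00000000000000000000000000000000",
    "ot_span_id": "0000000000000000",
    "ot_trace_flags": "00",
}


def map_traceparent(traceparent: str) -> dict:
    """Hypothetical helper: return what the three maps would yield for one value."""
    match = TRACEPARENT_RE.match(traceparent)
    if match is None:
        # No regex match: every map falls back to its default.
        return dict(DEFAULTS)
    return {
        "ot_trace_id": match.group("trace_id"),
        "ot_span_id": match.group("parent_id"),
        "ot_trace_flags": match.group("trace_flags"),
    }


if __name__ == "__main__":
    # Sample value in the "00-<trace id>-<parent id>-<flags>" layout (assumed input).
    sample = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(map_traceparent(sample))
    print(map_traceparent(""))  # empty or malformed input yields the defaults
```

An input that does not start with `00-`, or whose flags field contains non-digits, falls through to the defaults, which mirrors the `default` lines in the snippet.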

When it spins up, until the first Ingress is created, the opentelemetry_ variables are all unknown, resulting in the following error:

Error: exit status 1
2024/02/29 15:55:17 [emerg] 26#26: unknown "opentelemetry_context_traceparent" variable
nginx: [emerg] unknown "opentelemetry_context_traceparent" variable
nginx: configuration file /tmp/nginx/nginx-cfg966897373 test failed

What you expected to happen:

I would prefer that setting the following makes the opentelemetry_ variable available from the start, instead of waiting for the first Ingress:

opentelemetry:
  enabled: true

Alternatively, I tried to find a workaround that uses map to ignore the variable when it is unavailable, but could not find anything workable. I know that's likely outside the scope of this issue.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.9.6
  Build:         6a73aa3b05040a97ef8213675a16142a9c95952a
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Kubernetes version (use kubectl version):

(This is from testing locally with minikube)

Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.3

Environment:

❯ kubectl get nodes -o wide
NAME                              STATUS   ROLES   AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-default-vmss0004rv   Ready    agent   2d13h   v1.28.5   10.1.10.179   <none>        Ubuntu 22.04.3 LTS   5.15.0-1054-azure   containerd://1.7.7-1
aks-default-vmss0004tm   Ready    agent   2d13h   v1.28.5   10.1.1.45     <none>        Ubuntu 22.04.3 LTS   5.15.0-1054-azure   containerd://1.7.7-1

Manifests are rendered using helm template, either locally for testing or via GitLab CI, then pushed to a controller repo and consumed by Argo (or applied locally to minikube for testing).

Name:         nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=nginx-ingress
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.9.6
              helm.sh/chart=ingress-nginx-4.9.1
Annotations:  ingressclass.kubernetes.io/is-default-class: true
Controller:   k8s.io/ingress-nginx
Events:       <none>
NAMESPACE     NAME                                               READY   STATUS    RESTARTS        AGE   IP             NODE       NOMINATED NODE   READINESS GATES
ingress       pod/nginx-ingress-ingress-nginx-controller-4sdxg   0/1     Running   8 (5m38s ago)   18m   10.244.0.17    minikube   <none>           <none>
kube-system   pod/coredns-5dd5756b68-pkrjg                       1/1     Running   0               13d   10.244.0.2     minikube   <none>           <none>
kube-system   pod/etcd-minikube                                  1/1     Running   0               13d   192.168.49.2   minikube   <none>           <none>
kube-system   pod/kube-apiserver-minikube                        1/1     Running   0               13d   192.168.49.2   minikube   <none>           <none>
kube-system   pod/kube-controller-manager-minikube               1/1     Running   0               13d   192.168.49.2   minikube   <none>           <none>
kube-system   pod/kube-proxy-6rb6q                               1/1     Running   0               13d   192.168.49.2   minikube   <none>           <none>
kube-system   pod/kube-scheduler-minikube                        1/1     Running   0               13d   192.168.49.2   minikube   <none>           <none>
kube-system   pod/storage-provisioner                            1/1     Running   1 (13d ago)     13d   192.168.49.2   minikube   <none>           <none>

NAMESPACE     NAME                                                       TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
default       service/kubernetes                                         ClusterIP      10.96.0.1        <none>        443/TCP                      13d   <none>
ingress       service/nginx-ingress-ingress-nginx-controller             LoadBalancer   10.108.108.1     <pending>     80:30355/TCP,443:30322/TCP   8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
ingress       service/nginx-ingress-ingress-nginx-controller-admission   ClusterIP      10.100.100.183   <none>        443/TCP                      8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
ingress       service/nginx-ingress-ingress-nginx-controller-metrics     ClusterIP      10.103.113.179   <none>        10254/TCP                    8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
kube-system   service/kube-dns                                           ClusterIP      10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP       13d   k8s-app=kube-dns

NAMESPACE     NAME                                                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE   CONTAINERS   IMAGES                                                                                                                    SELECTOR
ingress       daemonset.apps/nginx-ingress-ingress-nginx-controller   1         1         0       1            0           kubernetes.io/os=linux   8d    controller   registry.k8s.io/ingress-nginx/controller:v1.9.6@sha256:1405cc613bd95b2c6edd8b2a152510ae91c7e62aea4698500d23b2145960ab9c   app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
kube-system   daemonset.apps/kube-proxy                               1         1         1       1            1           kubernetes.io/os=linux   13d   kube-proxy   registry.k8s.io/kube-proxy:v1.28.3                                                                                        k8s-app=kube-proxy

NAMESPACE     NAME                      READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                    SELECTOR
kube-system   deployment.apps/coredns   1/1     1            1           13d   coredns      registry.k8s.io/coredns/coredns:v1.10.1   k8s-app=kube-dns

NAMESPACE     NAME                                 DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                    SELECTOR
kube-system   replicaset.apps/coredns-5dd5756b68   1         1         1       13d   coredns      registry.k8s.io/coredns/coredns:v1.10.1   k8s-app=kube-dns,pod-template-hash=5dd5756b68

Warning  RELOAD  19m  nginx-ingress-controller  Error reloading NGINX:

Error: exit status 1
2024/02/29 15:55:14 [emerg] 25#25: unknown "opentelemetry_context_traceparent" variable
nginx: [emerg] unknown "opentelemetry_context_traceparent" variable
nginx: configuration file /tmp/nginx/nginx-cfg923001734 test failed

Warning  RELOAD  19m  nginx-ingress-controller  Error reloading NGINX:

Error: exit status 1
2024/02/29 15:55:17 [emerg] 26#26: unknown "opentelemetry_context_traceparent" variable
nginx: [emerg] unknown "opentelemetry_context_traceparent" variable
nginx: configuration file /tmp/nginx/nginx-cfg966897373 test failed


  - `kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>`

Name:                     nginx-ingress-ingress-nginx-controller
Namespace:                ingress
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=nginx-ingress
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.9.6
                          helm.sh/chart=ingress-nginx-4.9.1
Annotations:              service.beta.kubernetes.io/azure-load-balancer-resource-group: kubernetes-development
                          service.beta.kubernetes.io/azure-pip-name: nginx-ingress-pip
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.108.108.1
IPs:                      10.108.108.1
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30355/TCP
Endpoints:
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  30322/TCP
Endpoints:
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32131
Events:


- **Current state of ingress object, if applicable**:
  - `kubectl -n <appnamespace> get all,ing -o wide`

NAME                                               READY   STATUS    RESTARTS      AGE   IP            NODE       NOMINATED NODE   READINESS GATES
pod/nginx-ingress-ingress-nginx-controller-4sdxg   0/1     Running   9 (80s ago)   21m   10.244.0.17   minikube   <none>           <none>

NAME                                                       TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
service/nginx-ingress-ingress-nginx-controller             LoadBalancer   10.108.108.1     <pending>     80:30355/TCP,443:30322/TCP   8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
service/nginx-ingress-ingress-nginx-controller-admission   ClusterIP      10.100.100.183   <none>        443/TCP                      8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
service/nginx-ingress-ingress-nginx-controller-metrics     ClusterIP      10.103.113.179   <none>        10254/TCP                    8d    app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx

NAME                                                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE   CONTAINERS   IMAGES                                                                                                                    SELECTOR
daemonset.apps/nginx-ingress-ingress-nginx-controller   1         1         0       1            0           kubernetes.io/os=linux   8d    controller   registry.k8s.io/ingress-nginx/controller:v1.9.6@sha256:1405cc613bd95b2c6edd8b2a152510ae91c7e62aea4698500d23b2145960ab9c   app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx



How to reproduce this issue:

Install minikube locally. Set up the following Helm chart:

apiVersion: v2
name: my-nginx-ingress
description: Nginx Ingress
type: application
dependencies:

And set the following in values.yaml:

ingress-nginx:
  controller:
    ...
    opentelemetry:
      enabled: true

      # from: https://github.com/opentracing-contrib/nginx-opentracing/issues/33
      http-snippet: |
        map $opentelemetry_context_traceparent $ot_trace_id {
          default "00000000000000000000000000000000";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$trace_id";
        }
        map $opentelemetry_context_traceparent $ot_span_id {
          default "0000000000000000";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$parent_id";
        }
        map $opentelemetry_context_traceparent $ot_trace_flags {
          default "00";
          ~^00-(?<trace_id>[^-]+)-(?<parent_id>[^-]+)-(?<trace_flags>[0-9]+)$ "$trace_flags";
        }

Render it with helm template and then apply it with kubectl apply -f. When the pod comes up, run kubectl logs to see the errors above.

Anything else we need to know:

Not sure if the request is nonsensical, so open to a work-around to just... ignore the issue until an Ingress is created, but it's been a bit of an annoyance during testing.

k8s-ci-robot commented 8 months ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

longwuyuan commented 8 months ago

/assign @esigo

esigo commented 8 months ago

We can't set the context without a request, so it's meaningless to have the variable set. You'd need to find a workaround, maybe something similar to https://github.com/kubernetes/ingress-nginx/issues/9811#issuecomment-1586028266.

longwuyuan commented 8 months ago

/remove-kind bug
/kind support

github-actions[bot] commented 7 months ago

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or a request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.