traefik / traefik-helm-chart

Traefik Proxy Helm Chart
https://traefik.io
Apache License 2.0

Using Tailscale as certificate resolvers #980

Closed jpabbuehl closed 5 months ago

jpabbuehl commented 6 months ago

What version of the Traefik Helm Chart are you using?

26.0.0

What version of Traefik are you using?

v3.0.0-beta3

What did you do?

I copied this manifest into the directory that k3s watches for manifests, as /var/lib/rancher/k3s/server/manifests/traefik.yaml:

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: kube-system
spec:
  repo: https://traefik.github.io/charts
  chart: traefik
  version: 26.0.0
  set:
    global.systemDefaultRegistry: ""
  valuesContent: |-
    tag: "v3.0.0-beta3"
    experimental:
      plugins:
        sablier:
          moduleName: github.com/acouvreur/sablier
          version: v1.4.1-beta.3
    http3:
      enabled: true
    rbac:
      enabled: true
    ports:
      websecure:
        tls:
          enabled: true
        http3:
          enabled: true
    podAnnotations:
      prometheus.io/port: "8082"
      prometheus.io/scrape: "true"
    providers:
      kubernetesCRD:
        allowCrossNamespace: true
        allowExternalNameServices: true
      kubernetesIngress:
        publishedService:
          enabled: true
    priorityClassName: "system-cluster-critical"
    tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Exists"
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
    additionalArguments:
    - '--certificatesresolvers.myresolver.tailscale=true'

What did you see instead?

kubectl logs -n kube-system traefik-7b76d5b4ff-srg5v
2023/12/14 13:31:11 command traefik error: failed to decode configuration from flags: field not found, node: tailscale

What is your environment & configuration?

k3s v1.28.4+k3s2 on Ubuntu 22.04

Additional Information

I'm trying to replicate this blog post from Traefik: https://traefik.io/blog/exploring-the-tailscale-traefik-proxy-integration/

Is Tailscale supported by the Traefik Helm Chart? It doesn't seem so, judging by values.yaml:

https://github.com/traefik/traefik-helm-chart/blob/ed7e8bbdd56b852ef3163c579cbbba686a4f700c/traefik/values.yaml#L853

Thanks a lot in advance.

mloiseleur commented 6 months ago

As you noticed, Tailscale is supported only in Traefik v3.

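The chart reads the image tag from image.tag, so a top-level tag most likely has no effect and the chart's default image is still deployed, which would explain the "field not found" error. You should replace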

    tag: "v3.0.0-beta3"

with

    image:
      tag: "v3.0.0-beta3"

jpabbuehl commented 6 months ago

Thanks @mloiseleur

I've updated the chart accordingly (using the latest image to be ready for the upcoming API version naming changes) and mounted tailscaled.sock (an undocumented step, but apparently required; otherwise I get a file-not-found error).

Now I'm getting a permission denied error, and I'm unsure how to fix the permissions properly through the Helm chart's values.yaml.

Here's the log from the Traefik pod:

ERR github.com/traefik/traefik/v3/pkg/provider/tailscale/provider.go:249 > Unable to fetch certificate for domain "master.XXX.ts.net" error="Get \"http://local-tailscaled.sock/localapi/v0/cert/master.XXX.ts.net?type=pair\": dial unix /var/run/tailscale/tailscaled.sock: connect: permission denied" providerName=myresolver.tailscale

Here are the ownership and permissions of tailscaled.sock on the host:

ls -la /run/tailscale/tailscaled.sock
srw-rw-rw- 1 root root 0 Dec 17 08:23 /run/tailscale/tailscaled.sock

Here's the updated chart:

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: kube-system
spec:
  repo: https://traefik.github.io/charts
  chart: traefik
  version: 26.0.0
  set:
    global.systemDefaultRegistry: ""
  valuesContent: |-
    image:
      tag: "v3.0.0-beta5"
    deployment:
      additionalVolumes:
      - name: tailscale
        hostPath:
          path: /run/tailscale/tailscaled.sock
    additionalVolumeMounts:
    - name: tailscale
      mountPath: /var/run/tailscale/tailscaled.sock
    experimental:
      plugins:
        sablier:
          moduleName: github.com/acouvreur/sablier
          version: v1.5.0
    http3:
      enabled: true
    rbac:
      enabled: true
    ports:
      websecure:
        tls:
          enabled: true
        http3:
          enabled: true
    podAnnotations:
      prometheus.io/port: "8082"
      prometheus.io/scrape: "true"
    providers:
      kubernetesCRD:
        allowCrossNamespace: true
        allowExternalNameServices: true
      kubernetesIngress:
        publishedService:
          enabled: true
    priorityClassName: "system-cluster-critical"
    tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Exists"
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
    additionalArguments:
    - '--certificatesresolvers.myresolver.tailscale=true'
    logs:
      general:
        level: DEBUG

Here's my test setup:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
  namespace: tailscale
  labels:
    app: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          env:
            - name: WHOAMI_PORT_NUMBER
              value: "3000"
            - name: WHOAMI_NAME
              value: "testing"
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: whoami-svc
  namespace: tailscale
spec:
  selector:
    app: whoami
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000
  type: ClusterIP
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: whoami-tailscale-ingress
  namespace: tailscale
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`master.XXX.ts.net`)
      kind: Rule
      services:
        - name: whoami-svc
          port: 3000
  tls:
    certResolver: myresolver

I've tested the IngressRoute with the web entrypoint and without TLS, and I was able to access whoami in the browser at http://master.XXX.ts.net

Thanks in advance, JP

mloiseleur commented 6 months ago

The file /var/run/tailscale/tailscaled.sock is an IPC socket. It should not be created by the user; it should come from a Tailscale daemon. Sharing this socket is what connects Traefik to the Tailscale daemon.

By default, Traefik runs securely as a non-root user, so either an initContainer sets permissions on the socket that allow non-root access (recommended), or Traefik runs as the root user (insecure).

jpabbuehl commented 6 months ago

Thanks. The tailscaled.sock is already created by the Tailscale daemon.

I've updated the chart following your advice, and now I get an access denied error. It's hard to tell where the issue comes from (Traefik or Tailscale). I tried initializing /data/acme.json with persistence, but it didn't help. After a few failures, Traefik falls back to TRAEFIK DEFAULT CERT.

logs

2023-12-19T09:44:49Z ERR github.com/traefik/traefik/v3/pkg/provider/tailscale/provider.go:249 > Unable to fetch certificate for domain "master.XXX.ts.net" error="Access denied: cert access denied" providerName=myresolver.tailscale

updated chart

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: kube-system
spec:
  repo: https://traefik.github.io/charts
  chart: traefik
  version: 26.0.0
  set:
    global.systemDefaultRegistry: ""
  valuesContent: |-
    image:
      tag: "v3.0.0-beta5"
    deployment:
      additionalVolumes:
      - name: tailscale-ipc
        hostPath:
          path: /var/run/tailscale
      initContainers:
      - name: adjust-tailscale-socket-permissions
        image: busybox
        command: ["sh", "-c", "chmod 666 /var/run/tailscale/tailscaled.sock"]
        securityContext:
          runAsNonRoot: false
          runAsGroup: 0
          runAsUser: 0
        volumeMounts:
          - name: tailscale-ipc
            mountPath: /var/run/tailscale
    additionalVolumeMounts:
    - name: tailscale-ipc
      mountPath: /var/run/tailscale
    experimental:
      plugins:
        sablier:
          moduleName: github.com/acouvreur/sablier
          version: v1.5.0
    http3:
      enabled: true
    rbac:
      enabled: true
    ports:
      websecure:
        tls:
          enabled: true
        http3:
          enabled: true
    podAnnotations:
      prometheus.io/port: "8082"
      prometheus.io/scrape: "true"
    providers:
      kubernetesCRD:
        allowCrossNamespace: true
        allowExternalNameServices: true
      kubernetesIngress:
        publishedService:
          enabled: true
    priorityClassName: "system-cluster-critical"
    tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Exists"
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
    additionalArguments:
    - '--certificatesresolvers.myresolver.tailscale=true'
    logs:
      general:
        level: DEBUG

mloiseleur commented 6 months ago

This error message is triggered here in the Traefik code:

        cert, key, err := tscert.CertPair(ctx, domain)
        if err != nil {
            logger.Error().Err(err).Msgf("Unable to fetch certificate for domain %q", domain)
            continue
        }

It does not seem to have reached the point where Traefik tries to store the certificate. The error comes from the tscert library, so it may be an authorization issue on the Tailscale side.

mloiseleur commented 5 months ago

Without further information, I'm closing this issue. Feel free to re-open it if needed.

east4ming commented 1 week ago

My answer, which works fine (it took me half a day to find the right one 😂). The key is podSecurityContext and securityContext: make sure runAsUser is 0 (root):

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: traefik
spec:
  repo: https://traefik.github.io/charts
  chart: traefik
  targetNamespace: traefik
  valuesContent: |-
    namespaceOverride: traefik
    additionalArguments:
      - "--certificatesresolvers.tailscaleresolver.tailscale=true"
    experimental:
      kubernetesGateway:
        enabled: true
      namespacePolicy: All
      namespace: traefik
    deployment:
      additionalVolumes:
      - name: tailscale-socket
        hostPath:
          path: /run/tailscale/tailscaled.sock
          type: Socket
    additionalVolumeMounts:
    - name: tailscale-socket
      mountPath: /var/run/tailscale/tailscaled.sock
    ports:
      traefik:
        expose:
          default: true
      web:
        redirectTo:
          port: websecure
          priority: 10
          permanent: true
      websecure:
        tls:
          certResolver: "tailscaleresolver"
        http3:
          enabled: true
    tlsOptions:
      default:
        minVersion: VersionTLS13
    securityContext:
      runAsUser: 0
    podSecurityContext:
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
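
Running Traefik as root plausibly works because tailscaled only serves certificates over its local API to root or to the user named in its TS_PERMIT_CERT_UID setting. That would also account for the earlier "cert access denied" error. A non-root variant might look like the untested sketch below. It rests on several assumptions that are not confirmed anywhere in this thread: that the chart's default UID is 65532, that tailscaled on the host was started with TS_PERMIT_CERT_UID=65532 (for example via /etc/default/tailscaled), and that the socket's mode lets that UID connect.

```yaml
# Sketch only: chart values (to be placed under valuesContent) that keep Traefik non-root.
# Assumption: tailscaled on the host runs with TS_PERMIT_CERT_UID=65532.
# That is a host-side setting, not a chart value.
additionalArguments:
  - "--certificatesresolvers.tailscaleresolver.tailscale=true"
deployment:
  additionalVolumes:
    - name: tailscale-socket
      hostPath:
        path: /run/tailscale/tailscaled.sock
        type: Socket
additionalVolumeMounts:
  - name: tailscale-socket
    mountPath: /var/run/tailscale/tailscaled.sock
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65532   # assumed chart default; must match TS_PERMIT_CERT_UID on the host
  runAsGroup: 65532
```

If the pod still cannot connect to the socket, the permissions on the host socket, or on its parent directory when the whole directory is mounted, would be the next thing to check.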