amir20 / dozzle

Realtime log viewer for docker containers.
https://dozzle.dev/
MIT License
5.7k stars, 287 forks

How do I get the agent to talk to containerd.sock using tls certs? #3108

Closed amir20 closed 2 months ago

amir20 commented 2 months ago

I'm not able to get agents to work in my kubernetes (k8s) environment. I'm using docker with containerd=/run/containerd/containerd.sock with tls certs, hence docker.socket is not available. In my scenario all docker communication is done over a tls connection to the containerd.sock. How do I get the agent to talk to containerd.sock using tls certs?

Originally posted by @dhop90 in https://github.com/amir20/dozzle/issues/3066#issuecomment-2226260919

amir20 commented 2 months ago

@dhop90 As far as I know, k8s doesn't support the Docker API. containerd.sock has a different API, and you cannot use it as docker.sock.

K8s removed Docker as a runtime. For Dozzle to work, it would need to implement the k8s API.

dhop90 commented 2 months ago

Thanks, that's what I figured. Put me down as a use case that would like to keep using remote hosts, since the agent will not work in k8s.

amir20 commented 2 months ago

Wait, remote hosts work? How? They use the same API. This is news to me. Can you share your configuration?

dhop90 commented 2 months ago

I mount a directory with certs on the pod running Dozzle at /certs.

DOZZLE_REMOTE_HOST=tcp://qnap.domain.duckdns.org:2375,tcp://kube0.domain.duckdns.org:2376,tcp://mini.domain.duckdns.org:2376,tcp://kube1.domain.duckdns.org:2376,tcp://kube2.domain.duckdns.org:2376,tcp://kube3.domain.duckdns.org:2376,tcp://kube4.domain.duckdns.org:2376,tcp://kube5.domain.duckdns.org:2376,tcp://kube6.domain.duckdns.org:2376,tcp://kube7.domain.duckdns.org:2376,tcp://kube8.domain.duckdns.org:2376,tcp://kube9.domain.duckdns.org:2376,tcp://kube10.domain.duckdns.org:2376,tcp://kube11.domain.duckdns.org:2376,tcp://pi-dns.domain.duckdns.org:2376,tcp://pi-gway.domain.duckdns.org:2376,tcp://pi-pool.domain.duckdns.org:2376,tcp://pi-homebridge.domain.duckdns.org:2376,tcp://pi-zeek.domain.duckdns.org:2376,tcp://dell.domain.duckdns.org:2376,tcp://thinkcentre.domain.duckdns.org:2376

@kube1:~ $ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2024-05-05 02:17:25 CDT; 2 months 7 days ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 656 (dockerd)
      Tasks: 48
     Memory: 1.8G
        CPU: 1w 6d 7h 5min 11.345s
     CGroup: /system.slice/docker.service
             └─656 /usr/bin/dockerd -H fd:// -H tcp://0.0.0.0:2376 --containerd=/run/containerd/containerd.sock --tlsverify --tlscacert=/home/CA/ca.pem --tlscert=/home/CA/server-cert.pem --tlskey=/home/CA/server-key.pem

logs:

time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_PORT_80_TCP_ADDR"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_PORT"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_SERVICE_HOST"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_SERVICE_PORT_HTTP"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_PORT_80_TCP"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_PORT_80_TCP_PORT"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_SERVICE_PORT"
time="2024-07-12T19:45:20Z" level=warning msg="Unexpected environment variable DOZZLE_PORT_80_TCP_PROTO"
time="2024-07-12T19:45:20Z" level=info msg="Dozzle version v8.0.5"
time="2024-07-12T19:45:20Z" level=warning msg="Remote host flag is deprecated and will be removed in future versions. Agents will replace remote hosts as a safer and performant option. See https://github.com/amir20/dozzle/issues/3066 for discussion."
time="2024-07-12T19:45:20Z" level=info msg="Creating client for qnap.domain.duckdns.org with tcp://qnap.domain.duckdns.org:2375"
time="2024-07-12T19:45:20Z" level=error msg="unable to get docker info: error during connect: Get \"https://qnap.domain.duckdns.org:2375/v1.46/info\": http: server gave HTTP response to HTTPS client"
time="2024-07-12T19:45:20Z" level=warning msg="Could not connect to remote host tcp:qnap.domain.duckdns.org:2375: error during connect: Get \"https://qnap.domain.duckdns.org:2375/v1.46/containers/json?all=1\": http: server gave HTTP response to HTTPS client"
time="2024-07-12T19:45:20Z" level=info msg="Creating client for kube0.domain.duckdns.org with tcp://kube0.domain.duckdns.org:2376"
time="2024-07-12T19:45:20Z" level=info msg="Creating client for mini.domain.duckdns.org with tcp://mini.domain.duckdns.org:2376"
time="2024-07-12T19:45:21Z" level=info msg="Creating client for kube1.domain.duckdns.org with tcp://kube1.domain.duckdns.org:2376"
time="2024-07-12T19:45:24Z" level=info msg="Creating client for kube2.domain.duckdns.org with tcp://kube2.domain.duckdns.org:2376"
time="2024-07-12T19:45:25Z" level=info msg="Creating client for kube3.domain.duckdns.org with tcp://kube3.domain.duckdns.org:2376"
time="2024-07-12T19:45:26Z" level=info msg="Creating client for kube4.domain.duckdns.org with tcp://kube4.domain.duckdns.org:2376"
time="2024-07-12T19:45:26Z" level=info msg="Creating client for kube5.domain.duckdns.org with tcp://kube5.domain.duckdns.org:2376"
time="2024-07-12T19:45:28Z" level=info msg="Creating client for kube6.domain.duckdns.org with tcp://kube6.domain.duckdns.org:2376"
time="2024-07-12T19:45:31Z" level=info msg="Creating client for kube7.domain.duckdns.org with tcp://kube7.domain.duckdns.org:2376"
time="2024-07-12T19:45:31Z" level=info msg="Creating client for kube8.domain.duckdns.org with tcp://kube8.domain.duckdns.org:2376"
time="2024-07-12T19:45:34Z" level=info msg="Creating client for kube9.domain.duckdns.org with tcp://kube9.domain.duckdns.org:2376"
time="2024-07-12T19:45:37Z" level=info msg="Creating client for kube10.domain.duckdns.org with tcp://kube10.domain.duckdns.org:2376"
time="2024-07-12T19:45:38Z" level=info msg="Creating client for kube11.domain.duckdns.org with tcp://kube11.domain.duckdns.org:2376"

amir20 commented 2 months ago

Ah, nice! So you are not using docker-socket-proxy. What you did should actually work, since DOCKER_HOST is still implemented in agent mode. But I haven't documented it because I didn't think anybody would use it.

I need to do a little testing to update k8s support.

I am just a little confused, because based on your configuration containerd.sock is compatible. I was under the impression this wasn't true. Is that right? Did you enable something special?

Is this true? Does containerd.sock just work with the Docker API?

If true, then you want to use swarm mode and point DOCKER_HOST at it.

dhop90 commented 2 months ago

It would seem that containerd.sock works with the Docker API; I did not enable anything special. I did install cri-dockerd (https://github.com/Mirantis/cri-dockerd). This adapter provides a shim for Docker Engine that lets you control Docker via the Kubernetes Container Runtime Interface.
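
(Editor's note: for context, cri-dockerd works by exposing a CRI socket that the kubelet is pointed at, while dockerd keeps serving the regular Docker API. A rough sketch of the wiring; the socket path is cri-dockerd's documented default, not taken from this thread:)

```shell
# kubelet speaks CRI to cri-dockerd, which forwards to Docker Engine.
# The Docker API itself remains available on docker.sock / tcp:2376,
# which is why Dozzle can still connect to these nodes.
kubelet --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock
```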

I'll try using swarm mode.

amir20 commented 2 months ago

I use FromEnv, which supports these environment variables:

// FromEnv uses the following environment variables:
//
//   - DOCKER_HOST ([EnvOverrideHost]) to set the URL to the docker server.
//   - DOCKER_API_VERSION ([EnvOverrideAPIVersion]) to set the version of the
//     API to use, leave empty for latest.
//   - DOCKER_CERT_PATH ([EnvOverrideCertPath]) to specify the directory from
//     which to load the TLS certificates ("ca.pem", "cert.pem", "key.pem").
//   - DOCKER_TLS_VERIFY ([EnvTLSVerify]) to enable or disable TLS verification

You can try setting up one agent with those env variables. You would probably have to mount the certs. If that works, then in theory swarm should work. This would actually give you a significant performance boost, as the agents would handle all the heavy work in swarm mode. And inter-agent communication is all protobuf, which is much faster than JSON.
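
(Editor's note: as an untested sketch, using the hostnames and cert paths from this thread, a single agent container configured through those variables might look like the fragment below. Note that FromEnv expects the cert directory to contain files named ca.pem, cert.pem, and key.pem, and it treats any non-empty DOCKER_TLS_VERIFY as "on", so leave the variable unset rather than setting it to "false" if you want verification disabled.)

```yaml
# Sketch only: one Dozzle agent reaching a remote TLS Docker endpoint
# via the standard Docker client environment variables.
containers:
  - name: dozzle-agent
    image: amir20/dozzle:latest
    args: ["agent"]
    env:
      - name: DOCKER_HOST
        value: tcp://kube1.domain.duckdns.org:2376
      - name: DOCKER_CERT_PATH
        value: /certs          # must contain ca.pem, cert.pem, key.pem
      - name: DOCKER_TLS_VERIFY
        value: "1"
    ports:
      - containerPort: 7007    # agent gRPC port
    volumeMounts:
      - name: certs
        mountPath: /certs
        readOnly: true
```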

I am still reading about Docker Engine in k8s. I thought Docker was removed from k8s, so I am a little confused about what is happening. I'll have to set up k8s locally to really understand how it works.

I won't be able to look at it much today, but if k8s works with DOCKER_HOST, then that's pretty exciting. I would be able to support k8s natively in swarm mode.

dhop90 commented 2 months ago

When creating a single deployment pod using the following env variables:

        - name: DOCKER_HOST
          value: tcp://kube8.domain.duckdns.org:2376
        - name: DOCKER_CERT_PATH
          value: /certs
        - name: DOCKER_TLS_VERIFY
          value: 'false'
        - name: DOZZLE_MODE
          value: swarm
        - name: DOZZLE_LEVEL
          value: debug

I get the following log messages:

time="2024-07-12T22:25:06Z" level=info msg="Dozzle version v8.0.5"
time="2024-07-12T22:25:06Z" level=debug msg="filterArgs = {map[]}"
time="2024-07-12T22:25:07Z" level=debug msg="Creating a client with host: ID: dcbe52b3-46d8-4ef9-9a3e-6d42295b6835, Endpoint: local"
time="2024-07-12T22:25:07Z" level=info msg="Starting in Swarm mode"
time="2024-07-12T22:25:07Z" level=debug msg="subscribing to docker events from container store ID: dcbe52b3-46d8-4ef9-9a3e-6d42295b6835, Endpoint: local"
time="2024-07-12T22:25:07Z" level=info msg="gRPC server listening on [::]:7007"
time="2024-07-12T22:25:07Z" level=debug msg="subscribing to docker events from container store ID: dcbe52b3-46d8-4ef9-9a3e-6d42295b6835, Endpoint: local"
time="2024-07-12T22:25:07Z" level=info msg="Accepting connections on :8080"
time="2024-07-12T22:25:07Z" level=fatal msg="error looking up swarm services: lookup tasks.dozzle on 10.96.0.10:53: no such host"

The above log messages repeat.

What/where is tasks.dozzle configured?

I created a Dozzle service using this manifest:

apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: dozzle
  name: dozzle
  namespace: dozzle
spec:
  ports:

Any thoughts?

I also tried creating pods using a DaemonSet, but I haven't figured out how to set DOCKER_HOST per pod.

using these env variables:

I'm getting this error:

time="2024-07-12T22:43:40Z" level=fatal msg="Could not connect to local Docker Engine: error during connect: Get \"https://MY_HOSTNAME:2376/v1.46/info\": dial tcp: lookup MY_HOSTNAME on 10.96.0.10:53: no such host"

amir20 commented 2 months ago

tasks.dozzle

tasks.dozzle is how Docker Swarm sets up cluster DNS. That's how I discover all the other instances. I think this would have to change for k8s.
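
(Editor's note: the closest k8s analogue to Swarm's tasks.<service> records is a headless Service, which publishes a DNS record per pod under the service name. A hypothetical sketch, with names invented for illustration:)

```yaml
# Hypothetical: a headless Service (clusterIP: None) makes agent pods
# discoverable by DNS, similar in spirit to Swarm's tasks.dozzle entry.
apiVersion: v1
kind: Service
metadata:
  name: tasks-dozzle
  namespace: dozzle
spec:
  clusterIP: None            # headless: DNS resolves to pod IPs directly
  selector:
    io.kompose.service: dozzle-agent
  ports:
    - port: 7007
      targetPort: 7007
```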

From my earlier comment, I meant that you should first create an agent with DOCKER_HOST and then connect to it using a regular Dozzle UI.

At the moment, I am trying to set up k8s to play around with it, but it's no easy task.

I am going to try k3s because it seems like it has containerd out of the box. Once that happens, I can try to help you more.

I am out of the office next week so this will have to wait a little. :)

But based on what you found, I would imagine UI --> multiple agents should work if you use DOCKER_HOST on each agent. 🎁
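
(Editor's note: as a sketch of that topology, the UI side lists each agent explicitly via DOZZLE_REMOTE_AGENT, the variable dhop90 uses in the final manifest below; hostnames here are placeholders:)

```yaml
# Sketch: Dozzle UI connecting to manually-listed agents.
services:
  dozzle:
    image: amir20/dozzle:latest
    ports:
      - 8080:8080
    environment:
      - DOZZLE_REMOTE_AGENT=kube1.domain.duckdns.org:7007,kube2.domain.duckdns.org:7007
```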

amir20 commented 2 months ago

I am reading more about cri-dockerd. It sounds like this brings Docker back as a runtime. Is that right? Honestly, I am having a hard time setting up an environment to test k8s with containerd. Based on your output above, you are running Docker, which is how Dozzle is working. But I am not sure this would work with containerd out of the box.

amir20 commented 2 months ago

Hey, after learning a little more, I think you are overcomplicating things. With cri-dockerd, you should just be able to point to your /run/containerd/containerd.sock. containerd.sock is the same as docker.sock.

I don't know k8s, but you just want to mount /run/containerd/containerd.sock as /var/run/docker.sock and you don't need any TLS.

So something like docker run -v /run/containerd/containerd.sock:/var/run/docker.sock -p 7007:7007 amir20/dozzle:latest agent should theoretically work.

I don't think swarm mode will work for you, because k8s doesn't create the DNS entries. Just create all the agents manually for now.

Maybe in the future I can figure out how to do a DOZZLE_MODE=k8s-cri-dockerd mode, which would query k8s for the other IP addresses. It shouldn't be that much work.

I am off for the day! Let me know if it works.

amir20 commented 2 months ago

I'm not an expert, but I was able to get this working in k8s with an agent. This only works with the Docker runtime, of course.

My deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dozzle
  labels:
    app: dozzle
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dozzle
  template:
    metadata:
      labels:
        app: dozzle
    spec:
      containers:
        - name: dozzle
          image: amir20/dozzle:latest
          command: ["/dozzle"]
          args: ["agent"]
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: docker-socket
              mountPath: /var/run/docker.sock
      volumes:
        - name: docker-socket
          hostPath:
            path: /var/run/docker.sock
            type: Socket
---
apiVersion: v1
kind: Service
metadata:
  name: dozzle-service
  labels:
    app: dozzle-service
spec:
  type: LoadBalancer
  selector:
    app: dozzle
  ports:
    - protocol: TCP
      port: 7007
      targetPort: 7007

Then I started Dozzle outside of k8s, in Docker, using docker run -p 8081:8080 amir20/dozzle:latest --remote-agent k3s.orb.local:7007. Dozzle came up successfully and shows all logs via the agent.

/var/run/docker.sock is where Docker is running.
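
(Editor's note: one quick way to confirm that a mounted socket actually speaks the Docker API, a generic check not specific to Dozzle:)

```shell
# If this returns JSON containing a "Version" field, the socket is
# Docker-API compatible and Dozzle's agent should be able to use it.
curl --unix-socket /var/run/docker.sock http://localhost/version
```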

Closing this, @dhop90. I will eventually also support swarm mode in k8s if it's easy.

dhop90 commented 2 months ago

@amir20, you were correct, I was making it more complicated than it needed to be. I got it working with these manifests:

------
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dozzle
  namespace: dozzle
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: dozzle
  strategy:
    type: Recreate
  template:
    spec:
      containers:
        - env:
            - name: wud.watch
              value: "false"
            - name: DOZZLE_LEVEL
              value: debug
            - name: DOZZLE_REMOTE_AGENT
              value: kube0:7007,kube1:7007,kube2:7007,kube3:7007,kube4:7007,kube5:7007,kube6:7007,kube7:7007,kube8:7007,kube9:7007,kube10:7007,kube11:7007,dell:7007,mini:7007,pi-zeek:7007,pi-dns:7007,pi-pool:7007,pi-homebridge:7007,pi-gway:7007,thinkcentre:7007,pi-dns:7007
          image: amir20/dozzle:v8.0.5
          imagePullPolicy: IfNotPresent
          name: dozzle
          ports:
            - containerPort: 8080
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /var/run/docker.sock
              name: dockersock
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - hostPath:
            path: /run/containerd/containerd.sock
          name: dockersock
------
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    kompose.cmd: ../manifests/kompose/kompose convert --controller daemonSet
    kompose.version: 1.33.0 (3ce457399)
  labels:
    io.kompose.service: dozzle
  name: dozzle-agent
  namespace: dozzle
spec:
  selector:
    matchLabels:
      io.kompose.service: dozzle-agent
  template:
    metadata:
      labels:
        io.kompose.network/agent-default: "true"
        io.kompose.service: dozzle-agent
    spec:
      containers:
        - args:
            - agent
          image: amir20/dozzle:v8.0.5
          name: dozzle-agent
          ports:
            - containerPort: 7007
              hostPort: 7007
              protocol: TCP
          volumeMounts:
            - mountPath: /var/run/docker.sock
              name: dockersock
      restartPolicy: Always
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersock
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule

amir20 commented 2 months ago

🚀 I am astonished by how much configuration there is in k8s. If you get a chance, I think an addition to the docs would be amazing.