netbirdio / netbird

Connect your devices into a secure WireGuard®-based overlay network with SSO, MFA and granular access controls.
https://netbird.io
BSD 3-Clause "New" or "Revised" License

Kubernetes setup? #223

Open KlavsKlavsen opened 2 years ago

KlavsKlavsen commented 2 years ago

I understand there should be Docker images of the latest release. Do you also have an example of how this can be run on a Kubernetes cluster?

braginini commented 2 years ago

Hey @KlavsKlavsen, we will prepare an example. I think we could share it by the end of next week. Sounds good?

KlavsKlavsen commented 2 years ago

Fantastic - we'll gladly test and give feedback/recommendations :) We would be fine with just a Deployment, and we would prefer that no PVC is needed - i.e. WireGuard information should be in a Secret (if sensitive) and the rest in a ConfigMap. If we get it to work, we'll gladly write a Helm chart to make it easier to install, for you to update, and for users to ensure they are up to date - and I can show you how to set it up so your GitHub repo can also act as a Helm repo if you like.

josepowera commented 2 years ago

@braginini Is the Kubernetes example available (could you help with where to find it)?

braginini commented 2 years ago

Hey @josepowera, we don't have any Kubernetes examples yet; this is on our to-do list. What is your use case? We are happy to discuss it in Slack. FYI: NetBird can already run in Docker, see the docs.

Slyke commented 2 years ago

Once I set up my multi-master environment I can provide a basic example for a Kubernetes setup. It'll just be a YAML file, no Helm charts or anything. Currently dealing with containerd issues on the latest version of Ubuntu and v1.23.4 of K8s.

In the meantime, if you can provide a docker-compose file with example variables for storage and env vars, I will convert it as soon as I'm able to test.

luafanti commented 2 years ago

@Slyke did you manage to deal with it? Could you share your configuration for k8s?

Slyke commented 2 years ago

I managed to get my K8s cluster back up and running. I haven't tried NetBird yet, but if you have a docker-compose file for it I can attempt it. I couldn't find one in the GitHub repo; it looks like it's generated on the fly.

stobias123 commented 1 year ago

For any who find this thread after me, this worked fine in k8s.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: netbird
spec:
  selector:
    matchLabels:
      app: netbird
  replicas: 1
  template:
    metadata:
      labels:
        app: netbird
    spec:
      containers:
        - name: netbird
          image: netbirdio/netbird:latest
          env:
            - name: NB_SETUP_KEY
              value: fooo-reusable-key
          volumeMounts:
            - name: netbird-client
              mountPath: /etc/netbird
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
            runAsUser: 0
            runAsGroup: 0
            capabilities:
              add:
                - NET_ADMIN
      volumes:
        - name: netbird-client
          emptyDir: {}

Slyke commented 1 year ago

Hey @stobias123, you might want to update your YAML to this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: netbird
spec:
  selector:
    matchLabels:
      app: netbird
  replicas: 1
  template:
    metadata:
      labels:
        app: netbird
    spec:
      containers:
        - name: netbird
          image: netbirdio/netbird:0.12.0 # <--- Changed, current version.
          imagePullPolicy: IfNotPresent # <--- Changed
          env:
            - name: NB_SETUP_KEY
              value: fooo-reusable-key
          volumeMounts:
            - name: netbird-client
              mountPath: /etc/netbird
          resources:
            requests: # <--- Changed
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
            runAsUser: 0
            runAsGroup: 0
            capabilities:
              add:
                - NET_ADMIN
      volumes:
        - name: netbird-client
          emptyDir: {}

Unless you have a specific need, you should use requests instead of limits.

And the reason for pinning a specific Docker version is so that your setup won't break if the developers release a new version on Docker Hub that is not compatible with previous versions.
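
One more hedged refinement: since the setup key is a credential, it is better kept out of the manifest. A minimal sketch using a Secret (the Secret name here is illustrative, not from this thread):

```
apiVersion: v1
kind: Secret
metadata:
  name: netbird-setup-key # hypothetical name
type: Opaque
stringData:
  NB_SETUP_KEY: fooo-reusable-key # placeholder key
```

and in the container spec:

```
env:
  - name: NB_SETUP_KEY
    valueFrom:
      secretKeyRef:
        name: netbird-setup-key # must match the Secret above
        key: NB_SETUP_KEY
```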

MohammedNoureldin commented 1 year ago

> For any who find this thread after me, this worked fine in k8s.
>
> [...]

Is this about running a netbird client or server?

ashish1099 commented 1 year ago

> For any who find this thread after me, this worked fine in k8s. [...]
>
> Is this about running a netbird client or server?

This is for the client, not the server.

MohammedNoureldin commented 1 year ago

Hi @braginini, is there any work underway to provide an official Helm chart (for the NetBird server) with some docs?

I may work on writing a Helm chart if any of the developers, or at least anyone who is a bit familiar with NetBird, can work with me on it. At least just to get the basics working; then I can continue on my own. I should be able to provide it within a few days if anyone can support me. Anybody interested, please let me know.

KlavsKlavsen commented 1 year ago

I'd suggest a netbird-client and a netbird-server Helm chart. We only use netbird-client to connect k8s to our VPN network - the server is placed on a separate VM, as we need it to have access to recover the k8s clusters (we cannot recover a cluster if it runs the NetBird server and we thus cannot access the internal network while it's down :) Others may have other use cases, of course - but we'll gladly submit a netbird-client Helm chart if this project wants to merge it.

MohammedNoureldin commented 1 year ago

Why do all examples here run the container as privileged?

TJbredow commented 1 year ago

> Why do all examples here run the container as privileged?

Building WireGuard interfaces and mutating the kernel routing table generally requires root permissions, and this deployment is ultimately building an interface on the host machine. The issue here is not simply getting it running, but having it provide more use than just a connection - and the end goal varies based on your CNI configuration. For a server implementation, I would suggest compiling a list of supported CNI providers and building default functionality, such as IP forwarding and advertisement of the Pod IP pools or Service CIDR, etc. To be honest, NetBird has 90% of the functionality a CNI provides, if you don't mind the cryptographic overhead between K8s nodes.
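
If you want to experiment with narrowing this down, here is a hedged sketch of a less-privileged securityContext. Whether it suffices depends on the host kernel shipping the WireGuard module (otherwise the client falls back to a userspace implementation that needs access to /dev/net/tun), so treat it as a starting point, not a verified configuration:

```
securityContext:
  privileged: false # may or may not be droppable, depending on the host kernel
  runAsUser: 0
  runAsGroup: 0
  capabilities:
    add:
      - NET_ADMIN # needed to create interfaces and mutate routes
```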

Eisaichen commented 11 months ago

This is the YAML file I'm using for the server.

A limitation is that k8s cannot expose a range of ports, so the coturn server has to use the host network, and you had better set the IP address for use in turnserver.conf. Other than that, it worked very well for me with Traefik and Zitadel.

You basically only need one folder and two config files to run the NetBird server: an empty folder or PVC for persistent data storage, plus management.json and turnserver.conf - you can find those in /infrastructure_files.

I recommend un-commenting "no-tcp" in turnserver.conf (Line#388). I do not recommend running the clients on k8s, because the k8s cluster network is very much not defended; opening a portal inside may not be ideal.

netbird.yaml

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: netbird
  labels:
    app: netbird
    app.kubernetes.io/name: netbird
spec:
  selector:
    matchLabels:
      app: netbird
  replicas: 1
  template:
    metadata:
      labels:
        app: netbird
    spec:
      containers:
        - name: netbird-front
          image: docker.io/wiretrustee/dashboard:latest
          imagePullPolicy: "Always"
          ports:
            - containerPort: 80
              name: front-http
          envFrom:
            - configMapRef:
                name: netbird
          livenessProbe:
            tcpSocket:
              port: front-http
            initialDelaySeconds: 20
            periodSeconds: 15
            timeoutSeconds: 5
            failureThreshold: 5
        - name: netbird-back
          image: docker.io/netbirdio/management:latest
          imagePullPolicy: "Always"
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
          ports:
            - containerPort: 8180
              name: back-http
          args: [
            "--port", "8180",
            "--log-file", "console",
            "--disable-anonymous-metrics=false",
            "--single-account-mode-domain=example.netbird",
            "--dns-domain=example.netbird"
          ]
          volumeMounts:
            - name: conf
              mountPath: /etc/netbird/management.json
            - name: data
              mountPath: /var/lib/netbird
          livenessProbe:
            tcpSocket:
              port: back-http
            initialDelaySeconds: 20
            periodSeconds: 15
            timeoutSeconds: 5
            failureThreshold: 5
      enableServiceLinks: false
      volumes:
        - name: conf
          hostPath:
            path: /srv/containers/netbird/management.json
            type: File
        - name: data
          hostPath:
            path: /srv/containers/netbird/data
            type: DirectoryOrCreate
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: netbird
data:
  NGINX_SSL_PORT: "8043"
  NETBIRD_MGMT_API_ENDPOINT: "https://vpn.example.com"
  NETBIRD_MGMT_GRPC_API_ENDPOINT: "https://vpn.example.com"
  AUTH_AUDIENCE: "*********@netbird"
  AUTH_CLIENT_ID: "*********@netbird"
  AUTH_CLIENT_SECRET:
  AUTH_AUTHORITY: "https://login.example.com"
  USE_AUTH0: "false"
  AUTH_SUPPORTED_SCOPES: "openid profile email offline_access api"
  AUTH_REDIRECT_URI: "/auth"
  AUTH_SILENT_REDIRECT_URI: "/silent-auth"
  NETBIRD_TOKEN_SOURCE: "accessToken"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: netbird-signal
  labels:
    app: netbird-signal
    app.kubernetes.io/name: netbird-signal
spec:
  selector:
    matchLabels:
      app: netbird-signal
  replicas: 1
  template:
    metadata:
      labels:
        app: netbird-signal
    spec:
      containers:
        - name: signal
          image: docker.io/netbirdio/signal:latest
          imagePullPolicy: "Always"
          ports:
            - containerPort: 80
              name: signal-http
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
          args: ["--log-file", "console"]
          volumeMounts:
            - name: data
              mountPath: /var/lib/netbird
      enableServiceLinks: false
      volumes:
        - name: data
          hostPath:
            path: /srv/containers/netbird/signal
            type: DirectoryOrCreate
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: netbird-coturn
  labels:
    app: netbird-coturn
    app.kubernetes.io/name: netbird-coturn
spec:
  selector:
    matchLabels:
      app: netbird-coturn
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: netbird-coturn
    spec:
      hostNetwork: true
      containers:
        - name: coturn
          image: docker.io/coturn/coturn
          imagePullPolicy: "Always"
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
          args:
            - "-c"
            - "/etc/turnserver.conf"
            - "--log-file=stdout"
          volumeMounts:
            - name: conf
              mountPath: /etc/turnserver.conf
              readOnly: true
      enableServiceLinks: false
      volumes:
        - name: conf
          hostPath:
            path: /srv/containers/netbird/turnserver.conf
            type: FileOrCreate
---
apiVersion: v1
kind: Service
metadata:
  name: netbird
spec:
  selector:
    app: netbird
  ports:
    - name: front-http
      port: 80
      targetPort: front-http
      protocol: TCP
    - name: back-http
      port: 8080
      targetPort: back-http
      protocol: TCP
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  name: netbird-signal
spec:
  selector:
    app: netbird-signal
  ports:
    - name: signal-http
      port: 80
      targetPort: signal-http
      protocol: TCP
  type: ClusterIP
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: netbird
  labels:
    app: netbird
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`vpn.example.com`) && PathPrefix(`/`)
      kind: Rule
      middlewares:
        - name: hsts-header
      services:
        - name: netbird
          port: front-http
          scheme: http
    - match: Host(`vpn.example.com`) && PathPrefix(`/signalexchange.SignalExchange/`)
      kind: Rule
      middlewares:
        - name: hsts-header
      services:
        - name: netbird-signal
          port: signal-http
          scheme: h2c
    - match: Host(`vpn.example.com`) && PathPrefix(`/api`)
      kind: Rule
      middlewares:
        - name: hsts-header
      services:
        - name: netbird
          port: back-http
          scheme: http
    - match: Host(`vpn.example.com`) && PathPrefix(`/management.ManagementService/`)
      kind: Rule
      middlewares:
        - name: hsts-header
      services:
        - name: netbird
          port: back-http
          scheme: h2c
  tls:
    secretName: vpn.example.com
```
axlroden commented 8 months ago

I assume an operator would be needed to do any kind of HA easily.

KlavsKlavsen commented 8 months ago

An operator won't work - you'll need a pod on each node in the cluster if you want to connect it to the WireGuard VPN. So either a CNI extension or a DaemonSet, I'd say.
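
To make the DaemonSet idea concrete, here is a minimal sketch adapted from the client Deployment earlier in this thread (the setup key and version are placeholders):

```
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: netbird-client
spec:
  selector:
    matchLabels:
      app: netbird-client
  template:
    metadata:
      labels:
        app: netbird-client
    spec:
      containers:
        - name: netbird
          image: netbirdio/netbird:0.12.0 # pin a known-good version
          env:
            - name: NB_SETUP_KEY
              value: fooo-reusable-key # placeholder
          securityContext:
            privileged: true
            capabilities:
              add:
                - NET_ADMIN
          volumeMounts:
            - name: netbird-client
              mountPath: /etc/netbird
      volumes:
        - name: netbird-client
          emptyDir: {}
```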

gecube commented 6 months ago

Hi! I would also be happy to get some k8s-native way of installation. Everybody would benefit from it. Why is it important to me? Because we don't want to have dedicated EC2 instances, but rather put all compute into a large k8s cluster and get a unified approach to infra management - and yes, in that case I would be able to dedicate a particular node to the netbird project, with a static IP if necessary.

marcportabellaclotet-mt commented 4 months ago

I was able to make the whole setup work in Kubernetes in HA. I am using these Helm charts. Instead of coturn, I am using stunner, which works very well and is built to run on Kubernetes. My setup is in EKS, and stunner only needs to expose UDP port 443.

braginini commented 4 months ago

That’s cool, @marcportabellaclotet-mt !

How did you handle the storage part, in a few words :)

Zaunei commented 4 months ago

I have a similar setup in operation (also with stunner) and I would also like to help simplify operation under Kubernetes if desired.

What would make operation in Kubernetes much easier would be if the management.json file also worked read-only, which is currently not the case.

Then, with management.json read-only, you could mount Kubernetes Secrets as a volume, which would eliminate the need to provision the file before running the container, and changes to the file could more easily trigger a container restart. Currently, you need some sort of provisioning with an initContainer, like @marcportabellaclotet-mt did with vals.

mlsmaycon commented 4 months ago

Hello @Zaunei you can avoid the management.json rewriting by generating the key with openssl:

openssl rand -base64 32

then you can just add it to the management.json with:

"DataStoreEncryptionKey": "NEWKEY",

marcportabellaclotet-mt commented 4 months ago

I agree, @Zaunei - having the management.json file static will help make the setup easier.

Also, it would be great if the config for the Postgres database were not managed via env vars, because that creates two sources of configuration: env vars and the management.json file.

I think it would be great to add a new key in management.json

      "StoreConfig": {
        "Engine": "postgres",
        "DSN": "xxxx"
      }
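
For reference, the env-var route being criticized looks roughly like this; the variable name is taken from the NetBird self-hosting docs as I remember them, so verify it before relying on it:

```
env:
  - name: NETBIRD_STORE_ENGINE_POSTGRES_DSN # per NetBird docs, if memory serves
    valueFrom:
      secretKeyRef:
        name: netbird-postgres # hypothetical Secret
        key: dsn
```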

KlavsKlavsen commented 4 months ago

> That’s cool, @marcportabellaclotet-mt !
>
> How did you handle the storage part, in a few words :)

Just jumping in here - it's not quite clear what you're asking (maybe there's some context not in the issue, or that I'm missing) - but the chart has a management PVC (storage) that it mounts under /var/lib/netbird - https://github.com/totmicro/helms/blob/7b1074adcb9844feb458834133e586dc1bee83b6/charts/netbird/templates/management-deployment.yaml#L105

marcportabellaclotet-mt commented 4 months ago

As @KlavsKlavsen points out, if you would like to use the local DB, then you need to enable the PVC. If you want to use the Postgres store, then you do not need to enable the PVC - just configure the DSN via environment variables. As I pointed out in my previous comment, it would be great to define the DSN in the management.json file to avoid a double config source.

marcportabellaclotet-mt commented 4 months ago

> Then, with management.json read-only, you could mount Kubernetes Secrets as a volume [...] Currently, you need some sort of provisioning with an initContainer, like @marcportabellaclotet-mt did with vals.

One advantage of using vals to render the image is that it eliminates the need to handle the entire management.json file as a Kubernetes secret, simplifying configuration changes. With vals, the management.json file can be securely stored on GitHub, making it easier to review changes.

KlavsKlavsen commented 4 months ago

I would really recommend using a pg operator. I plan to add the above chart to our KubeAid (open source) project to make NetBird easier to "just set up" (for ourselves as well :) We usually extend charts with a template for operator-managed PostgreSQL - greatly simplifying backup, HA setups, etc. - like this: https://github.com/Obmondo/kubeaid/blob/master/argocd-helm-charts/keycloakx/templates/postgresql-cnpg.yaml Adding that as an option in the chart would be a good idea. We'll gladly submit a PR for it (as we try to do for all improvements we make to other open source projects).
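
For anyone unfamiliar with the pattern, an operator-managed database boils down to one small custom resource; a minimal CloudNativePG sketch (name and size are placeholders):

```
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: netbird-postgresql # hypothetical
spec:
  instances: 2 # primary plus one replica for HA
  storage:
    size: 5Gi
```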

marcportabellaclotet-mt commented 4 months ago

My NetBird chart has some little issues with grpc, which I will fix this week. It also lacks proper documentation, as I wrote it just for testing the setup. I would like the NetBird maintainers to accept a PR on their repo to add an "official" Helm chart, but if this is not possible, I am happy for you to improve my basic chart, add extra options, and host it in your KubeAid project.

KlavsKlavsen commented 4 months ago

@marcportabellaclotet-mt no - you're misunderstanding. We don't host charts there, we only mirror them. It includes a script to pull the latest charts for all projects automatically. The point is supply chain security: to have that, you need to be able to review changes in YOUR operations environment. So we simply pull updated charts from upstream often and review the changes for anything security-related or odd before we merge - and users can see the same (as the project's releases are simply this git repo, of which each user has their own fork).

We use the "charts umbrella pattern" - so we can extend charts (adding support for firewall policies, operators, etc.) and then work on upstreaming the improvements if the upstream project wants them.

I've spent 20+ years building operational setups for large companies, and they're 90-95% the exact same setup - but no one collaborates, so every time my work could not be re-used and I had to start over with the next customer (as a consultant). I got tired of that, so I invented KubeAid (and LinuxAid for the same with Puppet and Linux servers) so we can share the 90%+ of the work we do for customers as open source, which benefits everyone and enables us to cost-share improvements. Win-win, and a much more fun job that way :)

We had used an OCI repo for chart caching/mirroring, but that did not provide a good way to diff - so for now, this was the easier way :)

axlroden commented 4 months ago

> I would really recommend using a pg operator. [...] Adding that as an option in the chart would be a good idea.

Just like stunner shouldn't be part of a netbird chart - why would a pg operator be?

KlavsKlavsen commented 4 months ago

> Just like stunner shouldn't be part of a netbird chart - why would a pg operator be?

I never said it should be. But you COULD add a template like the one I linked to, and add an option to simply "use the cnpg operator for PostgreSQL", for example - making it even easier to centralize management of such instances, which really benefit from using an operator :)

Same for stunner - it could be added as an option. After all, Helm charts are just about making things user-friendly; that's what packages do: provide options in a user-friendly way.

jan-schnurpfeil-gcx commented 4 months ago

> you can avoid the management.json rewriting by generating the key with openssl

Works like a charm, this makes things much easier, thanks @mlsmaycon!

> My NetBird chart has some little issues with grpc

I have a dedicated ingress-nginx controller running for NetBird with higher timeouts, which unfortunately cannot be set directly on the Ingress manifest. @marcportabellaclotet-mt

Snippet from my ingress-nginx values.yaml:

controller:
  config:
    client-header-timeout: 1d
    client-body-timeout: 1d

marcportabellaclotet-mt commented 4 months ago

> I have a dedicated ingress-nginx controller running for NetBird with higher timeouts [...]

Yesterday I fixed my grpc issues by adding a specific service for grpc. I have updated the chart, and now it works without issues.
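
For anyone hitting the same thing with plain Ingress resources instead of a dedicated Service: ingress-nginx also needs to be told the backend speaks gRPC. A hedged sketch (host, service name, and port are placeholders, not from the chart):

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: netbird-grpc # hypothetical
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
spec:
  ingressClassName: nginx
  rules:
    - host: vpn.example.com
      http:
        paths:
          - path: /management.ManagementService
            pathType: Prefix
            backend:
              service:
                name: netbird-management # hypothetical
                port:
                  number: 80
```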

dfry commented 4 months ago

Thanks for sharing the charts @marcportabellaclotet-mt - do you have an example showing the configuration of stunner for your setup as well? I am assuming you are using a helmfile for everything?

marcportabellaclotet-mt commented 4 months ago

```
apiVersion: stunner.l7mp.io/v1
kind: GatewayConfig
metadata:
  name: stunner-gatewayconfig
  namespace: stunner-system
spec:
  realm: stunner.l7mp.io
  authRef:
    name: stunner-auth-secret
    namespace: stunner-system
```

```
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner-system
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-subnets: "subnet-xxxx,subnet-yyyyy,subnet-zzzzz"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/live"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8086"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "HTTP"
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    # (listener list was cut off in the original comment)
```

dfry commented 4 months ago

Thanks again @marcportabellaclotet-mt

I am trying to prototype a solution with Zitadel and NetBird. I have reverse-engineered most of the configuration of Zitadel from the NetBird script (https://github.com/netbirdio/netbird/releases/latest/download/getting-started-with-zitadel.sh) and now I am trying to get a sensible deployment of NetBird in k8s. Happy to add some docs to your chart for Zitadel integration if it makes sense. Here is what I have so far - again, just POC refactoring that is introducing NetBird and Zitadel:

https://github.com/mojaloop/iac-ansible-collection-roles/tree/feaature/cc-k8s-deploy/mojaloop/iac/roles/cc_k8s https://github.com/mojaloop/iac-modules/tree/feature/cc-k8s

Also, since I am manually creating my NLBs - one for internal, one for external traffic - and I am not using EKS at the moment (though I will support EKS as a deployment option), I want to configure the gateway for stunner to listen on a NodePort instead of using the LoadBalancer option. It isn't clear how to set the NodePort service option without it trying to use a LoadBalancer service as the default. Any hints appreciated.

dfry commented 4 months ago

OK, I have incorporated your charts, the integration with Zitadel, and stunner with NodePorts, and it is working great. Thanks for the charts and the pointers. @marcportabellaclotet-mt

Now if we could just get a Terraform provider for the NetBird API. Is anybody using automation tools for creating networks and setup keys idempotently after installation?

sdaberdaku commented 2 months ago

Hello all, I was also able to deploy stunner on EKS behind an NLB (I also deployed the dataplane as a DaemonSet, as explained here) and was wondering how I should handle the relay UDP ports (49152-65535) mentioned here.

sdaberdaku commented 2 months ago

Another question: was any of you able to run the NetBird components (management, signal, dashboard) as non-root?