netbox-community / netbox-chart

A Helm chart for NetBox
https://netbox.readthedocs.io/
Apache License 2.0

Cloning projects from GitLab as datasource fails #322

Open tbartschat opened 3 weeks ago

tbartschat commented 3 weeks ago

Hi all,

First of all, I want to thank you for maintaining this NetBox Helm chart. It makes deploying NetBox in a k8s-based environment very simple and makes our lives much easier. Great stuff!

We are currently working on integrating a NetBox instance into our production environment. We have a lot of different requirements; one of them is to use GitLab as a data source in NetBox.

Here we ran into some issues. Maybe you can help us find a working solution.

Let me explain our setup first.

General settings

Kubernetes Version: v1.30.0
OS Version: Ubuntu 22.04.4 LTS
Netbox Helm Chart Version: 5.0.0-beta.78
Netbox appVersion: v4.0.9

NetBox is running on a self-hosted k8s cluster behind an ingress with a self-signed cert. GitLab is running on a self-hosted virtual machine, also with a self-signed cert.

If I try to clone a GitLab project as a data source in NetBox via the GUI, it fails.

1st attempt via HTTPS

We have a self-hosted GitLab instance with a self-signed certificate. When I try to clone a project, it fails with "SSL: CERTIFICATE_VERIFY_FAILED".

Log Messages of netbox-worker pod:

SyncError("Fetching remote data failed (GitProtocolError): HTTPSConnectionPool(host='xxxxxxxx', port=443): Max retries exceeded with url: /gitlab/thdd/engineering/master_templates.git/info/refs?service=git-upload-pack (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))")

My first idea was to add extraEnvs to the netbox-worker pod so that SSL verification is skipped.

I tried the following environment variables:

  extraEnvs:
    - name: https_verify_ssl
      value: 'false'
    - name: IGNORE_SSL_ERRORS
      value: 'true'
    - name: SSL_NO_VERIFY
      value: 'true'
    - name: GIT_SSL_NO_VERIFY
      value: 'true'

But none of these extraEnvs solves the issue. Cloning a GitLab project is still not possible, and the log messages are the same.

My second idea was to let NetBox know about the self-signed GitLab cert, so I mounted the cert into the netbox-worker pod via extraVolumeMounts and extraVolumes and set extraEnvs to point to the mounted cert path for validation.

  extraEnvs:
    - name: SSL_CERT_FILE
      value: '/opt/netbox/netbox/certs/xxxxx.de.pem'
    - name: REQUESTS_CA_BUNDLE
      value: '/opt/netbox/netbox/certs/xxxxx.de.pem'

  extraVolumeMounts:
    - name: xxxxx.de.pem
      mountPath: /opt/netbox/netbox/certs/

  extraVolumes:
    - name: xxxxx.de.pem
      secret:
        secretName: gitlab-cert

But this attempt was also not successful, so I moved on to use SSH for cloning.
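
For reference, a minimal sketch of how the certificate mount above could be written so that the file name inside the pod matches the path used in the environment variables. Several points are assumptions, not confirmed in this thread: that these values belong under the chart's worker key, that the Secret gitlab-cert stores the PEM under a key named ca.pem, and that NetBox's Git client honors SSL_CERT_FILE or REQUESTS_CA_BUNDLE at all. The volume name gitlab-ca and the file name gitlab-ca.pem are hypothetical. Kubernetes volume names must be DNS labels, so a name containing dots would be rejected.

```yaml
# values.yaml fragment (sketch; placement under "worker" is an assumption)
worker:
  extraEnvs:
    - name: SSL_CERT_FILE
      value: /opt/netbox/netbox/certs/gitlab-ca.pem
    - name: REQUESTS_CA_BUNDLE
      value: /opt/netbox/netbox/certs/gitlab-ca.pem
  extraVolumes:
    - name: gitlab-ca              # hypothetical name, no dots allowed
      secret:
        secretName: gitlab-cert
        items:
          - key: ca.pem            # assumed key inside the Secret
            path: gitlab-ca.pem    # file name created below mountPath
  extraVolumeMounts:
    - name: gitlab-ca
      mountPath: /opt/netbox/netbox/certs
      readOnly: true
```

Without an items mapping, each key of the Secret becomes a file of the same name under the mount path, so the path in the environment variables only resolves if the Secret key is named exactly like the expected file.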

2nd attempt via SSH

Same setup, but a different way to clone the GitLab project, this time via SSH.

But here I run into a different issue, because host key verification fails.

Log Messages of netbox-worker pod:

SyncError('Fetching remote data failed (HangupException): Host key verification failed.\r')

So my first idea was to generate an SSH key pair inside the netbox-worker pod. After changing the security context from "readOnlyRootFilesystem: true" to "readOnlyRootFilesystem: false", I was able to generate an SSH key pair and import the public key into GitLab.

But this wasn't successful. My assumption is that NetBox uses a different key, one generated by default during installation/creation of the netbox-worker pod. I also tried the SSH keys of the k8s worker nodes, because a tcpdump on the GitLab VM showed the SSH clone requests coming from the k8s worker node IPs (most likely because of the ingress settings), but cloning failed with these keys as well.

Finally, I'm stuck again ... argh.

So my two questions are:

  1. Is there any way to skip SSL verification when cloning a GitLab project as a data source in NetBox?
  2. Which public SSH key does the netbox-worker pod use, and how can I access it?

Can you please help here? Do you have any ideas for solving this issue?

Thanks in advance.

RangerRick commented 3 weeks ago

I would be surprised if any particular public ssh key is pre-populated in any way. The error message you mentioned is host-key validation, so my first guess is you would need to make a ~/.ssh/known_hosts file for the netbox user (which I think is just ubuntu in the images) just like you get when manually saying "yes" the first time you connect to an ssh server. You could define a configmap with the contents, and then use worker.extraVolumes and worker.extraVolumeMounts to stick it in the right place.
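
A minimal sketch of the known_hosts approach described above. The ConfigMap name gitlab-known-hosts, the host name gitlab.example.com, and the home directory /home/ubuntu are assumptions; the host key line would be whatever ssh-keyscan returns for the GitLab host.

```yaml
# (create a new file and deploy to your cluster)
apiVersion: v1
kind: ConfigMap
metadata:
  name: gitlab-known-hosts         # hypothetical name
  namespace: netbox
data:
  known_hosts: |
    gitlab.example.com ssh-ed25519 <gitlab-host-key>
---
# (in values.yaml)
worker:
  extraVolumes:
    - name: gitlab-known-hosts
      configMap:
        name: gitlab-known-hosts
  extraVolumeMounts:
    - name: gitlab-known-hosts
      mountPath: /home/ubuntu/.ssh/known_hosts   # assumed home directory
      subPath: known_hosts                       # mount only this one file
      readOnly: true
```

This only covers host-key verification. Authenticating against GitLab still needs a key pair whose public half is registered there.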

Another option would be to make a worker.initContainers entry with a configmap script mounted into it that knows how to set up the environment to reach your resources, so you can do more complicated things that run before the worker starts, like so:

# (in values.yaml):
worker:
  initContainers:
    - name: init-gitlab-env
      command: ['/opt/gitlab/init.sh']
      volumes:
        - name: "gitlab-config"
          configMap:
            name: "gitlab-config"
            defaultMode: 0755
      volumeMounts:
        - name: "gitlab-config"
          mountPath: /opt/gitlab/init.sh
          subPath: init.sh
          readOnly: true
---
# (create a new file and deploy to your cluster, eg configmap-gitlab.yaml)
apiVersion: v1
kind: ConfigMap
metadata:
  name: netbox-gitlab-config
  labels: {{- include "common.labels.standard" ( dict "customLabels" .Values.commonLabels "context" $ ) | nindent 4 }}
  {{- if .Values.netbox.commonAnnotations }}
  annotations: {{- include "common.tplvalues.render" ( dict "value" .Values.netbox.commonAnnotations "context" $ ) | nindent 4 }}
  {{- end }}
data:
  init.sh: |-
    #!/bin/sh

    # ...do whatever you need to here

RangerRick commented 3 weeks ago

Honestly, though, the SSL/PEM thing should work I would think, but I'm not terribly well-versed on how you make (upstream) NetBox oss pull in working PEMs. For some other projects I've worked with, it was similar to what you tried. Only thing I can think of is to make sure the entire CA root bundle is actually also in that .pem file, if you're gonna set REQUESTS_CA_BUNDLE to point to it.

tbartschat commented 3 weeks ago

@RangerRick thanks for your fast support here.

I've made some progress.

GitLab clone via SSH topic

Yes, you are right, the log message was related to host-key validation, my bad.

As you mentioned, I created a configmap to add the host key to the netbox-worker pod, and I also generated a private and public SSH key pair for authentication against our GitLab.

After adding the GitLab host key to the netbox-worker pod, I got an authentication error "Permission denied" from GitLab.

  SyncError('Fetching remote data failed (HangupException): Permission denied, please try again.\r\nPermission denied, please try again.\r\ngit@xxxx.de: Permission denied (publickey,password).\r')

So I created an SSH key pair, put both keys into the same configmap, and mounted it into the netbox-worker pod.

My configmap looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: gitlab-keys
  namespace: netbox
data:
  known_hosts: '<GitLab-Hostkey>' 
  id_ed25519: '<private-ssh-key>'
  id_ed25519.pub: '<public-ssh-key>'

values.yaml (snippet)

  extraVolumes:
    - name: gitlab-keys
      configMap:
        name: gitlab-keys

  extraVolumeMounts:
    - name: gitlab-keys
      mountPath: /home/ubuntu/.ssh/

So far so good. But I still got an error message, this time from 'libcrypto', related to the loaded key ...

SyncError('Fetching remote data failed (HangupException): Load key "/home/ubuntu/.ssh/id_ed25519": error in libcrypto\r\nPermission denied, please try again.\r\nPermission denied, please try again.\r\ngit@xxxx.de: Permission denied (publickey,password).\r')

I'm not very familiar with libcrypto. Do you have any suggestions here?

RangerRick commented 3 weeks ago

If you're putting a private key into there, my first guess is that you will probably need to use defaultMode on the configMap to make permissions more restrictive. Honestly I'm not sure if (even without readOnly on the volume mount) you can chmod things after the fact, since even that .ssh directory needs to be something like 0700. If not, you might be able to attach a tarball instead, and then unpack it in an initContainer, so it gets good permissions...
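
One possible reading of the libcrypto error, not confirmed in this thread: ssh typically reports it when the key file itself is malformed rather than when permissions are wrong. A multi-line key written as a single-quoted YAML scalar has its line breaks folded into spaces, which would corrupt the key. The sketch below keeps the key in a Secret as a literal block scalar, which preserves line breaks and the final newline, and mounts it with a restrictive defaultMode as suggested above. The Secret name gitlab-ssh, the host name gitlab.example.com, the placement under worker, and the mode are assumptions. Depending on the pod's fsGroup and user, 0400 may be too strict for the NetBox user to read, in which case 0440 is an alternative to try.

```yaml
# (create a new file and deploy to your cluster)
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-ssh                 # hypothetical name
  namespace: netbox
type: Opaque
stringData:
  # The literal block scalar ("|") keeps every line break of the key.
  id_ed25519: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    <private-ssh-key>
    -----END OPENSSH PRIVATE KEY-----
  known_hosts: |
    gitlab.example.com ssh-ed25519 <gitlab-host-key>
---
# (in values.yaml; placement under "worker" is an assumption)
worker:
  extraVolumes:
    - name: gitlab-ssh
      secret:
        secretName: gitlab-ssh
        defaultMode: 0400          # assumption; see note on fsGroup above
  extraVolumeMounts:
    - name: gitlab-ssh
      mountPath: /home/ubuntu/.ssh
      readOnly: true
```

A Secret is also the more conventional place for a private key than a ConfigMap, since access to the two can be restricted separately.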

tbartschat commented 3 weeks ago

OK, understood, the 'initContainer' approach is the better solution. I don't have much experience with 'initContainers', but I followed your recommendations.

So I created a configmap.yaml and applied it to the k8s cluster.

apiVersion: v1
kind: ConfigMap
metadata:
  name: gitlab-config
  namespace: netbox
data:
  init.sh: |-
    #!/bin/sh
    mkdir /home/ubuntu/.ssh/
    touch /home/ubuntu/.ssh/known_hosts
    touch /home/ubuntu/.ssh/id_ed25519
    touch /home/ubuntu/.ssh/id_ed25519.pub
    echo "<gitlab-host-key>" >> /home/ubuntu/.ssh/known_hosts
    echo "<private-ssh-key>" >> /home/ubuntu/.ssh/id_ed25519
    echo "<public-ssh-key>" >> /home/ubuntu/.ssh/id_ed25519.pub

tbartschat@thdd-k8s-cpl-1:/thdd_platform/apps/netbox/netbox-chart$ kubectl get configmap -n netbox | grep gitlab-config
gitlab-config                1      30m
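
A hedged variant of the script above, using the same placeholders. It differs in three ways: mkdir -p does not fail if the directory already exists, umask 077 makes newly created files and directories accessible only to the user running the script, and set -eu aborts on the first error. This is a sketch, not a tested setup. Files written by an init container are only visible to the worker container if the target directory is a volume mounted into both, and a private key stored in a ConfigMap is readable by anyone who can read ConfigMaps in the namespace.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: gitlab-config
  namespace: netbox
data:
  init.sh: |-
    #!/bin/sh
    # Stop on the first failing command or unset variable.
    set -eu
    # New files and directories are created without group/other access.
    umask 077
    mkdir -p /home/ubuntu/.ssh
    printf '%s\n' "<gitlab-host-key>" > /home/ubuntu/.ssh/known_hosts
    printf '%s\n' "<private-ssh-key>" > /home/ubuntu/.ssh/id_ed25519
    printf '%s\n' "<public-ssh-key>" > /home/ubuntu/.ssh/id_ed25519.pub
```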

Next, I tried to adjust the 'values.yaml', but here I ran into an issue.

First, which image must I use for the 'initContainer'? The config below fails without an image, and something is wrong with the volumes section.

  initContainers:
    - name: init-gitlab-env
      command: ['/opt/gitlab/init.sh']
      volumes:
        - name: "gitlab-config"
          configMap:
            name: "gitlab-config"
            defaultMode: 0755
      volumeMounts:
        - name: "gitlab-config"
          mountPath: /opt/gitlab/init.sh
          subPath: init.sh
          readOnly: true

W0830 20:56:28.394446  888157 warnings.go:70] unknown field "spec.template.spec.initContainers[0].volumes"
Error: UPGRADE FAILED: cannot patch "netbox-worker" with kind Deployment: Deployment.apps "netbox-worker" is invalid: [spec.template.spec.initContainers[0].image: Required value, spec.template.spec.initContainers[0].volumeMounts[0].name: Not found: "gitlab-config"]
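
The two messages point at the cause: volumes is not a field of a container, so the volume has to be declared at pod level, and every init container needs an image. A minimal sketch under assumptions not confirmed in this thread: that pod-level volumes go under worker.extraVolumes (the key mentioned earlier), that busybox:1.36 is an acceptable image (a hypothetical choice; any image providing /bin/sh should do), and that an emptyDir named ssh-home (hypothetical) is shared so that files written by the init container are visible to the worker.

```yaml
# values.yaml fragment (sketch)
worker:
  extraVolumes:
    - name: gitlab-config          # ConfigMap holding init.sh
      configMap:
        name: gitlab-config
        defaultMode: 0755
    - name: ssh-home               # hypothetical shared scratch volume
      emptyDir: {}
  extraVolumeMounts:
    - name: ssh-home               # worker sees what the init container wrote
      mountPath: /home/ubuntu/.ssh
  initContainers:
    - name: init-gitlab-env
      image: busybox:1.36          # assumption; required field
      command: ['/opt/gitlab/init.sh']
      volumeMounts:
        - name: gitlab-config
          mountPath: /opt/gitlab/init.sh
          subPath: init.sh
          readOnly: true
        - name: ssh-home
          mountPath: /home/ubuntu/.ssh
```

File ownership depends on the user each container runs as, which this sketch does not address. With the directory already provided by the mount, a plain mkdir in init.sh reports an error, so mkdir -p is the safer call there.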