jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

Migration left over from publicK8s to arm64 #3837

Open smerle33 opened 9 months ago

smerle33 commented 9 months ago

Service(s)

Azure

Summary

Following https://github.com/jenkins-infra/helpdesk/issues/3619, we still have:

Reproduction steps

No response

dduportal commented 7 months ago

Starting with LDAP:

* The image seems old and should be updated to meet our usual "update/release" system

Todo:

Then:

* We should then be able to build it for arm64

dduportal commented 7 months ago

Update:

dduportal commented 7 months ago

Important note: these migrations trigger the SNAT exhaustion problem, which we thought was fixed earlier this week. We have to suspend further attempts until we've fixed it (even temporarily): https://github.com/jenkins-infra/helpdesk/issues/3908

dduportal commented 2 months ago

Update: let's start with ACP migration, using the same technique as https://github.com/jenkins-infra/helpdesk/issues/4044 for data migration

dduportal commented 2 months ago

Update on ACP (artifact-caching-proxy)

Proposed migration plan:


Pod manifests to migrate data using rsync:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-artifact-caching-proxy-0
  namespace: artifact-caching-proxy
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: "32Gi"
  storageClassName: statically-provisionned
  volumeName: repo-azure-jenkins-io-pv-0
---
apiVersion: v1
kind: Pod
metadata:
  name: migrate-volume-0
  namespace: artifact-caching-proxy
spec:
  securityContext:
    runAsUser: 0
    runAsGroup: 0
    fsGroup: 0
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: statefulset.kubernetes.io/pod-name
            operator: In
            values:
            - artifact-caching-proxy-0
        topologyKey: kubernetes.io/hostname
  containers:
  - image: jenkinsciinfra/packaging:latest
    name: migrate-volume-script
    command: ["rsync"]
    args: ["-v", "-a", "--delete", "/src-0/", "/dest-0/"]
    volumeMounts:
    - mountPath: /src-0
      name: old
      readOnly: true
    - mountPath: /dest-0
      name: new
  nodeSelector:
    kubernetes.io/arch: amd64
  restartPolicy: Never
  volumes:
  - name: old
    persistentVolumeClaim:
      claimName: nginx-cache-artifact-caching-proxy-0
  - name: new
    persistentVolumeClaim:
      claimName: data-artifact-caching-proxy-0
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-artifact-caching-proxy-1
  namespace: artifact-caching-proxy
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: "32Gi"
  storageClassName: statically-provisionned
  volumeName: repo-azure-jenkins-io-pv-1
---
apiVersion: v1
kind: Pod
metadata:
  name: migrate-volume-1
  namespace: artifact-caching-proxy
spec:
  securityContext:
    runAsUser: 0
    runAsGroup: 0
    fsGroup: 0
  containers:
  - image: jenkinsciinfra/packaging:latest
    name: migrate-volume-script
    command: ["rsync"]
    args: ["-v", "-a", "--delete", "/src-1/", "/dest-1/"]
    volumeMounts:
    - mountPath: /src-1
      name: old
      readOnly: true
    - mountPath: /dest-1
      name: new
  nodeSelector:
    kubernetes.io/arch: amd64
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: statefulset.kubernetes.io/pod-name
            operator: In
            values:
            - artifact-caching-proxy-1
        topologyKey: kubernetes.io/hostname
  restartPolicy: Never
  volumes:
  - name: old
    persistentVolumeClaim:
      claimName: nginx-cache-artifact-caching-proxy-1
  - name: new
    persistentVolumeClaim:
      claimName: data-artifact-caching-proxy-1
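
The two PVC/Pod pairs above differ only by the replica index. As a minimal sketch (not the method used in this thread), they could be generated from a single template. The helper names `migration_manifests` and `as_list` are hypothetical. The output is JSON wrapped in a `List` object, on the assumption that it is then fed to `kubectl apply -f -`.

```python
import json


def migration_manifests(index: int, size: str = "32Gi") -> list[dict]:
    """Build the PVC and the rsync Pod for one artifact-caching-proxy replica.

    Names and values mirror the manifests posted above; only `index` varies.
    """
    namespace = "artifact-caching-proxy"
    new_claim = f"data-artifact-caching-proxy-{index}"
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": new_claim, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            "storageClassName": "statically-provisionned",
            "volumeName": f"repo-azure-jenkins-io-pv-{index}",
        },
    }
    # Run on the same node as the StatefulSet pod that mounts the old volume.
    affinity_term = {
        "labelSelector": {"matchExpressions": [{
            "key": "statefulset.kubernetes.io/pod-name",
            "operator": "In",
            "values": [f"artifact-caching-proxy-{index}"],
        }]},
        "topologyKey": "kubernetes.io/hostname",
    }
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"migrate-volume-{index}", "namespace": namespace},
        "spec": {
            "securityContext": {"runAsUser": 0, "runAsGroup": 0, "fsGroup": 0},
            "affinity": {"podAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": [affinity_term],
            }},
            "containers": [{
                "image": "jenkinsciinfra/packaging:latest",
                "name": "migrate-volume-script",
                "command": ["rsync"],
                "args": ["-v", "-a", "--delete", f"/src-{index}/", f"/dest-{index}/"],
                "volumeMounts": [
                    {"mountPath": f"/src-{index}", "name": "old", "readOnly": True},
                    {"mountPath": f"/dest-{index}", "name": "new"},
                ],
            }],
            "nodeSelector": {"kubernetes.io/arch": "amd64"},
            "restartPolicy": "Never",
            "volumes": [
                {"name": "old", "persistentVolumeClaim": {
                    "claimName": f"nginx-cache-artifact-caching-proxy-{index}"}},
                {"name": "new", "persistentVolumeClaim": {"claimName": new_claim}},
            ],
        },
    }
    return [pvc, pod]


def as_list(indices) -> str:
    """Serialize the manifests for all given replica indices as one JSON List."""
    items = [m for i in indices for m in migration_manifests(i)]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    print(as_list(range(2)))
```

Saved under a hypothetical name such as `gen_migration.py`, this could be piped as `python3 gen_migration.py | kubectl apply -f -`. Reviewing the generated JSON before applying it is advisable.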

dduportal commented 2 months ago

Update on LDAP

Proposed migration plan:


Pod manifests to migrate data using rsync:

---
apiVersion: v1
kind: Pod
metadata:
  name: migrate-volume
  namespace: ldap
spec:
  securityContext:
    runAsUser: 0
    runAsGroup: 0
    fsGroup: 0
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: statefulset.kubernetes.io/pod-name
            operator: In
            values:
            - ldap-0
        topologyKey: kubernetes.io/hostname
  containers:
  - image: jenkinsciinfra/packaging:latest
    name: migrate-volume-script
    command: ["rsync"]
    args: ["-v", "-a", "--delete", "/src/", "/dest/"]
    volumeMounts:
    - mountPath: /src
      name: old
      readOnly: true
    - mountPath: /dest
      name: new
  nodeSelector:
    kubernetes.io/arch: amd64
  restartPolicy: Never
  volumes:
  - name: old
    persistentVolumeClaim:
      claimName: ldap-data
  - name: new
    persistentVolumeClaim:
      claimName: ldap-jenkins-io-data
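
`rsync -a --delete` with trailing slashes is meant to leave `/dest` as a mirror of `/src`. The thread does not say how the copy was verified. One possible check, sketched here under the assumption that both volumes are mounted in a pod with Python available, compares file lists and SHA-256 digests. The function names are hypothetical, and ownership, permissions and symlinks are not compared.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large database files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: str) -> dict[str, str]:
    """Map each regular file (path relative to root) to its digest; symlinks are skipped."""
    base = Path(root)
    return {
        p.relative_to(base).as_posix(): file_digest(p)
        for p in base.rglob("*")
        if p.is_file() and not p.is_symlink()
    }


def diff_trees(src: str, dest: str) -> dict[str, list[str]]:
    """Report files missing from dest, extra in dest, or with different content."""
    a, b = tree_digests(src), tree_digests(dest)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

After a successful run of the pod above, `diff_trees("/src", "/dest")` would be expected to return three empty lists, provided nothing wrote to the source in the meantime.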

dduportal commented 2 months ago

Update:

dduportal commented 2 months ago

Update:

dduportal commented 2 months ago

Update:

dduportal commented 2 months ago

Update:

We see a visible cost decrease thanks to:

Screenshot 2024-07-08 at 15:21:54

Screenshot 2024-07-08 at 15:25:24

dduportal commented 2 months ago

Update:

smerle33 commented 1 month ago

Update:

* LDAP now starts on `arm64`, but authentication behaves oddly: I'm no longer seen as "Admin" in accountapp, and ci.jenkins.io authentication reports "auth error". The LDAP logs show no difference between x86 and `arm64`, except for the message `mdb_equality_candidates: (member) not indexed`, which appears only on x86 (where it works).

That message seems to indicate that indexing happens on x86 and not on arm64 (https://github.com/cveda/cveda_databank/issues/1).

As a reminder, we have a mock LDAP in our repositories: https://github.com/jenkins-infra/mock-ldap

I tried to launch the ldap 1.1.1 image on my ARM M1 machine and got:

Status: Downloaded newer image for jenkinsciinfra/ldap:1.1.1
qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such file or directory
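
The `qemu-x86_64` error above suggests that an amd64 image was pulled and run under emulation, meaning this tag apparently carries no arm64 variant. A rough way to check which platforms a tag publishes is to parse the JSON printed by `docker manifest inspect <image>`. The sketch below assumes the OCI image index / Docker manifest list layout (a top-level `manifests` array with `platform` entries). The function name is hypothetical, and no real output for `jenkinsciinfra/ldap:1.1.1` is reproduced here.

```python
import json


def image_platforms(index_json: str) -> set[str]:
    """Return the 'os/architecture' pairs listed in an image index.

    A single-platform manifest has no 'manifests' array, so the result is
    empty in that case. Entries with an 'unknown' platform (as used by build
    attestations) are ignored.
    """
    doc = json.loads(index_json)
    platforms = set()
    for entry in doc.get("manifests", []):
        p = entry.get("platform", {})
        os_name, arch = p.get("os"), p.get("architecture")
        if os_name and arch and os_name != "unknown":
            platforms.add(f"{os_name}/{arch}")
    return platforms
```

For example, with the inspect output saved to a file, `"linux/arm64" in image_platforms(open("manifest.json").read())` tells whether an arm64 variant is listed.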

dduportal commented 3 weeks ago

Update:

dduportal commented 2 weeks ago

Update: We are the proud owners of an arm64 image of mirrorbits \o/ Let's move mirrorbits to arm64!

Update: