dragonflyoss / Dragonfly2

Dragonfly is an open source P2P-based file distribution and image acceleration system. It is hosted by the Cloud Native Computing Foundation (CNCF) as an Incubating Level Project.
https://d7y.io
Apache License 2.0

install failed #3348

Open xiaoyigood opened 5 days ago

xiaoyigood commented 5 days ago

Bug report:

[root@master01 dragonfly]# kubectl describe pod dragonfly-dfdaemon-8z26j -n dragonfly-system
Name:         dragonfly-dfdaemon-8z26j
Namespace:    dragonfly-system
Priority:     0
Node:         master01/10.200.88.41
Start Time:   Thu, 27 Jun 2024 20:05:25 +0800
Labels:       app=dragonfly
              component=dfdaemon
              controller-revision-hash=5cf9c8f8bc
              pod-template-generation=1
              release=dragonfly
Annotations:  checksum/config: 0785ea79979d2c5b3b25b6d0a83b8ff4abd13b78f6d96019c5a670686be5628b
              cni.projectcalico.org/containerID: 4fd3501d10ab952bccc1e9225d0af5043b30cf9a1f0cc2c0825c28d6a3404fdb
              cni.projectcalico.org/podIP: 192.168.241.114/32
              cni.projectcalico.org/podIPs: 192.168.241.114/32
Status:       Running
IP:           192.168.241.114
IPs:
  IP:           192.168.241.114
Controlled By:  DaemonSet/dragonfly-dfdaemon
Init Containers:
  wait-for-scheduler:
    Container ID:  docker://b4f87211ae9e94f0293d97c09cc2386876fb8dc0b3b5aeb341e12fce4a683dfd
    Image:         docker.io/busybox:latest
    Image ID:      docker-pullable://busybox@sha256:5eef5ed34e1e1ff0a4ae850395cbf665c4de6b4b83a32a0bc7bcb998e24e7bbb
    Port:
    Host Port:
    Command:
      sh
      -c
      until nslookup dragonfly-scheduler.dragonfly-system.svc.cluster.local && nc -vz dragonfly-scheduler.dragonfly-system.svc.cluster.local 8002; do echo waiting for scheduler; sleep 2; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 27 Jun 2024 20:17:22 +0800
      Finished:     Thu, 27 Jun 2024 20:17:24 +0800
    Ready:          True
    Restart Count:  1
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rwv78 (ro)
  mount-netns:
    Container ID:  docker://4ec21ce12c85d20412d2810e469c19e4be93acc7a76078e9181b096d1809d7e4
    Image:         docker.io/dragonflyoss/dfdaemon:v2.1.44
    Image ID:      docker-pullable://dragonflyoss/dfdaemon@sha256:b8bded624cf4664ea39cee3188c2e05b9d1acc5cfdcf2bac93a514caf26c84d6
    Port:
    Host Port:
    Command:
      /bin/sh
      -cx
      if [ ! -e "/run/dragonfly/net" ]; then
        touch /run/dragonfly/net
      fi
      i1=$(stat -L -c %i /host/ns/net)
      i2=$(stat -L -c %i /run/dragonfly/net)
      if [ "$i1" != "$i2" ]; then
        /bin/mount -o bind /host/ns/net /run/dragonfly/net
      fi
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 27 Jun 2024 20:17:25 +0800
      Finished:     Thu, 27 Jun 2024 20:17:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:        0
      memory:     0
    Environment:
    Mounts:
      /host/ns from hostns (rw)
      /run/dragonfly from run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rwv78 (ro)
  update-docker-config:
    Container ID:  docker://87f6e1587c9bd37f7beb85f4cfcb1890d36a7555d52c989df2b8685455139e01
    Image:         docker.io/dragonflyoss/openssl:latest
    Image ID:      docker-pullable://dragonflyoss/openssl@sha256:af3e89fefa995d51bcad04cf97532062d011bd2457403d6448b665acde352913
    Port:
    Host Port:
    Command:
      /bin/sh
      -cx
      mkdir -p /tmp/dragonfly-ca
      cd /tmp/dragonfly-ca

  openssl genrsa -out cakey.pem 2048

  cat << EOF > root.conf
  [ req ]
  default_bits        = 2048
  default_keyfile     = key.pem
  default_md          = sha256
  distinguished_name  = req_distinguished_name
  req_extensions      = req_ext
  string_mask         = nombstr
  x509_extensions     = x509_ext
  [ req_distinguished_name ]
  countryName                 = Country Name (2 letter code)
  countryName_default         = CN
  stateOrProvinceName         = State or Province Name (full name)
  stateOrProvinceName_default = Hangzhou
  localityName                = Locality Name (eg, city)
  localityName_default        = Hangzhou
  organizationName            = Organization Name (eg, company)
  organizationName_default    = Dragonfly
  commonName                  = Common Name (e.g. server FQDN or YOUR name)
  commonName_max              = 64
  commonName_default          = Dragonfly Authority CA
  [ x509_ext ]
  authorityKeyIdentifier = keyid,issuer
  basicConstraints       = CA:TRUE
  keyUsage               = digitalSignature, keyEncipherment, keyCertSign, cRLSign
  subjectKeyIdentifier   = hash
  [ req_ext ]
  basicConstraints     = CA:TRUE
  keyUsage             = digitalSignature, keyEncipherment, keyCertSign, cRLSign
  subjectKeyIdentifier = hash
  EOF

  openssl req -batch -new -x509 -key ./cakey.pem -out ./cacert.pem -days 65536 -config ./root.conf
  openssl x509 -inform PEM -in ./cacert.pem -outform DER -out ./CA.cer

  openssl x509 -in ./cacert.pem -noout -text
  # update ca for golang program(docker in host), refer: https://github.com/golang/go/blob/go1.17/src/crypto/x509/root_linux.go#L8
  ca_list="/etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundle.crt /etc/ssl/ca-bundle.pem /etc/pki/tls/cacert.pem /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem /etc/ssl/cert.pem"
  for ca in $ca_list; do
    ca="/host$ca"
    if [[ -e "$ca" ]]; then
      echo "CA $ca" found
      if grep "Dragonfly Authority CA" "$ca"; then
        echo "Dragonfly Authority ca found"
        if [[ -e /host/etc/dragonfly-ca/cakey.pem && -e /host/etc/dragonfly-ca/cacert.pem ]]; then
          echo "CA cert and key ready"
          break
        else
          echo "Warning: CA cert and key not ready"
        fi
      fi
      echo "Try to add Dragonfly CA"
      echo "# Dragonfly Authority CA" > cacert.toadd.pem
      cat cacert.pem >> cacert.toadd.pem
      cat cacert.toadd.pem >> "$ca"
      echo "Dragonfly CA added"
      cp -f ./cakey.pem ./cacert.pem /host/etc/dragonfly-ca/
      break
    fi
  done
  domains="10.200.88.53"
  if [[ -n "$domains" ]]; then
    for domain in $domains; do
      # inject docker cert by registry domain
      dir=/host/etc/docker/certs.d/$domain
      mkdir -p "$dir"
      echo copy CA cert to $dir
      cp -f /host/etc/dragonfly-ca/cacert.pem "$dir/ca.crt"
    done
  fi
State:          Terminated
  Reason:       Completed
  Exit Code:    0
  Started:      Thu, 27 Jun 2024 20:17:26 +0800
  Finished:     Thu, 27 Jun 2024 20:17:26 +0800
Ready:          True
Restart Count:  0
Limits:
  cpu:     2
  memory:  2Gi
Requests:
  cpu:        0
  memory:     0
Environment:  <none>
Mounts:
  /host/etc from etc (rw)
  /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rwv78 (ro)

Containers:
  dfdaemon:
    Container ID:  docker://ca8cfb7df43e3952597734c82b1bb8c5e7256a5604bad4f5ad896bb43f012409
    Image:         docker.io/dragonflyoss/dfdaemon:v2.1.44
    Image ID:      docker-pullable://dragonflyoss/dfdaemon@sha256:b8bded624cf4664ea39cee3188c2e05b9d1acc5cfdcf2bac93a514caf26c84d6
    Ports:         65001/TCP, 40901/TCP
    Host Ports:    0/TCP, 40901/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 27 Jun 2024 20:21:30 +0800
      Finished:     Thu, 27 Jun 2024 20:21:30 +0800
    Ready:          False
    Restart Count:  8
    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:        0
      memory:     0
    Liveness:     exec [/bin/grpc_health_probe -addr=:65000] delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [/bin/grpc_health_probe -addr=:65000] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
    Mounts:
      /etc/dragonfly from config (rw)
      /etc/dragonfly-ca from d7y-ca (rw)
      /host/etc from etc (rw)
      /run/dragonfly from run (rw)
      /var/lib/dragonfly from data (rw)
      /var/log/dragonfly/daemon from logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rwv78 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      dragonfly-dfdaemon
    Optional:  false
  hostns:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/1/ns
    HostPathType:
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run/dragonfly
    HostPathType:  DirectoryOrCreate
  etc:
    Type:          HostPath (bare host directory volume)
    Path:          /etc
    HostPathType:
  d7y-ca:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/dragonfly-ca
    HostPathType:  DirectoryOrCreate
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  kube-api-access-rwv78:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                   From               Message
  Normal   Scheduled  17m                   default-scheduler  Successfully assigned dragonfly-system/dragonfly-dfdaemon-8z26j to master01
  Normal   Created    17m                   kubelet            Created container wait-for-scheduler
  Normal   Started    17m                   kubelet            Started container wait-for-scheduler
  Normal   Pulled     17m                   kubelet            Container image "docker.io/busybox:latest" already present on machine
  Normal   Pulled     17m                   kubelet            Container image "docker.io/dragonflyoss/dfdaemon:v2.1.44" already present on machine
  Normal   Created    17m                   kubelet            Created container mount-netns
  Normal   Created    17m                   kubelet            Created container update-docker-config
  Normal   Started    17m                   kubelet            Started container mount-netns
  Normal   Pulled     17m                   kubelet            Container image "docker.io/dragonflyoss/openssl:latest" already present on machine
  Normal   Started    17m                   kubelet            Started container update-docker-config
  Normal   Pulled     17m (x3 over 17m)     kubelet            Container image "docker.io/dragonflyoss/dfdaemon:v2.1.44" already present on machine
  Normal   Created    17m (x3 over 17m)     kubelet            Created container dfdaemon
  Normal   Started    17m (x3 over 17m)     kubelet            Started container dfdaemon
  Warning  BackOff    2m45s (x82 over 17m)  kubelet            Back-off restarting failed container

Expected behavior:

How to reproduce it:

[root@master01 dragonfly]# kubectl get pod -n dragonfly-system -o wide
NAME                                READY   STATUS             RESTARTS         AGE   IP                NODE       NOMINATED NODE   READINESS GATES
dragonfly-dfdaemon-8z26j            0/1     CrashLoopBackOff   9 (3m21s ago)    24m   192.168.241.114   master01
dragonfly-dfdaemon-xbt8v            0/1     CrashLoopBackOff   18 (53s ago)     74m   192.168.59.234    master02
dragonfly-dfdaemon-zhgxn            0/1     CrashLoopBackOff   18 (48s ago)     74m   192.168.235.16    master03
dragonfly-jaeger-84dbfd5b56-zsqvx   1/1     Running            0                74m   192.168.241.125   master01
dragonfly-manager-84779bd49-5vp4x   1/1     Running            0                74m   192.168.235.23    master03
dragonfly-manager-84779bd49-8bd4j   1/1     Running            2 (69m ago)      74m   192.168.241.111   master01
dragonfly-manager-84779bd49-s72pl   1/1     Running            2 (69m ago)      74m   192.168.59.218    master02
dragonfly-mysql-0                   1/1     Running            1 (71m ago)      74m   192.168.241.92    master01
dragonfly-redis-master-0            1/1     Running            0                74m   192.168.241.81    master01
dragonfly-redis-replicas-0          1/1     Running            0                74m   192.168.241.113   master01
dragonfly-redis-replicas-1          1/1     Running            0                73m   192.168.59.198    master02
dragonfly-redis-replicas-2          1/1     Running            0                73m   192.168.235.35    master03
dragonfly-scheduler-0               1/1     Running            0                74m   192.168.241.118   master01
dragonfly-scheduler-1               1/1     Running            0                68m   192.168.235.25    master03
dragonfly-scheduler-2               1/1     Running            0                67m   192.168.59.207    master02
dragonfly-seed-peer-0               1/1     Running            1 (68m ago)      74m   192.168.241.127   master01
dragonfly-seed-peer-1               1/1     Running            0                68m   192.168.235.7     master03
dragonfly-seed-peer-2               1/1     Running            0                67m   192.168.59.246    master02

I cannot get any further error logs.
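Since the dfdaemon container exits almost immediately (Started and Finished are both 20:21:30), one thing that may help is asking kubectl for the log of the previous, terminated container instance with `--previous`. The sketch below is only an illustration: the helper names `crashloop_pods` and `previous_log_cmd` are hypothetical, and it prints the `kubectl logs` commands rather than running them. The pod, namespace, and container names are taken from the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pod` table output on stdin and
# print the NAME of every pod whose STATUS column is CrashLoopBackOff.
crashloop_pods() {
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Hypothetical helper: print (do not run) the command that fetches the log
# of the previous container instance.
# Arguments: pod name, namespace, container name.
previous_log_cmd() {
  printf 'kubectl logs %s -n %s -c %s --previous\n' "$1" "$2" "$3"
}
```

Possible usage, assuming kubectl access to the cluster: `kubectl get pod -n dragonfly-system | crashloop_pods` lists the failing dfdaemon pods, and the line printed by `previous_log_cmd dragonfly-dfdaemon-8z26j dragonfly-system dfdaemon` can then be run by hand. If that output is empty, the files under /var/log/dragonfly/daemon (the `logs` volume mounted into the container) may be another place to look.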

Environment:

gaius-qi commented 5 days ago

@xiaoyigood Please refer to https://d7y.io/docs/next/getting-started/installation/helm-charts/.