juicedata / juicefs-csi-driver

JuiceFS CSI Driver
https://github.com/juicedata/juicefs
Apache License 2.0

[BUG] Use PVC as cache path: the second mount pod is stuck in Init:0/1 state #906

Open chenmiao1991 opened 6 months ago

chenmiao1991 commented 6 months ago

What happened:

When I try use-pvc-as-cache-path, the second mount pod cannot reach the Running state.

What you expected to happen:

Each JuiceFS app mount pod runs with its own RBD cache block device.

How to reproduce it (as minimally and precisely as possible):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pv-rbd
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
  storageClassName: ceph-rbd-pool   # PVC created with the RBD StorageClass
---
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
type: Opaque
stringData:
  name: xx
  metaurl: redis://xx
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: ten-pb-fs
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
    volumeAttributes:
      juicefs/mount-cache-pvc: "juicefs-pv-rbd"   # the cache PVC is referenced here
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: ""
  resources:
    requests:
      storage: 10Pi
  selector:
    matchLabels:
      juicefs-name: ten-pb-fs
---
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-app
  namespace: default
spec:
  containers:
  - args:
    - -c
    - while true; do sleep 5; done
    command:
    - /bin/sh
    image: centos
    name: app
    volumeMounts:
    - mountPath: /data
      name: data
    resources:
      requests:
        cpu: 10m
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: juicefs-pvc   # shares the JuiceFS PVC
---
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-app2
  namespace: default
spec:
  containers:
  - args:
    - -c
    - while true; do sleep 5; done
    command:
    - /bin/sh
    image: centos
    name: app
    volumeMounts:
    - mountPath: /data
      name: data
    resources:
      requests:
        cpu: 10m
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: juicefs-pvc   # shares the JuiceFS PVC
Events (from the second mount pod):
  Type     Reason              Age    From                     Message
  ----     ------              ----   ----                     -------
  Warning  FailedAttachVolume  2m27s  attachdetach-controller  Multi-Attach error for volume "pvc-d744c8d6-145b-4006-9ba8-bf42fd4ad632" Volume is already used by pod(s) node-12-juicefs-pv-crxnpz
  Warning  FailedMount         24s    kubelet                  Unable to attach or mount volumes: unmounted volumes=[cachedir-pvc-0], unattached volumes=[jfs-root-dir kube-api-access-zh5j2 cachedir-pvc-0 jfs-dir updatedb]: timed out waiting for the condition
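
For context, the FailedMount event lists a mount pod volume named cachedir-pvc-0; each mount pod (one per node) appears to carry a volume like the sketch below, pointing at the same single-attach claim, which is what triggers the Multi-Attach error:

# sketch of the cache volume injected into each mount pod
# (volume name taken from the FailedMount event, claim name from the
#  PV's juicefs/mount-cache-pvc attribute above)
volumes:
- name: cachedir-pvc-0
  persistentVolumeClaim:
    claimName: juicefs-pv-rbd   # ReadWriteOnce: attachable to only one node at a time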

Anything else we need to know?

How can each JuiceFS app mount pod be given its own RBD cache block device?
Any suggestions?

Environment:

showjason commented 6 months ago

Do juicefs-app1 and juicefs-app2 run on the same node or on different nodes? If on different nodes, the issue is very likely caused by the cache PVC's ReadWriteOnce (RWO) access mode.

chenmiao1991 commented 6 months ago

> Do juicefs-app1 and juicefs-app2 run on the same node or on different nodes? If on different nodes, the issue is very likely caused by the cache PVC's ReadWriteOnce (RWO) access mode.

@showjason They run on different nodes. How can this be solved when using a block-device PVC? The examples I have seen all use cloud vendors' block devices.

showjason commented 6 months ago

@chenmiao1991 As far as I know, Ceph RBD doesn't support RWX, though it does support ROX. Maybe dedicated-cache-cluster is one way to address your issue, or you could use NFS instead of block storage (see the sketch below). @zxh326 sorry, do you have any better ideas?
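
A minimal sketch of the NFS variant: compared with the juicefs-pv-rbd PVC in the repro, only the StorageClass and the access mode change (nfs-client is a hypothetical provisioner name):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-cache-nfs
  namespace: kube-system          # same namespace as the mount pods
spec:
  accessModes:
  - ReadWriteMany                 # NFS supports RWX, so mount pods on different nodes can share one cache PVC
  resources:
    requests:
      storage: 512Mi
  storageClassName: nfs-client    # hypothetical NFS StorageClass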

chenmiao1991 commented 6 months ago

@showjason We adopted the JuiceFS distributed file system precisely to replace file systems like NFS, so using NFS for the cache would take us back to where we started.

@showjason @zxh326 Maybe the juicefs-csi-node DaemonSet could automatically mount its own RBD volume as the cache on each node, and the applications there could share that RBD cache (see the sketch below). hostPath mode should be avoided, since batch creation and deletion of RBD images is inconvenient with it.
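
A Kubernetes-native pattern close to this idea is a generic ephemeral volume: the cluster creates a fresh RBD PVC for each pod and deletes it with the pod, so ReadWriteOnce is sufficient. A minimal sketch of what the cache volume could look like if the CSI driver templated mount pods this way; storage class and size are reused from the repro, and nothing below is an existing driver feature:

volumes:
- name: cachedir
  ephemeral:                      # generic ephemeral volume: one PVC per pod, deleted with the pod
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce           # RWO is fine here: each PVC is bound to exactly one mount pod
        storageClassName: ceph-rbd-pool
        resources:
          requests:
            storage: 512Mi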