Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Alluxio as a k8s storage class, error when read/write in the pod. #18660

Closed wkqchen20 closed 4 months ago

wkqchen20 commented 4 months ago

Alluxio Version: V2.9.3

Describe the bug

I am using Alluxio as a k8s storage class with NFS as the UFS. After creating a pod, it can list and touch files, but "bash: echo: write error: Input/output error" happens when I write data.

To Reproduce

This is the pod that mounts Alluxio. It can list, touch, or rm files.

root@zsy-test-001:/data# ls -l 
total 1
drwxr-xr-x 1 root root 24 Jul 15 10:00 default_tests_files

root@zsy-test-001:/data# touch 'test.txt'
root@zsy-test-001:/data# ls -l 
total 1
drwxr-xr-x 1 root root 24 Jul 15 10:00 default_tests_files
-rw-r--r-- 1 root root  0 Jul 16 02:31 test.txt
root@zsy-test-001:/data# echo 'test' >> test.txt
bash: echo: write error: Input/output error

This is the alluxio-worker. It can write, read, or remove files.

[alluxio@alluxio-worker-p4c6q alluxio-2.9.3]$ alluxio fs ls -h /
              0       PERSISTED 07-15-2024 09:14:39:015  DIR /.alluxio_ufs_persistence
             24       PERSISTED 07-15-2024 10:00:14:180  DIR /default_tests_files
             0B TO_BE_PERSISTED 07-16-2024 02:31:28:856 100% /test.txt
[alluxio@alluxio-worker-p4c6q alluxio-2.9.3]$ alluxio fs ls -h /
              0       PERSISTED 07-15-2024 09:14:39:015  DIR /.alluxio_ufs_persistence
             24       PERSISTED 07-15-2024 10:00:14:180  DIR /default_tests_files
             0B TO_BE_PERSISTED 07-16-2024 02:31:51:899 100% /test.txt

Auto-generated PV

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: alluxio
  creationTimestamp: "2024-07-16T02:21:25Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: alluxio-72528345-4cc3-4ccc-bf5c-e3cd914c2eed
  resourceVersion: "6589090"
  uid: 0daeffcc-6c2e-4c85-b229-475f6648ff9b
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: app-pvc
    namespace: default
    resourceVersion: "6589036"
    uid: 72528345-4cc3-4ccc-bf5c-e3cd914c2eed
  csi:
    driver: alluxio
    volumeAttributes:
      alluxioPath: /
      javaOptions: -Dalluxio.user.metadata.cache.enabled=true
      mountInPod: "false"
      storage.kubernetes.io/csiProvisionerIdentity: 1721096277296-8081-alluxio
    volumeHandle: alluxio-72528345-4cc3-4ccc-bf5c-e3cd914c2eed
  mountOptions:
  - direct_io
  - allow_other
  - entry_timeout=36000
  - attr_timeout=36000
  - max_readahead=0
  persistentVolumeReclaimPolicy: Delete
  storageClassName: alluxio
  volumeMode: Filesystem
status:
  phase: Bound

Created PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: alluxio
    volume.kubernetes.io/storage-provisioner: alluxio
  creationTimestamp: "2024-07-16T02:21:05Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: app-pvc
    name: app-pvc
  name: app-pvc
  namespace: default
  resourceVersion: "6589092"
  uid: 72528345-4cc3-4ccc-bf5c-e3cd914c2eed
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: alluxio
  volumeMode: Filesystem
  volumeName: alluxio-72528345-4cc3-4ccc-bf5c-e3cd914c2eed
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound

Pod

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  hostNetwork: true
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: app-pvc
  containers:
    - name: task-pv-container
      image: docker.io/library/ubuntu:20.04
      command: ["sleep","100000"]
      volumeMounts:
        - mountPath: "/data"
          name: task-pv-storage

Expected behavior

Should be able to read and write data in the pod.

Are you planning to fix it

Not working on it, but I am willing to help with a PR if someone points me in the right direction.

jasondrogba commented 4 months ago

Can you share the logs and alluxio-site.properties?

wkqchen20 commented 4 months ago

can you share the logs and alluxio-site.properties

Sure. I am not using alluxio-site.properties; all the configuration is in the alluxio-configmap.

alluxio-configmap


apiVersion: v1
kind: ConfigMap
metadata:
  name: alluxio-config
  labels:
    name: alluxio-config
    app: alluxio
    chart: alluxio-0.6.54
    release: release-name
    heritage: Helm
data:
  ALLUXIO_JAVA_OPTS: |-
    -Dalluxio.master.hostname=alluxio-master-0 -Dalluxio.master.journal.type=EMBEDDED -Dalluxio.master.journal.folder=/journal -Dalluxio.master.mount.table.root.ufs=/data -Dalluxio.security.stale.channel.purge.interval=365d -XX:+UseContainerSupport -Dalluxio.security.authorization.permission.enabled=false
  ALLUXIO_MASTER_JAVA_OPTS: |-
    -Dalluxio.master.hostname=${ALLUXIO_MASTER_HOSTNAME} -Dalluxio.master.metastore=ROCKS -Dalluxio.master.metastore.dir=/metastore 
  ALLUXIO_JOB_MASTER_JAVA_OPTS: |-
    -Dalluxio.master.hostname=${ALLUXIO_MASTER_HOSTNAME} 
  ALLUXIO_WORKER_JAVA_OPTS: |-
    -Dalluxio.worker.hostname=${ALLUXIO_WORKER_HOSTNAME} -Dalluxio.worker.rpc.port=29999 -Dalluxio.worker.web.port=30000 -Dalluxio.worker.data.server.domain.socket.address=/opt/domain -Dalluxio.worker.data.server.domain.socket.as.uuid=true -Dalluxio.worker.container.hostname=${ALLUXIO_WORKER_CONTAINER_HOSTNAME} -Dalluxio.worker.ramdisk.size=2Gi -Dalluxio.worker.tieredstore.levels=1 -Dalluxio.worker.tieredstore.level0.dirs.mediumtype=MEM -Dalluxio.worker.tieredstore.level0.dirs.path=/dev/shm -Dalluxio.worker.tieredstore.level0.dirs.quota=1Gi 
  ALLUXIO_PROXY_JAVA_OPTS: |-

  ALLUXIO_JOB_WORKER_JAVA_OPTS: |-
    -Dalluxio.worker.hostname=${ALLUXIO_WORKER_HOSTNAME} -Dalluxio.job.worker.rpc.port=30001 -Dalluxio.job.worker.data.port=30002 -Dalluxio.job.worker.web.port=30003 
  ALLUXIO_FUSE_JAVA_OPTS: |-
    -Dalluxio.user.hostname=${ALLUXIO_CLIENT_HOSTNAME} -Dalluxio.fuse.mount.alluxio.path=/ -Dalluxio.fuse.mount.point=<nil> -Dalluxio.fuse.mount.options=allow_other -XX:MaxDirectMemorySize=2g 
  ALLUXIO_WORKER_TIEREDSTORE_LEVEL0_DIRS_PATH: /dev/shm

The /data path is mounted like this:

...
volumeMounts:
  - name: "nfs-data"
    mountPath: "/data"
...
volumes:
  - name: "nfs-data"
    hostPath:
      path: /mnt/nfs/
      type: DirectoryOrCreate

...

After creating a pod, the nodeplugin csi-nodeserver logs:

I0719 03:21:57.762634       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.762661       7 utils.go:98] GRPC request: {}
I0719 03:21:57.762747       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.764513       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.764531       7 utils.go:98] GRPC request: {}
I0719 03:21:57.764593       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.765934       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.765955       7 utils.go:98] GRPC request: {}
I0719 03:21:57.766043       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.767614       7 utils.go:97] GRPC call: /csi.v1.Node/NodeStageVolume
I0719 03:21:57.767632       7 utils.go:98] GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/alluxio/5d86871b55910dfc48022b2146ca59cedc3c9f3829873993469ec324562970ef/globalmount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["direct_io","allow_other","entry_timeout=36000","attr_timeout=36000","max_readahead=0"]}},"access_mode":{"mode":5}},"volume_context":{"alluxioPath":"/","javaOptions":"-Dalluxio.user.metadata.cache.enabled=true","mountInPod":"false","storage.kubernetes.io/csiProvisionerIdentity":"1721358293111-8081-alluxio"},"volume_id":"alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f"}
I0719 03:21:57.767927       7 utils.go:103] GRPC response: {}
I0719 03:21:57.769048       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.769085       7 utils.go:98] GRPC request: {}
I0719 03:21:57.769178       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.770281       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.770299       7 utils.go:98] GRPC request: {}
I0719 03:21:57.770355       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.771271       7 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0719 03:21:57.771295       7 utils.go:98] GRPC request: {}
I0719 03:21:57.771356       7 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0719 03:21:57.772678       7 utils.go:97] GRPC call: /csi.v1.Node/NodePublishVolume
I0719 03:21:57.772705       7 utils.go:98] GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/alluxio/5d86871b55910dfc48022b2146ca59cedc3c9f3829873993469ec324562970ef/globalmount","target_path":"/var/lib/kubelet/pods/b441d709-86cc-4a67-b1b0-4e1568cff0a2/volumes/kubernetes.io~csi/alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["direct_io","allow_other","entry_timeout=36000","attr_timeout=36000","max_readahead=0"]}},"access_mode":{"mode":5}},"volume_context":{"alluxioPath":"/","csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"app","csi.storage.k8s.io/pod.namespace":"lxy","csi.storage.k8s.io/pod.uid":"b441d709-86cc-4a67-b1b0-4e1568cff0a2","csi.storage.k8s.io/serviceAccount.name":"default","javaOptions":"-Dalluxio.user.metadata.cache.enabled=true","mountInPod":"false","storage.kubernetes.io/csiProvisionerIdentity":"1721358293111-8081-alluxio"},"volume_id":"alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f"}
I0719 03:21:57.773041       7 nodeserver.go:61] Mount Alluxio to target path (pod volume path) with AlluxioFuse in CSI node server.
I0719 03:21:57.775598       7 main.go:92] Signal received: child exited
I0719 03:21:57.775777       7 nodeserver.go:108] /opt/alluxio/integration/fuse/bin/alluxio-fuse mount -o direct_io,allow_other,entry_timeout=36000,attr_timeout=36000,max_readahead=0 /var/lib/kubelet/pods/b441d709-86cc-4a67-b1b0-4e1568cff0a2/volumes/kubernetes.io~csi/alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f/mount /
I0719 03:21:59.859522       7 nodeserver.go:110] Path /var/lib/kubelet/pods/b441d709-86cc-4a67-b1b0-4e1568cff0a2/volumes/kubernetes.io~csi/alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f/mount is not mounted
Starting AlluxioFuse process: mounting alluxio path "/" to local mount point "/var/lib/kubelet/pods/b441d709-86cc-4a67-b1b0-4e1568cff0a2/volumes/kubernetes.io~csi/alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f/mount"
Successfully mounted Alluxio to /var/lib/kubelet/pods/b441d709-86cc-4a67-b1b0-4e1568cff0a2/volumes/kubernetes.io~csi/alluxio-fdaef77f-abd6-4e46-98e8-0955f3433e6f/mount.
See /opt/alluxio-2.9.5/logs/fuse.log for logging messages

After touching the test.log file, the alluxio-master logs:


2024-07-19 03:27:24,855 WARN  [Persist-Checker-12](DefaultFileSystemMaster.java:4977) - Unexpected exception encountered when trying to complete persistence of a file /test.log (id=469762047) : java.nio.file.FileSystemException: /data/.alluxio_ufs_persistence/test.log.alluxio.1721359642828.9f3e63dc-8d55-4e0a-9bef-8d265f100e2c.tmp: Operation not permitted

Looks like a permissions issue? @jasondrogba

jasondrogba commented 4 months ago

Yes, I also think it’s a permission issue.

wkqchen20 commented 4 months ago

Do you have any idea about this?

jasondrogba commented 4 months ago

Try setting permissions on your UFS. It seems that root permissions are required to write.

root@zsy-test-001:/data# ls -l 
total 1
drwxr-xr-x 1 root root 24 Jul 15 10:00 default_tests_files
-rw-r--r-- 1 root root  0 Jul 16 02:31 test.txt
root@zsy-test-001:/data# echo 'test' >> test.txt
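
To compare the user a process runs as against the ownership of the UFS directory, a small probe like the one below may help. It is only a sketch: check_ufs_writable is a hypothetical helper, not part of Alluxio, and it assumes it is run inside the container that has the UFS mounted (at /data in the config above). It only tests plain create and append, so a passing probe does not rule out other failures such as an ownership change being refused.

```shell
#!/bin/sh
# Hypothetical helper (not an Alluxio tool): print the current uid/gid,
# show the numeric owner of a directory, and try to create and append
# to a throwaway file inside it.
check_ufs_writable() {
  dir="$1"
  echo "running as uid=$(id -u) gid=$(id -g)"
  # -n prints numeric owner/group so they can be compared with the ids above
  ls -ldn "$dir"
  probe="$dir/.write_probe.$$"
  if touch "$probe" 2>/dev/null && echo probe >> "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "writable: $dir"
    return 0
  fi
  rm -f "$probe" 2>/dev/null
  echo "NOT writable: $dir"
  return 1
}

# Example call, assuming the UFS is mounted at /data in this container:
# check_ufs_writable /data
```

If the uid printed on the first line differs from the numeric owner shown by ls, that mismatch is a reasonable first thing to look at.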

wkqchen20 commented 4 months ago

I have changed the master/worker YAML settings, and everything works fine.

...
securityContext:
  runAsUser: 0
  runAsGroup: 0
  fsGroup: 0
...

Thanks a lot :) @jasondrogba