kadalu / kadalu

A lightweight Persistent storage solution for Kubernetes / OpenShift / Nomad using GlusterFS in background. More information at https://kadalu.tech
https://docs.kadalu.tech/k8s-storage/devel/quick-start/

[Need Help]: Expect shared path for different pods across machine/nodes #1029

Closed liyuntao closed 9 months ago

liyuntao commented 11 months ago

Hey guys, I am trying to use kadalu with my existing external GlusterFS server.

---
apiVersion: kadalu-operator.storage/v1alpha1
kind: KadaluStorage
metadata:
  name: pv-ext
spec:
  type: External
  single_pv_per_pool: true             # Also tried false; that did not work either
  details:
    gluster_hosts:
      - my-gluster-ip        
    gluster_volname: app             
    gluster_options: log-level=DEBUG   

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv-ext
spec:
  storageClassName: kadalu.pv-ext
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 200Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-ext
  labels:
    app: sample-app
spec:
  containers:
  - name: sample-app
    image: docker.io/kadalu/sample-pv-check-app:latest
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: "/mnt/pv"
      name: csivol
  volumes:
  - name: csivol
    persistentVolumeClaim:
      claimName: pv-ext

The YAML is basically the same as the one in /example and is quite simple. The PVC is applied in two different namespaces; call them ns-a and ns-b.

But when I deploy two pods that claim the PVC in the same namespace, the files each pod writes under the volume path are not visible to the other pod, although each pod's own files survive the pod being deleted and re-created. Here is what I tried:

  1. The mount point is the same in both pods, e.g. '/mnt/pv'.
  2. Create podA in ns-a and run: kubectl exec -i podA -- touch /mnt/pv/a.txt
  3. Create podB in ns-b; a.txt is not found under '/mnt/pv'.
  4. Create b.txt in podB; that file is likewise not visible inside podA. The two pods appear to be isolated under two different paths inside the actual GlusterFS volume.
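
The steps above can be sketched as a small Python check that drives kubectl. This is only an illustration: the helper names `kubectl_exec`, `run` and `is_shared` are hypothetical, and the namespaces, pod names and mount path are the ones used in this post.

```python
import subprocess
from typing import Callable, List, Tuple


def kubectl_exec(namespace: str, pod: str, *args: str) -> List[str]:
    """Build a `kubectl exec` command line that runs `args` inside `pod`."""
    return ["kubectl", "exec", "-n", namespace, pod, "--", *args]


def run(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output without raising on failure."""
    return subprocess.run(cmd, capture_output=True, text=True)


def is_shared(
    runner: Callable[[List[str]], subprocess.CompletedProcess] = run,
    mount: str = "/mnt/pv",
    writer: Tuple[str, str] = ("ns-a", "podA"),
    reader: Tuple[str, str] = ("ns-b", "podB"),
    name: str = "a.txt",
) -> bool:
    """Create a file from the writer pod and report whether the reader pod sees it."""
    path = f"{mount}/{name}"
    wrote = runner(kubectl_exec(*writer, "touch", path))
    if wrote.returncode != 0:
        raise RuntimeError(f"could not create {path}: {wrote.stderr.strip()}")
    # `ls` exits non-zero when the file does not exist in the reader pod.
    seen = runner(kubectl_exec(*reader, "ls", path))
    return seen.returncode == 0
```

Calling `is_shared()` against a cluster where both pods are running would return False in the situation described here, and True once both pods really share the same directory.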

Can anybody help me with it?

liyuntao commented 11 months ago

Maybe the key is the auto-generated (dynamically provisioned?) path field in the PV. The PV bound to my podA has path: subvol/0c/43/pvc-6f67f3e8-6667-48b0-bdaf-cb5d8d1e21b2, whereas what I want is the parent (root) path of the GlusterFS volume.

I also tried another approach: manually creating a PV and a PVC, and using volumeName to bind the PVC to that specific PV. But only one PVC can be bound to a PV at a time, so this still doesn't fit my use case.

Found a clue: which files are visible (at the same mount path in different containers) depends on the node the pod runs on.

liyuntao commented 11 months ago

Here are the logs from the kadalu-csi-provisioner pod.

I1112 08:39:39.634455       1 controller.go:1453] delete "pvc-196cd4ab-a078-48ee-8a75-5fa5248d3be1": started
I1112 08:39:39.650111       1 controller.go:1468] delete "pvc-196cd4ab-a078-48ee-8a75-5fa5248d3be1": volume deleted
I1112 08:39:39.655746       1 controller.go:1518] delete "pvc-196cd4ab-a078-48ee-8a75-5fa5248d3be1": persistentvolume deleted
E1112 08:39:39.655784       1 controller.go:1521] couldn't create key for object pvc-196cd4ab-a078-48ee-8a75-5fa5248d3be1: object has no meta: object does not implement the Object interfaces
I1112 08:39:39.655798       1 controller.go:1523] delete "pvc-196cd4ab-a078-48ee-8a75-5fa5248d3be1": succeeded
I1112 08:53:51.876749       1 controller.go:1453] delete "pvc-e6e6004a-2c49-4415-8d3f-28d9f52d34fb": started
I1112 08:53:51.891759       1 controller.go:1468] delete "pvc-e6e6004a-2c49-4415-8d3f-28d9f52d34fb": volume deleted
I1112 08:53:51.897920       1 controller.go:1518] delete "pvc-e6e6004a-2c49-4415-8d3f-28d9f52d34fb": persistentvolume deleted
E1112 08:53:51.897939       1 controller.go:1521] couldn't create key for object pvc-e6e6004a-2c49-4415-8d3f-28d9f52d34fb: object has no meta: object does not implement the Object interfaces
I1112 08:53:51.897949       1 controller.go:1523] delete "pvc-e6e6004a-2c49-4415-8d3f-28d9f52d34fb": succeeded
I1112 08:53:57.469343       1 controller.go:1453] delete "pvc-78d19013-40a6-48d0-8c58-fc0e52cf564b": started
I1112 08:53:57.483747       1 controller.go:1468] delete "pvc-78d19013-40a6-48d0-8c58-fc0e52cf564b": volume deleted
I1112 08:53:57.489569       1 controller.go:1518] delete "pvc-78d19013-40a6-48d0-8c58-fc0e52cf564b": persistentvolume deleted
E1112 08:53:57.489587       1 controller.go:1521] couldn't create key for object pvc-78d19013-40a6-48d0-8c58-fc0e52cf564b: object has no meta: object does not implement the Object interfaces
I1112 08:53:57.489597       1 controller.go:1523] delete "pvc-78d19013-40a6-48d0-8c58-fc0e52cf564b": succeeded
I1112 08:54:02.546704       1 controller.go:1453] delete "pvc-43028eee-66ef-4080-8906-2c792439e375": started
I1112 08:54:02.561446       1 controller.go:1468] delete "pvc-43028eee-66ef-4080-8906-2c792439e375": volume deleted
I1112 08:54:02.567914       1 controller.go:1518] delete "pvc-43028eee-66ef-4080-8906-2c792439e375": persistentvolume deleted
E1112 08:54:02.567934       1 controller.go:1521] couldn't create key for object pvc-43028eee-66ef-4080-8906-2c792439e375: object has no meta: object does not implement the Object interfaces
I1112 08:54:02.567942       1 controller.go:1523] delete "pvc-43028eee-66ef-4080-8906-2c792439e375": succeeded
I1112 08:56:39.577861       1 controller.go:1317] provision "default/gfs-app-claim" class "kadalu.pv-gfs-app": started
I1112 08:56:39.578008       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"gfs-app-claim", UID:"f295ef04-fc81-467d-9bab-eb078aba9d7e", APIVersion:"v1", ResourceVersion:"1958065", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/gfs-app-claim"
I1112 08:57:33.650807       1 controller.go:1420] provision "default/gfs-app-claim" class "kadalu.pv-gfs-app": volume "pvc-f295ef04-fc81-467d-9bab-eb078aba9d7e" provisioned
I1112 08:57:33.650844       1 controller.go:1437] provision "default/gfs-app-claim" class "kadalu.pv-gfs-app": succeeded
E1112 08:57:33.655312       1 controller.go:1443] couldn't create key for object pvc-f295ef04-fc81-467d-9bab-eb078aba9d7e: object has no meta: object does not implement the Object interfaces
I1112 08:57:33.655336       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"gfs-app-claim", UID:"f295ef04-fc81-467d-9bab-eb078aba9d7e", APIVersion:"v1", ResourceVersion:"1958065", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-f295ef04-fc81-467d-9bab-eb078aba9d7e

liyuntao commented 11 months ago

I just found out I was using the wrong Gluster volume name (one that did not exist yet).

After cleaning up and switching to the correct volume name, everything works normally.

leelavg commented 11 months ago

After cleaning up and switching to the correct volume name, everything works normally.

  • Can the issue be closed then?

liyuntao commented 11 months ago

There is still a pitfall: if an invalid gluster_volname is entered in the YAML by mistake, neither the pod itself nor the service owner notices the error.

Could this be a bug?

leelavg commented 11 months ago

ack, that can be a bug.

liyuntao commented 11 months ago

Hi @leelavg

After some investigation and local debugging, the problem appears to be in csi/volumeutils.py, in the call path handle_external_volume > mount_glusterfs_with_host.

When a user mounts an external GlusterFS volume and the actual mount command returns a non-zero exit code, the current code path may still return a non-empty mount value (taken from the function arguments) to the caller, which breaks error propagation. Would you like to review this commit?
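
To illustrate the kind of error propagation meant here, below is a minimal Python sketch. It is not the kadalu code: `mount_or_raise` and `MountError` are hypothetical names, and the command to run is passed in by the caller rather than built the way csi/volumeutils.py builds it. The only point is that the mount path is returned solely when the command exits with code 0.

```python
import subprocess
import sys
from typing import List


class MountError(RuntimeError):
    """Raised when the mount command exits with a non-zero status."""


def mount_or_raise(cmd: List[str], mountpoint: str) -> str:
    """Run `cmd` and return `mountpoint` only if the command succeeded.

    On a non-zero exit code the failure is raised to the caller instead of
    handing back the mountpoint that was passed in as an argument.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise MountError(
            f"mount at {mountpoint} failed with exit code {proc.returncode}: "
            f"{proc.stderr.strip()}"
        )
    return mountpoint
```

With this shape, a caller that receives a path knows the mount succeeded, and a wrong volume name would surface as an exception instead of a silently unusable mount. A failing mount can be simulated without GlusterFS by passing a command such as `[sys.executable, "-c", "import sys; sys.exit(1)"]`.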