cvmfs-contrib / cvmfs-csi

CSI driver for CernVM-FS
Apache License 2.0

Do I need or can I use a squid cache proxy for the cvmfs-csi? #49

Open johnstile opened 1 year ago

johnstile commented 1 year ago

Do I need or can I use a squid cache proxy for the cvmfs-csi?

I do not see any mention of a squid proxy for this cvmfs solution.

In the docs I see a mention of an "alien cache volume", but I do not know what that is.

gman0 commented 1 year ago

@johnstile alien (or external) cache is a shared scratch space for CVMFS clients to store the pulled data. It is meant as a drop-in replacement for the regular local cache, which is always used by a single client only. An alien cache, on the other hand, can be used by multiple clients at the same time -- this requires the cache volume to be backed by some shared filesystem though, because of concurrent writes from possibly different nodes. You can check https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache for more information.

You can use squid and alien caches in conjunction with this too, AFAIK.
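
For reference, the relevant client settings (per the page above) boil down to roughly the following in /etc/cvmfs/default.local -- the cache path here is just an example:

CVMFS_ALIEN_CACHE=/cvmfs-aliencache
CVMFS_QUOTA_LIMIT=-1      # no automatic quota management with an alien cache
CVMFS_SHARED_CACHE=no     # required together with CVMFS_ALIEN_CACHE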

johnstile commented 1 year ago

I'm trying to enable the alien cache, but this change in values.yaml has no effect:

cache:
  alien:
    enable: true

I am expecting to find 'mountPath: /cvmfs-aliencache'

kubectl get pods cvmfs-csi-nodeplugin-nvklg -o yaml |grep mountPath
    - mountPath: /csi
    - mountPath: /registration
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    - mountPath: /var/lib/kubelet/plugins/cvmfs.csi.cern.ch
    - mountPath: /var/lib/kubelet/pods
    - mountPath: /sys
    - mountPath: /lib/modules
    - mountPath: /dev
    - mountPath: /cvmfs-localcache
    - mountPath: /etc/cvmfs/default.local
    - mountPath: /etc/cvmfs/config.d
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
gman0 commented 1 year ago

Hmm, works for me:

helm install cvmfs cern/cvmfs-csi --set-json cache='{"alien": {"enabled": true, "volumeSpec": {"persistentVolumeClaim": {"claimName": "new-cephfs-share-pvc"}}}}'
$ kubectl get pod/cvmfs-cvmfs-csi-nodeplugin-t6lhj -o yaml | grep mountPath
    - mountPath: /csi
    - mountPath: /registration
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    - mountPath: /var/lib/kubelet/plugins/cvmfs.csi.cern.ch
    - mountPath: /var/lib/kubelet/pods
    - mountPath: /sys
    - mountPath: /lib/modules
    - mountPath: /dev
    - mountPath: /cvmfs-localcache
    - mountPath: /cvmfs-aliencache
    - mountPath: /etc/cvmfs/default.local
    - mountPath: /etc/cvmfs/config.d
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount

And the alien cache volume is being populated with data after running the examples:

$ kubectl exec -it cvmfs-cvmfs-csi-nodeplugin-t6lhj -c nodeplugin -- ls /cvmfs-aliencache
00  42  84  c6
01  43  85  c7
02  44  86  c8
03  45  87  c9
04  46  88  ca
05  47  89  cb
06  48  8a  cc
07  49  8b  cd
...
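
For completeness, the same settings expressed in values.yaml (this should be equivalent to the --set-json flag above):

cache:
  alien:
    enabled: true
    volumeSpec:
      persistentVolumeClaim:
        claimName: new-cephfs-share-pvc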
gman0 commented 1 year ago

I think you have a typo in the key name, it should be "enabled", not "enable".
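
I.e. the snippet should read:

cache:
  alien:
    enabled: true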

johnstile commented 1 year ago

Thank you! That helped me to the next issue.

With the alien cache enabled, I had to create a PVC for the nodeplugin to start, but mounts fail.

This is the pvc I added: cvmfs-csi/templates/nodeplugin-alien-cache.pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cvmfs-alien-cache
spec:
  storageClassName: ceph-cephfs-external
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

The alien cache volume is being populated:

kubectl exec -it cvmfs-csi-nodeplugin-ht6tn -c nodeplugin -- ls /cvmfs-aliencache/
00  0f  1e  2d  3c  4b  5a  69  78  87  96  a5  b4  c3  d2  e1  f0  ff
01  10  1f  2e  3d  4c  5b  6a  79  88  97  a6  b5  c4  d3  e2  f1  quarantaine
02  11  20  2f  3e  4d  5c  6b  7a  89  98  a7  b6  c5  d4  e3  f2  txn
...

In the nodeplugin pod I see 3 directories:

[root@cvmfs-csi-nodeplugin-ht6tn /]# ls -tlr / |grep cvmfs
-rwxr-xr-x    1 root root 9252864 Oct 14 09:01 csi-cvmfsplugin
drwxrwxrwx  260 root root     259 Nov 23 00:48 cvmfs-aliencache
drwxrwxrwx    6 root root     117 Nov 23 01:16 cvmfs-localcache
drwxr-xr-x    2 root root       0 Nov 23 20:58 cvmfs

Contents of the other two:

cvmfs-localcache contains:

total 4
drwxrwxrwx  6 root  root  117 Nov 23 01:16 ./
drwxr-xr-x 21 root  root 4096 Nov 23 00:48 ../
drwxr-xr-x  2 cvmfs root   18 Nov 23 20:59 alice.cern.ch/
drwxr-xr-x  2 cvmfs root   18 Nov 23 02:21 cernvm-prod.cern.ch/
drwxr-xr-x  2 cvmfs root   18 Nov 23 20:59 cvmfs-config.cern.ch/
drwxr-xr-x  2 cvmfs root   18 Nov 23 01:15 icecube.opensciencegrid.org/

[root@cvmfs-csi-nodeplugin-ht6tn /]# ls -alF cvmfs
total 4
drwxr-xr-x  2 root root    0 Nov 23 21:01 ./
drwxr-xr-x 21 root root 4096 Nov 23 00:48 ../

I think I should be able to trigger the mount with:

ls /cvmfs/alice.cern.ch/

gman0 commented 1 year ago

With the alien cache enabled, I had to create a PVC for the nodeplugin to start, but mounts fail.

Yes, the cluster admin is expected to provide a reference to an existing PVC (or anything that's accepted in Pod.spec.volumes).
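
For example, any Pod volume source should work in volumeSpec, not just a PVC; an NFS-backed sketch (server and path are made up):

cache:
  alien:
    enabled: true
    volumeSpec:
      nfs:
        server: nfs.example.org
        path: /export/cvmfs-aliencache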

but mounts fail.

Is this when running ls /cvmfs/alice.cern.ch/? Any error message from that?

johnstile commented 1 year ago

I enabled the alien cache, tailed the cvmfs pod logs, and tried to ls /cvmfs/alice.cern.ch/

kubectl logs -f -l app=cvmfs-csi --all-containers  --max-log-requests 999

Before running ls

I1124 15:45:09.037651       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:14.055464       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:41:27.885062       1 grpcserver.go:143] Call-ID 2: Response: {"name":"cvmfs.csi.cern.ch","vendor_version":"v2.0.0"}
I1124 15:41:27.886153       1 grpcserver.go:136] Call-ID 3: Call: /csi.v1.Identity/GetPluginCapabilities
I1124 15:41:27.886201       1 grpcserver.go:137] Call-ID 3: Request: {}
I1124 15:41:27.886500       1 grpcserver.go:143] Call-ID 3: Response: {"capabilities":[{"Type":{"Service":{}}},{"Type":{"Service":{"type":1}}}]}
I1124 15:41:27.888109       1 grpcserver.go:136] Call-ID 4: Call: /csi.v1.Controller/ControllerGetCapabilities
I1124 15:41:27.888208       1 grpcserver.go:137] Call-ID 4: Request: {}
I1124 15:41:27.888483       1 grpcserver.go:143] Call-ID 4: Response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1124 15:41:28.124020       1 grpcserver.go:136] Call-ID 5: Call: /csi.v1.Controller/CreateVolume
I1124 15:41:28.124396       1 grpcserver.go:137] Call-ID 5: Request: {"capacity_range":{"required_bytes":1},"name":"pvc-0ca52b90-cf49-4afe-aebb-7577d7f217e9","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":3}}]}
I1124 15:41:28.124497       1 grpcserver.go:143] Call-ID 5: Response: {"volume":{"volume_id":"pvc-0ca52b90-cf49-4afe-aebb-7577d7f217e9"}}
I1124 15:41:53.844238 3206176 connection.go:184] GRPC request: {}
         the device is found by lsof(8) or fuser(1))
E1124 15:43:33.730684 3206376 grpcserver.go:141] Call-ID 7: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount: exit status 32
I1124 15:44:03.078824 3206376 grpcserver.go:136] Call-ID 8: Call: /csi.v1.Node/NodeUnpublishVolume
I1124 15:44:03.078951 3206376 grpcserver.go:137] Call-ID 8: Request: {"target_path":"/var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount","volume_id":"pvc-39c54dea-85c6-4272-8c28-6418dcf699d0"}
I1124 15:41:53.856768 3206176 connection.go:186] GRPC response: {"name":"cvmfs.csi.cern.ch","vendor_version":"v2.0.0"}
I1124 15:41:53.856885 3206176 connection.go:187] GRPC error: <nil>
I1124 15:44:03.079050 3206376 mountutil.go:99] Exec-ID 9: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount]
I1124 15:44:03.098169 3206376 mountutil.go:99] Exec-ID 9: Process exited: exit status 32
E1124 15:44:03.098195 3206376 mountutil.go:99] Exec-ID 9: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1124 15:44:03.098240 3206376 grpcserver.go:141] Call-ID 8: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount: exit status 32
I1124 15:41:53.856895 3206176 main.go:208] CSI driver name: "cvmfs.csi.cern.ch"
I1124 15:41:53.856925 3206176 node_register.go:53] Starting Registration Server at: /registration/cvmfs.csi.cern.ch-reg.sock
I1124 15:41:53.857152 3206176 node_register.go:62] Registration Server started at: /registration/cvmfs.csi.cern.ch-reg.sock
I1124 15:41:53.857238 3206176 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I1124 15:41:54.452320 3206176 main.go:102] Received GetInfo call: &InfoRequest{}
I1124 15:41:54.452709 3206176 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/cvmfs.csi.cern.ch/registration"
I1124 15:41:54.491700 3206176 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
I1124 15:45:19.094428       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:20.152833 3206376 grpcserver.go:136] Call-ID 9: Call: /csi.v1.Node/NodeUnpublishVolume
I1124 15:45:20.152917 3206376 grpcserver.go:137] Call-ID 9: Request: {"target_path":"/var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount","volume_id":"pvc-6697db80-d988-45ba-8dd6-351b505919d7"}
I1124 15:45:20.153001 3206376 mountutil.go:99] Exec-ID 10: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount]
I1124 15:45:20.160797 3206376 mountutil.go:99] Exec-ID 10: Process exited: exit status 32
E1124 15:45:20.160819 3206376 mountutil.go:99] Exec-ID 10: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1124 15:45:20.160851 3206376 grpcserver.go:141] Call-ID 9: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount: exit status 32

I1124 15:45:24.111538       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:29.134879       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:34.154705       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:35.797265 3206376 grpcserver.go:136] Call-ID 10: Call: /csi.v1.Node/NodeUnpublishVolume
I1124 15:45:35.797353 3206376 grpcserver.go:137] Call-ID 10: Request: {"target_path":"/var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount","volume_id":"pvc-cd134705-ad9a-476b-9478-e11422947e1d"}
I1124 15:45:35.797430 3206376 mountutil.go:99] Exec-ID 11: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount]
I1124 15:45:35.805216 3206376 mountutil.go:99] Exec-ID 11: Process exited: exit status 32
E1124 15:45:35.805234 3206376 mountutil.go:99] Exec-ID 11: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1124 15:45:35.805257 3206376 grpcserver.go:141] Call-ID 10: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount: exit status 32
I1124 15:45:39.177028       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:44.201155       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:45:49.224843       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch

Ran ls

[root@cvmfs-csi-nodeplugin-xgkqh /]# ls /cvmfs/alice.cern.ch/
ls: cannot access /cvmfs/alice.cern.ch/: No such file or directory

Log

I1124 15:47:29.630304       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:34.648075       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:37.865017 3206376 grpcserver.go:136] Call-ID 13: Call: /csi.v1.Node/NodeUnpublishVolume
I1124 15:47:37.865099 3206376 grpcserver.go:137] Call-ID 13: Request: {"target_path":"/var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount","volume_id":"pvc-cd134705-ad9a-476b-9478-e11422947e1d"}
I1124 15:47:37.865194 3206376 mountutil.go:99] Exec-ID 14: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount]
I1124 15:47:37.873628 3206376 mountutil.go:99] Exec-ID 14: Process exited: exit status 32
E1124 15:47:37.874236 3206376 mountutil.go:99] Exec-ID 14: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1124 15:47:37.874305 3206376 grpcserver.go:141] Call-ID 13: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/733edd41-8f83-4ffd-a82b-d7ce2a7a9053/volumes/kubernetes.io~csi/pvc-cd134705-ad9a-476b-9478-e11422947e1d/mount: exit status 32
I1124 15:47:39.665927       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:44.685175       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:49.709244       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:54.726785       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:47:59.746494       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:48:04.769786       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:48:07.245416 3206376 grpcserver.go:136] Call-ID 14: Call: /csi.v1.Node/NodeUnpublishVolume
I1124 15:48:07.245518 3206376 grpcserver.go:137] Call-ID 14: Request: {"target_path":"/var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount","volume_id":"pvc-39c54dea-85c6-4272-8c28-6418dcf699d0"}
I1124 15:48:07.245629 3206376 mountutil.go:99] Exec-ID 15: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount]
I1124 15:48:07.253174 3206376 mountutil.go:99] Exec-ID 15: Process exited: exit status 32
E1124 15:48:07.253201 3206376 mountutil.go:99] Exec-ID 15: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1124 15:48:07.253229 3206376 grpcserver.go:141] Call-ID 14: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/16b25ad8-fcf8-48ce-8016-5a33a5e64048/volumes/kubernetes.io~csi/pvc-39c54dea-85c6-4272-8c28-6418dcf699d0/mount: exit status 32
I1124 15:48:09.789263       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:48:14.812620       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:48:19.831568       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1124 15:48:24.846976       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
johnstile commented 1 year ago

The configs

[root@cvmfs-csi-nodeplugin-25crr /]# egrep -v '^#|^$' /etc/cvmfs/default.conf 
CVMFS_CLIENT_PROFILE=
CVMFS_CACHE_BASE=/var/lib/cvmfs
CVMFS_QUOTA_LIMIT=4000
CVMFS_TIMEOUT=5
CVMFS_TIMEOUT_DIRECT=10
CVMFS_STRICT_MOUNT=no
CVMFS_NFILES=131072
CVMFS_SHARED_CACHE=yes
CVMFS_CHECK_PERMISSIONS=yes
CVMFS_CLAIM_OWNERSHIP=yes
CVMFS_KEYS_DIR=/etc/cvmfs/keys
CVMFS_PROXY_RESET_AFTER=300
CVMFS_HOST_RESET_AFTER=1800
CVMFS_MAX_RETRIES=1
CVMFS_BACKOFF_INIT=2
CVMFS_BACKOFF_MAX=10
CVMFS_SEND_INFO_HEADER=no
CVMFS_USE_GEOAPI=no
CVMFS_LOW_SPEED_LIMIT=1024
if [ "x$CVMFS_BASE_ENV" = "x" ]; then
  readonly CVMFS_USER=cvmfs
  readonly CVMFS_MOUNT_DIR=/cvmfs
  readonly CVMFS_RELOAD_SOCKETS=/var/run/cvmfs
  readonly CVMFS_BASE_ENV=1
fi

egrep -v '^#|^$' /etc/cvmfs/default.local
CVMFS_USE_GEOAPI=yes
CVMFS_REPOSITORIES=alice.cern.ch
CVMFS_QUOTA_LIMIT=49600
CVMFS_CACHE_BASE=/cvmfs-localcache
CVMFS_NFS_SOURCE=yes
CVMFS_MEMCACHE_SIZE=1024
CVMFS_HIDE_MAGIC_XATTRS=yes
CVMFS_ALIEN_CACHE=/cvmfs-aliencache
CVMFS_QUOTA_LIMIT=-1
CVMFS_SHARED_CACHE=no

I changed CVMFS_QUOTA_LIMIT to -1 because "The CVMFS_ALIEN_CACHE requires CVMFS_QUOTA_LIMIT=-1 and CVMFS_SHARED_CACHE=no", as stated here: https://cvmfs.readthedocs.io/en/stable/cpt-configure.html

johnstile commented 1 year ago

The following is a log for ls /cvmfs/alice.cern.ch with the alien cache turned off, which succeeds.

I1128 16:55:48.510865       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1128 16:55:50.396634       1 reflector.go:536] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:845: Watch close - *v1.PersistentVolume total 3 items received
I1128 16:55:53.529217       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1128 16:55:54.289490       1 reflector.go:382] k8s.io/client-go/informers/factory.go:134: forcing resync
I1128 16:55:54.393899       1 reflector.go:382] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:845: forcing resync
I1128 16:55:55.887486 1677449 grpcserver.go:136] Call-ID 24: Call: /csi.v1.Node/NodeUnpublishVolume
I1128 16:55:55.887573 1677449 grpcserver.go:137] Call-ID 24: Request: {"target_path":"/var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount","volume_id":"pvc-6697db80-d988-45ba-8dd6-351b505919d7"}
I1128 16:55:55.887663 1677449 mountutil.go:99] Exec-ID 26: Running command env=[] prog=/usr/bin/umount args=[umount --recursive /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount]
I1128 16:55:55.894875 1677449 mountutil.go:99] Exec-ID 26: Process exited: exit status 32
E1128 16:55:55.894895 1677449 mountutil.go:99] Exec-ID 26: Error: exit status 32; Output: umount: /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
E1128 16:55:55.894921 1677449 grpcserver.go:141] Call-ID 24: Error: rpc error: code = Internal desc = failed to unmount /var/lib/kubelet/pods/91de2bd5-27c8-43d2-8602-8cccf009aac6/volumes/kubernetes.io~csi/pvc-6697db80-d988-45ba-8dd6-351b505919d7/mount: exit status 32
I1128 16:55:58.548414       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
I1128 16:56:03.566935       1 leaderelection.go:278] successfully renewed lease cvmfs-csi-cern-ch
johnstile commented 1 year ago

With a test pod, I verified that data in the cvmfs-alien-cache PVC persists: I touched a file, took the pod down, created a new pod, and the file was still there (pod spec and commands below).

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-pod
      image: registry.cern.ch/magnum/cvmfs-csi:v2.0.0
      imagePullPolicy: IfNotPresent
      command: [ "/bin/sh", "-c", "trap : TERM INT; (while true; do sleep 1000; done) & wait" ]
      volumeMounts:
      - mountPath: /test 
        name: alien-cache
  volumes:
    - name: alien-cache
      persistentVolumeClaim:
        claimName: cvmfs-alien-cache
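
The persistence check itself was roughly (file and manifest names are arbitrary):

kubectl exec test-pod -- touch /test/persist-check
kubectl delete pod test-pod
kubectl apply -f test-pod.yaml
kubectl exec test-pod -- ls -l /test/persist-check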
johnstile commented 1 year ago

@gman0

In the nodeplugin container, running mount manually shows an error.

[root@cvmfs-csi-nodeplugin-5nbvg /]# mount -v -t cvmfs alice.cern.ch /tmp/foo

Debug: using library /usr/lib64/libcvmfs_fuse_stub.so
CernVM-FS: running with credentials 999:997
CernVM-FS: running in debug mode
CernVM-FS: loading Fuse module... (cvmfs) Parsing config file /etc/cvmfs/default.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) execve'd /bin/sh (PID: 269282)    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/default.d/50-cern.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) execve'd /bin/sh (PID: 269284)    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/default.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) configuration repository directory does not exist: /cvmfs/cvmfs-config.cern.ch/etc/cvmfs    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/default.local    [11-28-2022 19:51:23 UTC]
(cvmfs) execve'd /bin/sh (PID: 269287)    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/cern.ch.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) configuration repository directory does not exist: /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/domain.d/cern.ch.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) execve'd /bin/sh (PID: 269290)    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/domain.d/cern.ch.local    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/config.d/alice.cern.ch.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) configuration repository directory does not exist: /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/config.d    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/config.d/alice.cern.ch.conf    [11-28-2022 19:51:23 UTC]
(cvmfs) execve'd /bin/sh (PID: 269293)    [11-28-2022 19:51:23 UTC]
(cvmfs) Parsing config file /etc/cvmfs/config.d/alice.cern.ch.local    [11-28-2022 19:51:23 UTC]
Cache directory and workspace must be identical for NFS export (11 - NFS maps init failure)

When CVMFS_NFS_SOURCE=yes, I need the same path in both CVMFS_ALIEN_CACHE and CVMFS_WORKSPACE.

I made this change to cvmfs-csi/values.yaml in the alien-cache conditional section:

        {{- if .Values.cache.alien.enabled }}
        CVMFS_ALIEN_CACHE=/cvmfs-aliencache
        # When alien cache is used, CVMFS does not control the size of the cache.
        CVMFS_QUOTA_LIMIT=-1
        # Whether repositories should share a cache directory or each have their own.
        CVMFS_SHARED_CACHE=no
        # Cache directory and workspace must be identical for NFS export
        CVMFS_WORKSPACE=/cvmfs-aliencache
        {{- end -}}

Once CVMFS_WORKSPACE was set, the mount works.
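
I.e. the same manual mount as above now goes through:

mount -v -t cvmfs alice.cern.ch /tmp/foo
ls /tmp/foo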

gman0 commented 1 year ago

Thanks for looking into this, @johnstile! Are you using the squid cache you mentioned earlier?

johnstile commented 1 year ago

@gman0

I have not reached the squid proxy yet (I guess that is next).

I am not sure I need a squid proxy if the alien cache is working (and now it is).

Currently, I have set CVMFS_HTTP_PROXY=DIRECT.

What do you think?

gman0 commented 1 year ago

This really depends on your use case. The alien cache is "append-only"; with squid you have a lot more control over how the data is stored.

Since the alien cache is unmanaged, there is no automatic quota management provided by CernVM-FS; the alien cache directory is ever-growing. The CVMFS_ALIEN_CACHE requires CVMFS_QUOTA_LIMIT=-1 and CVMFS_SHARED_CACHE=no.

https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache
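
If you do put a squid in front, on the client side it is mostly a matter of pointing CVMFS_HTTP_PROXY at it instead of DIRECT, e.g. (host and port are made up; the trailing ;DIRECT is a fallback):

CVMFS_HTTP_PROXY="http://my-squid.example.org:3128;DIRECT"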

For CVMFS-specific questions, I think you'll have a better chance of getting qualified answers at https://cernvm-forum.cern.ch/