awslabs / mountpoint-s3-csi-driver

Built on Mountpoint for Amazon S3, the Mountpoint CSI driver presents an Amazon S3 bucket as a storage volume accessible by containers in your Kubernetes cluster.
Apache License 2.0

Simplify caching configuration #141

Open jjkr opened 5 months ago

jjkr commented 5 months ago

/feature

Is your feature request related to a problem? Please describe.
Caching is supported today by adding a cache option to a persistent volume's configuration and passing in a directory on the node's filesystem. This works, but it comes with a couple of sharp edges. In particular, the directory is not created on the node automatically, so it has to be created manually ahead of time.
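For readers unfamiliar with the current setup, here is a minimal sketch of a statically provisioned persistent volume with caching enabled, modeled on the caching example in the driver repository. The volume name, bucket name, and region are placeholders, and the cache path is an assumption chosen to match the hostPath used later in this thread. The directory named by the cache option must already exist on the node.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv                        # placeholder name
spec:
  capacity:
    storage: 1200Gi                  # required by Kubernetes, ignored by the driver
  accessModes:
    - ReadWriteMany
  mountOptions:
    - region us-west-2               # placeholder region
    - cache /tmp/s3-pv1-cache        # cache directory on the node's filesystem
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-csi-driver-volume
    volumeAttributes:
      bucketName: my-example-bucket  # placeholder bucket name
```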

Describe the solution you'd like in detail
Caching configuration should be possible without manual changes to the nodes, and it should be easy to define different types of storage to use as a cache, such as a ramdisk.

Describe alternatives you've considered
One potential solution is to reference other persistent volumes or mounts as the cache, which could make for nice composability of the Kubernetes constructs.

Additional context
Mountpoint's documentation on caching: https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#caching-configuration

ggkr commented 3 months ago

We have the same issue and attempted to work around it by using an init container to create the cache directory on the node, as in the following example. (I didn't include the PV config here, but it was configured with the cache directory at /tmp/s3-cache.)

apiVersion: v1
kind: Pod
metadata:
  name: s3-app
spec:
  initContainers:
  - name: create-cache-dir
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "mkdir -p /cache-dir/s3-cache; echo 'hi' > /cache-dir/s3-cache/test.txt"]
    volumeMounts:
    - name: cache-location
      mountPath: /cache-dir
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "ls -lR /data; sleep 99"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: s3-pvc
  - name: cache-location
    hostPath:
      path: /tmp/

This example DOES NOT work, because Kubernetes attempts to mount the S3 volume before the init container even runs.

terrytsay commented 2 months ago

Based on the example here: https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/examples/kubernetes/static_provisioning/caching.yaml

I worked around this issue by using a hostPath volume with type DirectoryOrCreate, which creates the directory on the host if it does not already exist.

Regardless of the order of the volumeMounts or volumes entries, Kubernetes automatically retries until the volume is mounted. I still list the hostPath volume before the PVC in case the mounts happen in the order specified. In my testing, the pod comes up immediately.

apiVersion: v1
kind: Pod
metadata:
  name: s3-app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "echo 'Hello from the container!' >> /data/$(date -u).txt; tail -f /dev/null"]
      volumeMounts:
        - name: cache-location
          mountPath: /tmp/pv
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: cache-location
      hostPath:
        path: /tmp/s3-pv1-cache
        type: DirectoryOrCreate
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: s3-claim

gyf304 commented 1 month ago

I'm working around this using a Kubernetes Job.

apiVersion: batch/v1
kind: Job
metadata:
  name: s3-cache-create
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: busybox
        image: busybox
        command:
        - mkdir
        - "-p"
        - /host/var/tmp/s3-cache
        volumeMounts:
        - name: host-var-tmp
          mountPath: /host/var/tmp
      volumes:
      - name: host-var-tmp
        hostPath:
          path: /var/tmp
      restartPolicy: Never

One Job is needed per volume, and you should modify the path so that it is unique for each volume.
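As a sketch of what "unique per volume" might look like, a Job for a second volume could differ only in its name and directory. The names `s3-cache-create-pv2` and `s3-cache-pv2` are hypothetical, and the matching PV would need `cache /var/tmp/s3-cache-pv2` in its mount options. Note that a Job's pod is scheduled to a single node, so this only creates the directory on whichever node the pod lands on.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: s3-cache-create-pv2            # hypothetical: one Job per volume
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: busybox
        image: busybox
        command:
        - mkdir
        - "-p"
        - /host/var/tmp/s3-cache-pv2   # hypothetical: directory unique to this volume
        volumeMounts:
        - name: host-var-tmp
          mountPath: /host/var/tmp
      volumes:
      - name: host-var-tmp
        hostPath:
          path: /var/tmp
      restartPolicy: Never
```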

mcandeia commented 1 week ago

This worked for me:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: s3-cache-dir-setup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: s3-cache-dir-setup
  template:
    metadata:
      labels:
        app: s3-cache-dir-setup
    spec:
      initContainers:
        - name: create-s3-cache-dir
          image: busybox
          command:
            - sh
            - -c
            - |
              mkdir -p /tmp/s3-local-cache && \
              chmod 0700 /tmp/s3-local-cache
          securityContext:
            privileged: true
          volumeMounts:
            - name: host-mount
              mountPath: /tmp/s3-local-cache
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.1
      volumes:
        - name: host-mount
          hostPath:
            path: /tmp/s3-local-cache
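
A possible variant, combining this DaemonSet with the DirectoryOrCreate hostPath type used earlier in the thread, might avoid the privileged init container. This is an untested sketch, not a confirmed fix: the `/cache` mount path and the newer pause image tag are assumptions. Kubernetes creates the directory with 0755 permissions in this case, so the `chmod 0700` step above is lost. The pause container mounts the volume because a volume that no container references is not set up by the kubelet.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: s3-cache-dir-setup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: s3-cache-dir-setup
  template:
    metadata:
      labels:
        app: s3-cache-dir-setup
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # assumption: newer registry and tag
          volumeMounts:
            - name: host-mount
              mountPath: /cache              # assumption: any in-container path works
      volumes:
        - name: host-mount
          hostPath:
            path: /tmp/s3-local-cache
            type: DirectoryOrCreate          # created on each node if missing
```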