LangStream / langstream

LangStream. Event-Driven Developer Platform for Building and Running LLM AI Apps. Powered by Kubernetes and Kafka.
https://langstream.ai
Apache License 2.0

Provisioned disks for custom agent permission denied #741

Closed: Dobosz closed this issue 6 months ago

Dobosz commented 6 months ago

A custom agent running in a StatefulSet gets its volume mounted with mode 755 and owned by uid 0. Since the container runs as uid 10000, it has no write permission on the mounted disk. I assume this is not the intended behaviour.

Agent description:

  - name: "Google Drive Source"
    id: "google-drive-src"
    type: "python-source"
    configuration:
      className: "application.GoogleDriveFileLangChain"
      driveId: "[redacted]"
      idleTime: 60
      pageSize: 20
      environment:
        - key: "DRIVE_CREDENTIALS"
          value: "${secrets.drive.drive_credentials}"
    resources:
      disk:
        enabled: true
        size: 10M
        type: "google-drive-src"

Runtime

My runtime is a GCP GKE cluster, version 1.27.5-gke.200, running in Autopilot mode. The default storage class is as follows:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"default"},"parameters":{"type":"pd-standard"},"provisioner":"kubernetes.io/gce-pd","reclaimPolicy":"Delete","volumeBindingMode":"WaitForFirstConsumer"}
  creationTimestamp: "2023-12-08T14:55:37Z"
  name: default
  resourceVersion: "1854463"
  uid: 8706e67d-3cdd-414b-822c-eae6e265153a
parameters:
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

It's probably a GKE-specific issue, but I haven't tested it on any other runtime yet.

nicoloboschi commented 6 months ago

Probably related to https://stackoverflow.com/questions/46873796/allowing-access-to-a-persistentvolumeclaim-to-non-root-user

Dobosz commented 6 months ago

I can confirm that setting fsGroup in the pod's security context solves the issue. I can open a PR for this, but it's important to note that the value depends on the user id the runtime image runs as.
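
For reference, a minimal sketch of the kind of change meant here, shown as a fragment of the pod template inside the agent's StatefulSet. The value 10000 is an assumption taken from the user id reported above; the actual manifest generated by LangStream and the group id used by the runtime image may differ.

```yaml
# Fragment of a StatefulSet pod template (not a complete manifest).
spec:
  template:
    spec:
      securityContext:
        # Assumption: matches the id the runtime image runs as (10000 above).
        # Kubernetes makes supported volumes group-owned by this GID and
        # adds it to the container processes as a supplemental group.
        fsGroup: 10000
```

With fsGroup set, the kubelet changes the group ownership of the mounted volume when the pod starts, so a non-root process in that group can write to it.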

nicoloboschi commented 6 months ago

@Dobosz I saw you pushed the commit to your fork. Can you open a PR? The change LGTM, and I'd make the same fix.