awslabs / mountpoint-s3-csi-driver

Built on Mountpoint for Amazon S3, the Mountpoint CSI driver presents an Amazon S3 bucket as a storage volume accessible by containers in your Kubernetes cluster.

s3 mounts, but ls shows all question-marks #218

Open dariusj1 opened 3 months ago

dariusj1 commented 3 months ago

Hello,

I'm not sure whether this is worth raising with AWS Support, because from the AWS services' perspective it should be working...

So, I installed the S3 CSI driver in my EKS cluster and created IAM roles as instructed in the README.md. Now, if I spin up a dummy pod in my kube-system namespace using my s3-csi-driver-sa, I can access my S3 buckets using awscli just fine. I can create objects, delete them, list them, etc. If I try to mount the same bucket using the s3-csi-driver, I get no errors! The PV and PVC are created just fine, and the pod referring to the PVC starts up too. However, if I try to ls -l the mounted directory, I see lots of question marks and a Permission denied.

[app_runner@cm-depl-856785559-c9xph tomcat]$ ls -l /hdump/
ls: cannot access /hdump/: Permission denied
[app_runner@cm-depl-856785559-c9xph tomcat]$ ls -l /
ls: cannot access /hdump: Permission denied
total 0
lrwxrwxrwx   1 root       root         7 May 29 18:09 bin -> usr/bin
dr-xr-xr-x   2 root       root         6 Apr  9  2019 boot
drwxr-xr-x   5 root       root       360 Jul  8 14:15 dev
drwxr-xr-x   1 app_runner app_runner  17 Jul  8 14:15 ep
drwxr-xr-x   1 root       root        41 Jul  8 14:15 etc
d?????????   ? ?          ?            ?            ? hdump
drwxr-xr-x   1 root       root        24 Jun 11 10:23 home

If I look into the CSI driver's logs, I see that the volume was mounted for no more than a second and then unmounted (?). No errors whatsoever.

kubectl -n kube-system logs s3-csi-node-tc7rr
Defaulted container "s3-plugin" out of: s3-plugin, node-driver-registrar, liveness-probe, install-mountpoint (init)
I0708 14:15:23.133712       1 driver.go:59] Driver version: 1.7.0, Git commit: 53b62cb27036138b46e51f34ddef454fd0f89c6c, build date: 2024-06-18T11:10:59Z, nodeID: ip-10-17-11-137.us-west-2.compute.internal, mount-s3 version: 1.7.2
I0708 14:15:23.143715       1 driver.go:79] Found AWS_WEB_IDENTITY_TOKEN_FILE, syncing token
I0708 14:15:23.144179       1 driver.go:109] Listening for connections on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0708 14:15:24.182905       1 node.go:222] NodeGetInfo: called with args 
I0708 14:15:33.764189       1 node.go:206] NodeGetCapabilities: called with args 
I0708 14:15:33.765861       1 node.go:206] NodeGetCapabilities: called with args 
I0708 14:15:33.775212       1 node.go:206] NodeGetCapabilities: called with args 
I0708 14:15:33.779565       1 node.go:65] NodePublishVolume: req: volume_id:"hdump" target_path:"/var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount" volume_capability:<mount:<mount_flags:"allow-delete" mount_flags:"region us-west-2" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:"bucketName" value:"vol-my-app-stack-dev-hdump" > 
I0708 14:15:33.779768       1 node.go:112] NodePublishVolume: mounting vol-my-app-stack-dev-hdump at /var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount with options [--allow-delete --region=us-west-2]
I0708 14:15:33.933374       1 node.go:132] NodePublishVolume: /var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount was mounted
I0708 14:15:34.178575       1 node.go:162] NodeUnpublishVolume: called with args volume_id:"hdump" target_path:"/var/lib/kubelet/pods/1c611c39-c331-4ef1-8fb3-70751dcd30af/volumes/kubernetes.io~csi/pv-hdump/mount" 
I0708 14:15:34.191656       1 node.go:188] NodeUnpublishVolume: unmounting /var/lib/kubelet/pods/1c611c39-c331-4ef1-8fb3-70751dcd30af/volumes/kubernetes.io~csi/pv-hdump/mount
I0708 14:15:37.110173       1 node.go:206] NodeGetCapabilities: called with args 
I0708 14:17:27.792623       1 node.go:206] NodeGetCapabilities: called with args 

If I log into the EC2 node running the CSI driver's pod, I can see the process is there:

root      212894       1  0 14:15 ?        00:00:00 /opt/mountpoint-s3-csi/bin/mount-s3 --allow-delete --region=us-west-2 --user-agent-prefix=s3-csi-driver/1.7.0 vol-my-app-stack-dev-hdump /var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount

(S3 bucket name redacted.) The /var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount directory is there, but it's empty, and it does not reflect any files I uploaded to that S3 bucket manually through the AWS Console.

I would consider IAM issues if I could not access the bucket using awscli from that namespace using the CSI SA, but since I CAN, I doubt AWS Support would be able to advise.

Can you?

/triage support

dannycjones commented 3 months ago

Hey @dariusj1, thanks for reporting this issue with plenty of detail. You confirmed that the CSI driver did mount the file system and that the Mountpoint process is still running - that's great to confirm.

We need to learn more about what Mountpoint itself was doing and why we're seeing question marks when trying to interact with that FS.

Please can you fetch and share the logs from Mountpoint itself? You can learn more about how to fetch those in Mountpoint CSI Driver's logging documentation. If you're running the workload again, it would be useful to include debug as a mount option in the persistent volume spec.
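
If you do re-run it with debug enabled, here's a rough sketch of where that option sits in a static-provisioning PV. It reuses the bucket and volume names from the logs above; the remaining fields (PV name, capacity) are assumptions rather than values from your setup:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-hdump
spec:
  capacity:
    storage: 1200Gi            # required by Kubernetes, not used by the driver
  accessModes:
    - ReadWriteMany
  mountOptions:
    - allow-delete
    - region us-west-2
    - debug                    # verbose Mountpoint logging for troubleshooting
  csi:
    driver: s3.csi.aws.com
    volumeHandle: hdump
    volumeAttributes:
      bucketName: vol-my-app-stack-dev-hdump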

dannycjones commented 3 months ago

I would consider IAM issues if I could not access the bucket using awscli from that namespace using the CSI SA, but since I CAN, I doubt AWS Support would be able to advise.

Mountpoint's CSI driver (and Mountpoint) is backed by AWS Support, so please don't hesitate to reach out to them in future should you wish to do so.

dannycjones commented 3 months ago

@dariusj1 I just noticed this:

I see lots of question marks and a Permission denied.

I see that you're running under user app_runner. Does the issue go away when running as root?

I wonder if this is a case of needing to account for running the container application under a different user. We have an example spec file for that here: https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/examples/kubernetes/static_provisioning/non_root.yaml

It would be great if you could share your PV, PVC, and pod spec as well to understand a bit more. (Feel free to redact if needed)

dariusj1 commented 3 months ago

@dannycjones While on the host's filesystem, I created a test file in the /var/lib/kubelet/pods/f8122c23-8cec-40a7-9a82-1914c1c84ed2/volumes/kubernetes.io~csi/pv-hdump/mount directory. I only checked just now, but the test file appeared in the S3 bucket when I looked through the AWS Console.

And now, as per your advice, I created a standalone pod running with root privileges, configured to mount that exact same S3 PVC. It seems that I can indeed see the created test file in the mounted /hdump directory!!

So it's a permission issue then...? I wonder why there's no error anywhere stating that; did I miss it...?

In the yaml you've referred me to, is this the part I'm missing then?

  mountOptions:
    - uid=1000
    - gid=2000
    - allow-other

It would be great if you could share your PV, PVC, and pod spec as well to understand a bit more. (Feel free to redact if needed)

Is this still needed?

dannycjones commented 3 months ago

And now, as per your advice, I created a standalone pod running with root privileges, configured to mount that exact same S3 PVC. It seems that I can indeed see the created test file in the mounted /hdump directory!!

So it's a permission issue then...? I wonder why there's no error anywhere stating that; did I miss it...?

In the yaml you've referred me to, is this the part I'm missing then?

  mountOptions:
    - uid=1000
    - gid=2000
    - allow-other

Yes, it seems like a permission issue - specifically at the Linux filesystem level.

By default, Mountpoint presents files with permission bits 0644, allowing reading and writing for the user running Mountpoint and read access for others. There's a caveat here, though: at the FS/FUSE level we additionally need to 'allow other users', which means specifying allow-other, a Linux mount option. I believe you may also need to enable this in the OS configuration in /etc/fuse.conf, by adding user_allow_other on a new line if it's not already present. Let me know if that's needed. I will find a way to work the scenario in this ticket into documentation/troubleshooting.

We document the behavior with "allow other" in Mountpoint's config docs: https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#file-and-directory-permissions

Specifying the correct UID makes sure that the write permissions are granted to the correct user, so you should ensure that matches the UID used in the container.

Effectively, you need both of these sections:

https://github.com/awslabs/mountpoint-s3-csi-driver/blob/c357436ce179edd8b2d62a41cbe661ebbc74c0bf/examples/kubernetes/static_provisioning/non_root.yaml#L10-L13

https://github.com/awslabs/mountpoint-s3-csi-driver/blob/c357436ce179edd8b2d62a41cbe661ebbc74c0bf/examples/kubernetes/static_provisioning/non_root.yaml#L38-L40
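
Put together, those two sections look roughly like the excerpts below (values are illustrative, not a verbatim copy of the linked lines):

# PersistentVolume excerpt: the mount options from the first linked section
spec:
  mountOptions:
    - uid=1000        # UID that owns the files; match the user the container runs as
    - gid=2000        # group that owns the files
    - allow-other     # allow users other than the one running Mountpoint to access the mount

# Pod excerpt: the securityContext from the second linked section
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 2000
    fsGroup: 2000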

dannycjones commented 3 months ago

It would be great if you could share your PV, PVC, and pod spec as well to understand a bit more. (Feel free to redact if needed)

Is this still needed?

I think we understand the issue now based on your testing. If you are happy to share it though once you've got a final working solution, it may be useful for anyone coming across this issue in future.

dariusj1 commented 3 months ago

@dannycjones

I've just tested and can confirm that adding

  mountOptions:
    - uid=1000
    - gid=2000
    - allow-other

resolved my access issue, thank you. What threw me off is that the Kubernetes pod spec does NOT specify a securityContext. Instead, the user is enforced at the OCI image level.

A few follow-up questions:

dannycjones commented 3 months ago
  • does that mean there actually were no errors mounting the volume? Is that why I didn't see any?

Exactly, the volume was attached correctly and Mountpoint was running as expected. The error occurred at the kernel level: the kernel rejects requests to that FUSE file system when the requesting user does not match the one running Mountpoint itself (unless --allow-other is used).

  • I must've misinterpreted the NodeUnpublishVolume part in the logs as "detaching the volume". What does it mean?

NodeUnpublishVolume is called by Kubernetes when we should detach that volume. For this CSI driver, that means unmounting the Mountpoint file system and cleaning up anything else related to the volume.

If you were able to jump on the node and access the FS, I'm not sure what's happened here. I'd expect the file system to no longer be mounted and that directory to just be empty.

  • if I specify uid=1000, gid=2000, and allow-other mountOptions, does that mean that containers running as, say, uid=5000 won't be able to write to that volume? Is there any way to bypass this restriction (the mounted directory mode is 0755 and files are 0644) and reuse the same S3 bucket in containers running as different UIDs?

Yeah, you can update the file and directory modes. For example, you could include the mount options file-mode=0664 and dir-mode=0775 to grant write access to the group also. There's more explanation here: https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#file-and-directory-permissions.
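
As a sketch, extending the mountOptions you already have (the option names come from the docs linked above; treat the exact modes and the group wiring as assumptions for your setup):

  mountOptions:
    - uid=1000          # unchanged from your working config
    - gid=2000          # a group shared by every container that needs access
    - allow-other
    - file-mode=0664    # group members can also write files
    - dir-mode=0775     # group members can also create entries in directories

A container running as, say, uid=5000 would then need gid 2000 among its supplementary groups, e.g. via fsGroup or supplementalGroups in its securityContext, to pick up the group permissions.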

dannycjones commented 3 months ago

What threw me off is that the Kubernetes pod spec does NOT specify a securityContext. Instead, the user is enforced at the OCI image level.

Forgive me, I'm still new to Kubernetes. You're saying that your pod spec wasn't specifying the runAs* fields, but was instead relying on something like the USER instruction in the Dockerfile/Containerfile? https://docs.docker.com/reference/dockerfile/#user

netikras commented 3 months ago

What threw me off is that the Kubernetes pod spec does NOT specify a securityContext. Instead, the user is enforced at the OCI image level.

Forgive me, I'm still new to Kubernetes. You're saying that your pod spec wasn't specifying the runAs* fields, but was instead relying on something like the USER instruction in the Dockerfile/Containerfile? https://docs.docker.com/reference/dockerfile/#user

Yes, that is the case. The OCI image is generated using the vendor's shell scripts. I've just checked what's in there:

## ...
RUN adduser -u 10001 --user-group app_runner
## ...
USER app_runner
## ...

There's more explanation here: https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#file-and-directory-permissions.

thank you!!