awslabs / mountpoint-s3-csi-driver

Built on Mountpoint for Amazon S3, the Mountpoint CSI driver presents an Amazon S3 bucket as a storage volume accessible by containers in your Kubernetes cluster.
Apache License 2.0

s3-plugin exit code 2 #137

Open tomahkvt opened 5 months ago

tomahkvt commented 5 months ago

/kind bug

NOTE: If this is a filesystem related bug, please take a look at the Mountpoint repo to submit a bug report

What happened? Quite often we see s3-csi-node- pods stop with exit code 2.

2024-01-30 13:54:45.207 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:45.207 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:46.075 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:46.075 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:46.082 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:46.082 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:55:35.781 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 13:55:35.781 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 13:55:36.435 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 13:55:36.435 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 13:55:36.522 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 13:55:36.523 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-xzlwh
2024-01-30 14:14:12.884 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:14:12.884 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:14:13.639 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:14:13.639 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:14:13.647 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:14:13.647 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-hndkx
2024-01-30 14:24:42.846 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-56jpl
2024-01-30 14:24:42.846 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-56jpl
2024-01-30 14:24:43.675 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-56jpl
2024-01-30 14:24:43.675 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-56jpl
2024-01-30 14:24:43.790 problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-56jpl
2024-01-30 14:24:43.790 problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-56jpl
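
To make the pattern in these events easier to see, the entries can be tallied per pod and container. The following is only a rough sketch: it assumes the key=value layout shown in the excerpt above, and the function name count_exits is made up for illustration.

```python
import re
from collections import Counter

# Matches the key=value fields seen in the excerpt (e.g. exit_code=2).
FIELD_RE = re.compile(r"(\w+)=(\S+)")
REQUIRED = {"problem_pod_name", "problem_container_name", "exit_code"}


def count_exits(log_text):
    """Count exit events per (pod, container, exit_code).

    Lines without all three fields (such as bare timestamps) are skipped.
    """
    counts = Counter()
    for line in log_text.splitlines():
        fields = dict(FIELD_RE.findall(line))
        if REQUIRED <= fields.keys():
            key = (
                fields["problem_pod_name"],
                fields["problem_container_name"],
                int(fields["exit_code"]),
            )
            counts[key] += 1
    return counts


# Two entries copied from the excerpt above.
sample = """\
2024-01-30 13:54:45.207
problem_container_name=liveness-probe exit_code=2 problem_pod_name=s3-csi-node-rwzj5
2024-01-30 13:54:45.207
problem_container_name=s3-plugin exit_code=2 problem_pod_name=s3-csi-node-rwzj5
"""

for key, n in sorted(count_exits(sample).items()):
    print(key, n)
```

Run over the full excerpt, this kind of tally shows whether liveness-probe and s3-plugin always exit together on the same pod, which is what the timestamps suggest.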

What you expected to happen? We want to understand why the pods restart.
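
One possible way to collect more detail about a restart is to read each container's last termination state from the pod status (the JSON printed by kubectl get pod <pod-name> -o json). The sketch below assumes the standard Kubernetes pod status fields (containerStatuses, lastState.terminated, restartCount); the helper name last_terminations and the sample values are hypothetical, not taken from this cluster.

```python
def last_terminations(pod):
    """Return the last termination record of each container in a pod object.

    `pod` is the dict parsed from `kubectl get pod <pod-name> -o json`.
    Containers that have never terminated are omitted.
    """
    results = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated")
        if terminated:
            results.append({
                "container": status.get("name"),
                "exit_code": terminated.get("exitCode"),
                "reason": terminated.get("reason"),
                "finished_at": terminated.get("finishedAt"),
                "restarts": status.get("restartCount"),
            })
    return results


# Illustrative input only: the values below are invented for the example.
example_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "s3-plugin",
                "restartCount": 3,
                "lastState": {
                    "terminated": {
                        "exitCode": 2,
                        "reason": "Error",
                        "finishedAt": "2024-01-30T13:54:45Z",
                    }
                },
            },
            {"name": "node-driver-registrar", "restartCount": 0, "lastState": {}},
        ]
    }
}

for record in last_terminations(example_pod):
    print(record)
```

To use it on a real pod, the kubectl output could be parsed with json.load(sys.stdin) and passed to the function. The reason and finished_at fields, together with the previous container logs, are usually more informative than the exit code alone.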

How to reproduce it (as minimally and precisely as possible)? We use the ARM image public.ecr.aws/mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver:v1.2.0, deployed with the following chart settings:
repoURL: "https://awslabs.github.io/mountpoint-s3-csi-driver"
targetRevision: 1.2.0
chart: aws-mountpoint-s3-csi-driver

Anything else we need to know?: We see this message in the pod logs:
Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
We also noticed a memory leak.

Environment

jjkr commented 5 months ago

I am unable to reproduce this. Any more information you can provide here will be helpful, especially:

tomahkvt commented 4 months ago

Hi @jjkr.