minio / operator

Simple Kubernetes Operator for MinIO clusters :computer:
https://min.io/docs/minio/kubernetes/upstream/index.html
GNU Affero General Public License v3.0

Prometheus Pod Issue - CrashLoopBackOff #660

Closed reddymh closed 2 years ago

reddymh commented 3 years ago

The Prometheus pod is stuck in "Init:CrashLoopBackOff" with the following error:

chown: /prometheus: Operation not permitted
chown: /prometheus: Operation not permitted

Expected Behavior

Prometheus pod should be up and running.

Current Behavior

minio-prometheus-0   0/2   Init:CrashLoopBackOff   6   9m49s
minio-ss-0-0         1/1   Running                 0   9m49s
minio-ss-0-1         1/1   Running                 0   9m49s

Logs:

chown: /prometheus: Operation not permitted
chown: /prometheus: Operation not permitted

Possible Solution

It should work when specifying a service account that has a cluster role binding.

harshavardhana commented 3 years ago

It looks like you have a security context that disallows the chown operation from the init container.

What permissions have you specified?
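
For context on why the init container's chown can fail: on Linux, changing a file's owner generally requires the CAP_CHOWN capability, which a non-root container normally lacks. The Python sketch below is a simplified model of that rule, not the kernel's actual implementation. The `Process` class, the `may_chown` helper, and the UIDs and GIDs used are hypothetical illustrations, and it assumes the /prometheus volume root is owned by root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Hypothetical stand-in for the identity a container process runs with."""
    uid: int
    gids: frozenset = frozenset()
    cap_chown: bool = False  # typically True only for root with default capabilities


def may_chown(proc, file_uid, file_gid, new_uid, new_gid):
    """Simplified model of the Linux chown permission rule."""
    if proc.cap_chown:
        return True
    # Without CAP_CHOWN, only the file's owner may call chown successfully...
    if proc.uid != file_uid:
        return False
    # ...and the owner cannot give the file away to another UID.
    if new_uid != file_uid:
        return False
    # The owner may only switch the group to one the process belongs to.
    return new_gid == file_gid or new_gid in proc.gids


# Assumption: the volume root is owned by root:root (0:0).
root_init = Process(uid=0, gids=frozenset({0}), cap_chown=True)
nonroot_init = Process(uid=65534, gids=frozenset({65534}))

print(may_chown(root_init, 0, 0, 1000, 2000))     # True
print(may_chown(nonroot_init, 0, 0, 1000, 2000))  # False
```

Under these assumptions, an init container that is forced to run as a non-root UID would get "Operation not permitted" from chown regardless of which service account the pod uses.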

reddymh commented 3 years ago

I tried both with the securityContext below and without any securityContext, but I get the same error:

securityContext:
  runAsUser: 65534
  runAsNonRoot: true
  runAsGroup: 65534

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

edaveau commented 3 years ago

Hi, I'm facing the same issue. I installed the MinIO operator using the krew plugin, and my volumes are provided by the direct-csi plugin (also installed with krew). When I create a tenant using the Operator Console, all is well until the Prometheus pod is created, which then goes into CrashLoopBackOff. The sidecar is fine, but the prometheus container fails with the following logs:

level=error ts=2021-09-01T12:21:18.743Z caller=query_logger.go:87 component=activeQueryTracker msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied"
panic: Unable to create mmap-ed active query log

I did a little research and got the container working by changing the StatefulSet's securityContext like so:

securityContext:
  runAsGroup: 2000
  runAsNonRoot: false  # changed from true
  runAsUser: 0  # changed from 1000

After this, the pod runs and the tenant is initialized. However, I'd rather get the container to work as non-root. My concern is that when I look at /etc/passwd, I don't see any account with UID 1000.

Also, the permissions on /prometheus/queries.active are set like this:

-rw-r--r-- 1 root 2000 20001 Sep 2 07:03 queries.active

Finally, if I run the id -Gn command, here's the output: 2000id: unknown ID 2000

So my impression is that the container fails because it tries to access the /prometheus/queries.active file as a user with UID 1000 and GID 2000 when there is no trace of this user or group in the container. Am I missing something here?
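
One point that may help here: the kernel checks file permissions against numeric UIDs and GIDs only, so a missing /etc/passwd or /etc/group entry does not by itself cause "permission denied". The following Python sketch applies the classic owner/group/other mode-bit check to the queries.active listing quoted above (mode -rw-r--r--, owner root, group 2000). It is a simplified model that ignores ACLs and every capability except root's override. The `granted_access` helper is hypothetical, and the sketch assumes Prometheus needs write access to queries.active, which the mmap error suggests but the thread does not confirm.

```python
def granted_access(mode, file_uid, file_gid, uid, gids):
    """Return the list of 'r', 'w', 'x' rights granted by the mode bits."""
    if uid == 0:
        # Simplification: root bypasses read/write permission checks.
        return ["r", "w"]
    if uid == file_uid:
        bits = (mode >> 6) & 0o7  # owner triplet
    elif file_gid in gids:
        bits = (mode >> 3) & 0o7  # group triplet
    else:
        bits = mode & 0o7         # other triplet
    return [name for name, bit in (("r", 4), ("w", 2), ("x", 1)) if bits & bit]


# Values taken from the listing: -rw-r--r-- root 2000 queries.active
MODE, OWNER_UID, GROUP_GID = 0o644, 0, 2000

# runAsUser: 1000, runAsGroup: 2000 (the original non-root settings)
print(granted_access(MODE, OWNER_UID, GROUP_GID, uid=1000, gids={2000}))  # ['r']

# runAsUser: 0 (the workaround)
print(granted_access(MODE, OWNER_UID, GROUP_GID, uid=0, gids={2000}))  # ['r', 'w']
```

Under these assumptions, UID 1000 with GID 2000 gets read-only access to a file that root created, which would be enough to trigger the error once that file exists. The root ownership may itself be a side effect of the runAsUser: 0 workaround, so the original failure could equally come from the permissions on the /prometheus directory.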

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.