kubernetes / minikube

storage-provisioner addon: kube-system:storage-provisioner cannot list events in the namespace #3129

Closed bodom0015 closed 4 years ago

bodom0015 commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Please provide the following details:

Environment: minikube v0.28.2 on macOS 10.13.2 + VirtualBox

Minikube version (use minikube version): minikube version: v0.28.2

What happened: I created a minikube cluster yesterday with the storage-provisioner addon enabled.

At first, I was apparently in a bad state: kubectl describe pvc yielded the familiar "the provisioner hasn't worked yet" warning message, and the provisioner logs were complaining about some unknown connectivity issue:

$ kubectl get sc
NAME                 PROVISIONER                AGE
standard (default)   k8s.io/minikube-hostpath   40m

$ kubectl get pvc,pv -n test
NAME                  STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc/shr86q-cloudcmd   Pending                                       standard       20s
pvc/sz8kmm-cloudcmd   Pending                                       standard       1m

$ kubectl describe pvc/shr86q-cloudcmd -n test
Name:          shr86q-cloudcmd
Namespace:     test
StorageClass:  standard
Status:        Pending
Volume:        
Labels:        name=shr86q-cloudcmd
               service=cloudcmd
               stack=shr86q
Annotations:   volume.beta.kubernetes.io/storage-provisioner=k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  14s (x4 over 35s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator

$ kubectl get pods -n kube-system
NAME                                    READY     STATUS             RESTARTS   AGE
etcd-minikube                           1/1       Running            0          40m
kube-addon-manager-minikube             1/1       Running            0          41m
kube-apiserver-minikube                 1/1       Running            0          41m
kube-controller-manager-minikube        1/1       Running            0          41m
kube-scheduler-minikube                 1/1       Running            0          41m
kubernetes-dashboard-5498ccf677-5r975   0/1       CrashLoopBackOff   11         41m
storage-provisioner                     0/1       CrashLoopBackOff   11         41m

$ kubectl logs -f storage-provisioner -n kube-system
F0912 16:43:12.951200       1 main.go:37] Error getting server version: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
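
(Not part of the original report; just a quick diagnostic sketch. One way to confirm whether pods can reach the in-cluster API endpoint at all is to run a throwaway pod and hit the service IP directly. Any HTTP response, even a 403, means connectivity is fine; an i/o timeout like the one above points to a networking problem inside the VM.)

$ kubectl run api-check --rm -it --restart=Never --image=busybox -- \
    wget -qO- --no-check-certificate https://10.96.0.1:443/version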

Upon deleting and recreating the minikube cluster (to clear the bad state), when repeating the test case I saw the following in the logs:

$ kubectl logs -f storage-provisioner -n kube-system
Error watching for provisioning success, can't provision for claim "test/s4rdfk-cloudcmd": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list events in the namespace "test"
Error watching for provisioning success, can't provision for claim "test/spd9xt-cloudcmd": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list events in the namespace "test"

The provisioner did still create a PV and bind the PVC to it in such cases:

$ kubectl get pvc -n test
NAME              STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
s4rdfk-cloudcmd   Bound     pvc-e794fa3e-b6ac-11e8-8044-080027add193   1Mi        RWX            standard       12m
spd9xt-cloudcmd   Bound     pvc-6baa04ab-b6ad-11e8-8044-080027add193   1Mi        RWX            standard       8m
src67q-cloudcmd   Bound     pvc-2243a82c-b6ae-11e8-8044-080027add193   1Mi        RWX            standard       3m

What you expected to happen: The provisioner shouldn't throw an error when provisioning was successful.

How to reproduce it (as minimally and precisely as possible):

  1. Bring up a fresh cluster with default addons enabled: minikube start
  2. Fetch a test PVC template: wget https://gist.githubusercontent.com/bodom0015/d920e22df8ff78ee05929d4c3ae736f8/raw/edccc530bf6fa748892d47130a1311fce5513f37/test.pvc.default.yaml
  3. Create a PVC from the template: kubectl create -f test.pvc.default.yaml (a sketch of such a manifest follows this list)
  4. After a few seconds, check on your PVC: kubectl get pvc
    • You should see that after a few seconds, your PVC is Bound to a PV
  5. Check the storage-provisioner logs
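
For reference, a PVC of roughly this shape exercises the default storage class (a sketch only; the exact contents of test.pvc.default.yaml in the gist may differ, but the Bound output later in this issue shows a 1Mi RWX claim named "test" on the standard class):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
  storageClassName: standard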

Output of minikube logs (if applicable): minikube logs did not seem to yield any pertinent debugging information, but the storage-provisioner pod logs did yield the following error message:

$ kubectl logs -f storage-provisioner -n kube-system
E0912 16:57:17.134782       1 controller.go:682] Error watching for provisioning success, can't provision for claim "test/s4rdfk-cloudcmd": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list events in the namespace "test"
E0912 17:00:58.710095       1 controller.go:682] Error watching for provisioning success, can't provision for claim "test/spd9xt-cloudcmd": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list events in the namespace "test"

Anything else we need to know: As a temporary manual workaround, the following seemed to work:

# Edit to add the "list" verb to the "events" resource
$ kubectl edit clusterrole -n kube-system system:persistent-volume-provisioner
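
For illustration only (not part of the original comment): after that edit, the events rule in the ClusterRole needs to carry the list verb in addition to whatever verbs it already grants (create, patch, update and watch in the stock role, though this may vary by Kubernetes version), roughly:

- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
  - update
  - watch
  - list    # the verb added by this workaround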
johnraz commented 5 years ago

I observe this behavior on a "no driver" install as well.

abdennour commented 5 years ago

Any update on this issue?

tstromberg commented 5 years ago

Sorry that this isn't working for you. I'm not familiar yet with PVC, so I'm a little unclear how to replicate. When I run:

minikube start --vm-driver=kvm2
minikube addons enable storage-provisioner
kubectl get sc

With kvm2 I get: No resources found.

With VirtualBox and macOS I get a little further:

$ kubectl get sc --all-namespaces
NAME                 PROVISIONER                AGE
standard (default)   k8s.io/minikube-hostpath   1m

$ kubectl get pvc,pv -n test
No resources found.

What am I missing?

pecastro commented 5 years ago

+1 minikube version: v0.35.0 with --vm-driver=kvm2

Pupix commented 5 years ago

In my case minikube worked well until I stopped it. Now it fails at startup: the VM is running but won't finish configuring.

Does anyone know where in the VM the config files are stored, so I can edit them manually?

@tstromberg I installed a DB, in my case RethinkDB, with tiller/helm. Wait for it to install and provision everything.

After VM reboot I keep getting:

==> storage-provisioner <==
E0417 07:58:44.947376       1 controller.go:682] Error watching for provisioning success, can't provision for claim "dev/datadir-rethinkdb-rethinkdb-cluster-0": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list resource "events" in API group "" in the namespace "dev"
mmazur commented 5 years ago

Still present…

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

QwertyJack commented 4 years ago

/remove-lifecycle rotten

QwertyJack commented 4 years ago

Same here, macOS & VirtualBox.

bodom0015 commented 4 years ago

Steps to Reproduce

Here is the simplest reproduction of this bug.

Step 1: minikube start and verify cluster has started up

$ minikube start
⚠️  minikube 1.5.2 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v/1.5.2
💡  To disable this notice, run: 'minikube config set WantUpdateNotification false'
😄  minikube v1.3.1 on Darwin 10.13.2
💡  Tip: Use 'minikube start -p <name>' to create a new cluster, or 'minikube delete' to delete this one.
🔄  Starting existing virtualbox VM for "minikube" ...
⌛  Waiting for the host to be provisioned ...
🐳  Preparing Kubernetes v1.15.2 on Docker 18.09.8 ...
🔄  Relaunching Kubernetes using kubeadm ...
⌛  Waiting for: apiserver proxy etcd scheduler controller dns
🏄  Done! kubectl is now configured to use "minikube"

$ kubectl get pods -A
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
default       vault-agent-example                     2/2     Running   0          19d
kube-system   coredns-5c98db65d4-9rrhb                1/1     Running   1          19d
kube-system   coredns-5c98db65d4-d2rmk                1/1     Running   1          19d
kube-system   etcd-minikube                           1/1     Running   0          19d
kube-system   kube-addon-manager-minikube             1/1     Running   0          19d
kube-system   kube-apiserver-minikube                 1/1     Running   0          19d
kube-system   kube-controller-manager-minikube        1/1     Running   0          13s
kube-system   kube-proxy-fsqv7                        1/1     Running   0          19d
kube-system   kube-scheduler-minikube                 1/1     Running   0          19d
kube-system   kubernetes-dashboard-7b8ddcb5d6-ll8w7   1/1     Running   0          19d
kube-system   storage-provisioner                     1/1     Running   0          19d
kube-system   tiller-deploy-597567bdfd-pctlg          1/1     Running   0          19d

Step 2: create any PVC - note that it does successfully provision and bind to a PV

$ kubectl apply -f https://gist.githubusercontent.com/bodom0015/d920e22df8ff78ee05929d4c3ae736f8/raw/edccc530bf6fa748892d47130a1311fce5513f37/test.pvc.default.yaml
persistentvolumeclaim/test created

$ kubectl get pvc,pv
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/test   Bound    pvc-fa9c1a0d-df76-4931-9ce5-1cfe4f0375eb   1Mi        RWX            standard       4s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM          STORAGECLASS   REASON   AGE
persistentvolume/pvc-fa9c1a0d-df76-4931-9ce5-1cfe4f0375eb   1Mi        RWX            Delete           Bound    default/test   standard                3s

Step 3: Check the provisioner logs to see the error message

$ kubectl logs -f storage-provisioner -n kube-system
E1118 16:45:27.950319       1 controller.go:682] Error watching for provisioning success, can't provision for claim "default/test": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list resource "events" in API group "" in the namespace "default"

The Problem

This error, while innocuous, indicates that the storage-provisioner Pod's ServiceAccount (bound to the built-in system:persistent-volume-provisioner ClusterRole) is missing one or more required permissions, namely the ability to list the events resource.
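
(A quick way to confirm the missing permission, not taken from the original comment: impersonate the provisioner's ServiceAccount and ask the API server directly. While the bug is present this prints "no".)

$ kubectl auth can-i list events \
    --as=system:serviceaccount:kube-system:storage-provisioner -n default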

Possible Fix

If this permission is needed in more cases than not, then the correct way to fix this might be to create a PR back to kubeadm (or the appropriate Kubernetes repo) that will add the missing permission to the system:persistent-volume-provisioner ClusterRole.

A simple way to fix this in the short term would be to create a thin ClusterRole (or a full copy of system:persistent-volume-provisioner) that grants the missing permission. This could possibly go here:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:persistent-volume-provisioner-supl
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: storage-provisioner-supl
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:persistent-volume-provisioner-supl
subjects:
  - kind: ServiceAccount
    name: storage-provisioner
    namespace: kube-system
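
Saving the two manifests above to a file (say storage-provisioner-supl.yaml; the name is only illustrative) and running kubectl apply -f storage-provisioner-supl.yaml should grant the missing permission until a proper fix lands in the addon manifests.
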
fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

medyagh commented 4 years ago

@bodom0015 thank you for updating this issue. I am curious whether this issue still exists in 1.7.3? We have made some changes to the addon system since then.

bodom0015 commented 4 years ago

@medyagh I can confirm that this still happens in v1.8.1 with the exact same error message:

# Clear out old state
wifi-60-235:universal lambert8$ minikube delete
🙄  "minikube" profile does not exist, trying anyways.
💀  Removed all traces of the "minikube" cluster.

# Update to newest minikube
wifi-60-235:universal lambert8$ minikube version
minikube version: v1.8.1
commit: cbda04cf6bbe65e987ae52bb393c10099ab62014
wifi-60-235:universal lambert8$ minikube update-check
CurrentVersion: v1.8.1
LatestVersion: v1.8.1

# Start new Minikube cluster using v1.8.1
wifi-60-235:universal lambert8$ minikube start
😄  minikube v1.8.1 on Darwin 10.13.2
✨  Automatically selected the hyperkit driver
💾  Downloading driver docker-machine-driver-hyperkit:
    > docker-machine-driver-hyperkit.sha256: 65 B / 65 B [---] 100.00% ? p/s 0s
    > docker-machine-driver-hyperkit: 10.90 MiB / 10.90 MiB  100.00% 39.88 MiB 
🔑  The 'hyperkit' driver requires elevated permissions. The following commands will be executed:

    $ sudo chown root:wheel /Users/lambert8/.minikube/bin/docker-machine-driver-hyperkit 
    $ sudo chmod u+s /Users/lambert8/.minikube/bin/docker-machine-driver-hyperkit 

💿  Downloading VM boot image ...
    > minikube-v1.8.0.iso.sha256: 65 B / 65 B [--------------] 100.00% ? p/s 0s
    > minikube-v1.8.0.iso: 173.56 MiB / 173.56 MiB [-] 100.00% 36.15 MiB p/s 5s
🔥  Creating hyperkit VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
💾  Downloading preloaded images tarball for k8s v1.17.3 ...
    > preloaded-images-k8s-v1-v1.17.3-docker-overlay2.tar.lz4: 499.26 MiB / 499
🐳  Preparing Kubernetes v1.17.3 on Docker 19.03.6 ...
🚀  Launching Kubernetes ...
🌟  Enabling addons: default-storageclass, storage-provisioner
⌛  Waiting for cluster to come online ...
🏄  Done! kubectl is now configured to use "minikube"
⚠️  /usr/local/bin/kubectl is version 1.15.3, and is incompatible with Kubernetes 1.17.3. You will need to update /usr/local/bin/kubectl or use 'minikube kubectl' to connect with this cluster

# Verify cluster is ready
wifi-60-235:universal lambert8$ kubectl get pods -A
NAMESPACE     NAME                          READY   STATUS              RESTARTS   AGE
kube-system   coredns-6955765f44-kczhj      0/1     ContainerCreating   0          10s
kube-system   coredns-6955765f44-v6x8n      0/1     Running             0          10s
kube-system   etcd-m01                      1/1     Running             0          14s
kube-system   kube-apiserver-m01            1/1     Running             0          14s
kube-system   kube-controller-manager-m01   1/1     Running             0          14s
kube-system   kube-proxy-n7mhx              1/1     Running             0          10s
kube-system   kube-scheduler-m01            1/1     Running             0          14s
kube-system   storage-provisioner           1/1     Running             0          14s

# Create a Test PVC
wifi-60-235:universal lambert8$ kubectl apply -f https://gist.githubusercontent.com/bodom0015/d920e22df8ff78ee05929d4c3ae736f8/raw/edccc530bf6fa748892d47130a1311fce5513f37/test.pvc.default.yaml
persistentvolumeclaim/test created

# Check the storage-provisioner logs
wifi-60-235:universal lambert8$ kubectl logs -f storage-provisioner -n kube-system
E0309 17:58:24.988551       1 controller.go:682] Error watching for provisioning success, can't provision for claim "default/test": events is forbidden: User "system:serviceaccount:kube-system:storage-provisioner" cannot list resource "events" in API group "" in the namespace "default"
fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

medyagh commented 4 years ago

> +1 minikube version: v0.35.0 with --vm-driver=kvm2

Your minikube version is very old; do you mind trying with a newer minikube version?

fejta-bot commented 4 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 4 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/minikube/issues/3129#issuecomment-633204974):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
ctron commented 4 years ago

This is still present in:

minikube version: v1.11.0
commit: 57e2f55f47effe9ce396cea42a1e0eb4f611ebbd
aphofstede commented 4 years ago

/remove-lifecycle rotten

kmr0877 commented 4 years ago

@ctron Did you happen to find any hack to work around this issue? I am on minikube version v1.11.0 as well and stuck with this issue.

ctron commented 4 years ago

> @ctron Did you happen to find any hack to work around this issue? I am on minikube version v1.11.0 as well and stuck with this issue.

Nope. My "hack" was to go back to version v1.8.2. However, I haven't tested that in a while.

afbjorklund commented 4 years ago

Did you try the new storage-provisioner (v3) to see if that helps with the issue?

gcr.io/k8s-minikube/storage-provisioner
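
(A hedged sketch, not from this thread: to check which provisioner image a cluster is actually running, something like the following works.)

$ kubectl -n kube-system get pod storage-provisioner \
    -o jsonpath='{.spec.containers[0].image}'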

slaskawi commented 4 years ago

This issue still happens on Minikube v1.12.3. Perhaps we could reopen this ticket?

kadern0 commented 4 years ago

I'm having some issues with storage-provisioner on:

minikube version: v1.14.0
commit: b09ee50ec047410326a85435f4d99026f9c4f5c4
oleg-andreyev commented 3 years ago

Still present

minikube version: v1.23.1
commit: 84d52cd81015effbdd40c632d9de13db91d48d43