kubernetes-sigs / aws-encryption-provider

APIServer encryption provider, backed by AWS KMS
Apache License 2.0
208 stars 72 forks

minikube kubernetes api-server crashloop with aws-encryption-provider configured for AWS KMS. #45

Closed tapanhalani closed 4 years ago

tapanhalani commented 4 years ago

I am trying to use https://github.com/kubernetes-sigs/aws-encryption-provider on my local minikube k8s-1.16.2 cluster. After following the steps listed in this link, I am getting the following error when restarting the minikube cluster:

```
==> kube-apiserver ["0d328d261494"] <==
I1125 13:30:12.775356       1 cache.go:32] Waiting for caches to sync for AvailableConditionController controller
I1125 13:30:12.775406       1 autoregister_controller.go:140] Starting autoregister controller
I1125 13:30:12.775411       1 cache.go:32] Waiting for caches to sync for autoregister controller
I1125 13:30:12.785198       1 controller.go:85] Starting OpenAPI controller
I1125 13:30:12.785241       1 customresource_discovery_controller.go:208] Starting DiscoveryController
I1125 13:30:12.785311       1 naming_controller.go:288] Starting NamingConditionController
I1125 13:30:12.785507       1 establishing_controller.go:73] Starting EstablishingController
I1125 13:30:12.785794       1 nonstructuralschema_controller.go:191] Starting NonStructuralSchemaConditionController
I1125 13:30:12.785891       1 apiapproval_controller.go:185] Starting KubernetesAPIApprovalPolicyConformantConditionController
E1125 13:30:12.797101       1 controller.go:154] Unable to remove old endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /registry/masterleases/192.168.99.102, ResourceVersion: 0, AdditionalErrorMsg:
I1125 13:30:13.010620       1 crdregistration_controller.go:111] Starting crd-autoregister controller
I1125 13:30:13.010914       1 shared_informer.go:197] Waiting for caches to sync for crd-autoregister
I1125 13:30:13.010995       1 shared_informer.go:204] Caches are synced for crd-autoregister
I1125 13:30:13.025833       1 controller.go:606] quota admission added evaluator for: leases.coordination.k8s.io
I1125 13:30:13.081635       1 cache.go:39] Caches are synced for autoregister controller
I1125 13:30:13.082063       1 cache.go:39] Caches are synced for APIServiceRegistrationController controller
I1125 13:30:13.098277       1 cache.go:39] Caches are synced for AvailableConditionController controller
I1125 13:30:13.772769       1 controller.go:107] OpenAPI AggregationController: Processing item
I1125 13:30:13.772793       1 controller.go:130] OpenAPI AggregationController: action for item : Nothing (removed from the queue).
I1125 13:30:13.772803       1 controller.go:130] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).
I1125 13:30:13.783102       1 storage_scheduling.go:148] all system priority classes are created successfully or already exist.
E1125 13:30:13.808370       1 grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory
W1125 13:30:13.808653       1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/kmsplugin/socket.sock 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory". Reconnecting...
E1125 13:30:13.995827       1 grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory
```

Due to this, the api-server crashes continuously, causing the cluster to fail. But I have already deployed the kms-provider pod and verified the presence of /var/run/kmsplugin/socket.sock on the minikube host as follows:

```
$ ls -la /var/run/kmsplugin/
total 0
drwxr-xr-x  2 root root  60 Nov 25 13:15 .
drwxr-xr-x 18 root root 600 Nov 25 13:15 ..
srwxr-xr-x  1 root root   0 Nov 25 13:15 socket.sock
```

It would be extremely helpful to understand what I might be doing wrong. Any help is highly appreciated. Thanks.
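As a diagnostic aside: the two failure modes seen in this thread can be told apart by how the dial fails. `no such file or directory` means the socket path is not visible in the api-server's mount namespace (it was never mounted in), while `connection refused` means the file exists but nothing is listening on it. A minimal probe illustrating the distinction (the helper name `kms_socket_status` is illustrative, not part of the provider):

```python
import socket

def kms_socket_status(path):
    """Probe a unix domain socket and classify the failure mode."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return "listening"       # a plugin is accepting connections
    except FileNotFoundError:
        return "missing"         # path not present, e.g. not mounted into this namespace
    except ConnectionRefusedError:
        return "not-listening"   # socket file exists but nothing is serving it
    finally:
        s.close()

if __name__ == "__main__":
    print(kms_socket_status("/var/run/kmsplugin/socket.sock"))
```

Run inside the api-server's mount namespace, `missing` points at a volume-mount problem rather than a plugin problem.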

Techn0logic commented 4 years ago

I believe you might also need to mount the socket directory into the api-server with extraVolumes:

```yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
clusterName: "{{ cluster_name }}"
kubernetesVersion: {{ kube_version | regex_replace('-.*$', '') }}
apiServer:
  extraVolumes:
  - name: config
    hostPath: /etc/kubernetes/enc-config.yaml
    mountPath: /etc/kubernetes/enc-config.yaml
  - name: kmsplugin
    hostPath: /var/run/kmsplugin
    mountPath: /var/run/kmsplugin
  extraArgs:
    encryption-provider-config: /etc/kubernetes/enc-config.yaml
```

In this scenario I don't get:

```
grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory
```

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 4 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 4 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/aws-encryption-provider/issues/45#issuecomment-623512406):

>Rotten issues close after 30d of inactivity.
>Reopen the issue with `/reopen`.
>Mark the issue as fresh with `/remove-lifecycle rotten`.
>
>Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
amitkatyal commented 3 years ago

Hi, I am trying to configure aws-encryption-provider with k3s (single-node cluster) on my Ubuntu machine. I've provided the encryption configuration to kube-apiserver, but the k3s server itself does not start and keeps complaining with the error below.

"failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: connection refused"

I'm just trying to understand the sequence:

Is kube-apiserver expecting aws-encryption-provider (pod/service) to be up, running, and listening on the unix domain socket? And if it is not running, will kube-apiserver refuse to start?

Since I am using a single-node cluster and kube-apiserver depends on the aws-encryption-provider pod, I'm trying to understand how to break the cycle: unless the k3s single-node cluster is up, I can't run aws-encryption-provider. The other option I can think of is running aws-encryption-provider as a Docker container.

Please let me know how you fixed it on minikube.
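On kubeadm/minikube-style setups, the chicken-and-egg described above is commonly broken by running the plugin as a static pod: the kubelet starts it directly from a manifest on disk, before and independently of the api-server. A rough sketch of such a manifest follows; the image reference and flag names are assumptions (placeholders), so check the provider's README and `--help` for the exact values:

```yaml
# /etc/kubernetes/manifests/aws-encryption-provider.yaml
# (the kubeadm default static-pod manifest directory)
apiVersion: v1
kind: Pod
metadata:
  name: aws-encryption-provider
  namespace: kube-system
spec:
  containers:
    - name: aws-encryption-provider
      image: <aws-encryption-provider-image>            # placeholder: your built/pulled image
      command:
        - /aws-encryption-provider
        - --key=arn:aws:kms:<region>:<account>:key/<key-id>   # placeholder KMS key ARN
        - --region=<region>                                   # placeholder
        - --listen=/var/run/kmsplugin/socket.sock
      volumeMounts:
        - name: kmsplugin
          mountPath: /var/run/kmsplugin
  volumes:
    - name: kmsplugin
      hostPath:
        path: /var/run/kmsplugin
        type: DirectoryOrCreate
```

Because the kubelet manages static pods without involving the api-server, the socket can be up before kube-apiserver first tries to dial it. Whether k3s supports static pods in the same way depends on the k3s version and configuration.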

amentee commented 1 month ago

@amitkatyal - Were you able to fix this issue?