spiffe / spire-tutorials


Spire Agent fails to start on minikube #121

Open rushi47 opened 1 year ago

rushi47 commented 1 year ago

Hello Team 👋 ,

When following the tutorial https://spiffe.io/docs/latest/try/getting-started-k8s/ on minikube, it fails at the spire-agent step. Below are the logs from spire-agent:

time="2023-07-18T19:40:23Z" level=warning msg="Current umask 0022 is too permissive; setting umask 0027"
time="2023-07-18T19:40:23Z" level=info msg="Starting agent with data directory: \"/run/spire\""
time="2023-07-18T19:40:23Z" level=info msg="Plugin loaded" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2023-07-18T19:40:23Z" level=info msg="Plugin loaded" external=false plugin_name=memory plugin_type=KeyManager subsystem_name=catalog
time="2023-07-18T19:40:23Z" level=info msg="Plugin loaded" external=false plugin_name=k8s plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:40:23Z" level=info msg="Plugin loaded" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:40:23Z" level=info msg="Bundle loaded" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2023-07-18T19:40:23Z" level=debug msg="No pre-existing agent SVID found. Will perform node attestation" subsystem_name=attestor
time="2023-07-18T19:40:23Z" level=info msg="SVID is not found. Starting node attestation" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2023-07-18T19:40:24Z" level=info msg="Node attestation was successful" rettestable=false spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/953c6aa5-c9ab-4cf8-8903-cdb726fbad39" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2023-07-18T19:40:24Z" level=debug msg="Bundle added" subsystem_name=svid_store_cache trust_domain_id=example.org
time="2023-07-18T19:40:24Z" level=info msg="Starting Workload and SDS APIs" address=/run/spire/sockets/agent.sock network=unix subsystem_name=endpoints
time="2023-07-18T19:40:25Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:27Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:29Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:29Z" level=debug msg="Entry created" entry=ce46a935-f26d-4d3c-af1d-36fcf249ce79 selectors_added=3 spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=cache_manager
time="2023-07-18T19:40:29Z" level=debug msg="Renewing stale entries" cache_type=workload count=1 limit=500 subsystem_name=manager
time="2023-07-18T19:40:29Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=manager
time="2023-07-18T19:40:29Z" level=debug msg="SVID updated" entry=ce46a935-f26d-4d3c-af1d-36fcf249ce79 spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=cache_manager
time="2023-07-18T19:40:33Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:40Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:50Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:54Z" level=debug msg="Initializing health checkers" subsystem_name=health
time="2023-07-18T19:40:54Z" level=info msg="Serving health checks" address="0.0.0.0:8080" subsystem_name=health
time="2023-07-18T19:40:54Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:40:54Z" level=error msg="Health check has failed" check=agent error="subsystem is not live or ready" subsystem_name=health
time="2023-07-18T19:40:54Z" level=warning msg="Health check failed" check=agent details="{false false {workload api is unavailable} {workload api is unavailable}}" error="subsystem is not live or ready" subsystem_name=health
time="2023-07-18T19:41:54Z" level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/40418/stat: no such file or directory" subsystem_name=endpoints
time="2023-07-18T19:41:54Z" level=error msg="Health check has failed" check=agent error="subsystem is not live or ready" subsystem_name=health
time="2023-07-18T19:42:23Z" level=debug msg="Stopping SVID rotator" subsystem_name=manager
time="2023-07-18T19:42:23Z" level=debug msg="Finishing health checker" subsystem_name=health
time="2023-07-18T19:42:23Z" level=info msg="Stopping Workload and SDS APIs" subsystem_name=endpoints
time="2023-07-18T19:42:23Z" level=info msg="Cache manager stopped" subsystem_name=manager
time="2023-07-18T19:42:23Z" level=debug msg="Closing catalog" subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Unloading plugin" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Plugin deinitialized" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Plugin unloaded" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Unloading plugin" external=false plugin_name=k8s plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Plugin deinitialized" external=false plugin_name=k8s plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Plugin unloaded" external=false plugin_name=k8s plugin_type=WorkloadAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Unloading plugin" external=false plugin_name=memory plugin_type=KeyManager subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Plugin deinitialized" external=false plugin_name=memory plugin_type=KeyManager subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Plugin unloaded" external=false plugin_name=memory plugin_type=KeyManager subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Unloading plugin" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=debug msg="Plugin deinitialized" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Plugin unloaded" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Catalog closed" subsystem_name=catalog
time="2023-07-18T19:42:23Z" level=info msg="Agent stopped gracefully"

I am using minikube version:

minikube version: v1.30.1
commit: 08896fd1dc362c097c925146c4a0d0dac715ace0

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.9", GitCommit:"a1a87a0a2bcd605820920c6b0e618a8ab7d117d4", GitTreeState:"clean", BuildDate:"2023-04-12T12:16:51Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:33:12Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/arm64"}

I am on a Mac M1 Pro, if that helps.

Darwin x-MacBook-Pro.local 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:20 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6000 arm64

Thank you.
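
The repeated "Connection failed during accept ... could not read caller stat" warnings mean the agent accepted a connection on the Workload API socket but could not find /proc/<pid>/stat for the calling process when attesting it, and the health checker then reports the Workload API as unavailable. One thing worth checking (an assumption, not a confirmed root cause for this report) is whether the agent DaemonSet shares the host PID namespace, since the agent resolves caller PIDs through /proc:

# Assumes the quickstart's spire namespace and spire-agent DaemonSet names.
kubectl get daemonset spire-agent -n spire \
  -o jsonpath='{.spec.template.spec.hostPID}{"\n"}'
# "true" means the agent shares the host PID namespace and can look up caller
# PIDs under /proc; an empty or "false" result would be consistent with the
# missing /proc/<pid>/stat entries above.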

ksnavely commented 3 weeks ago

I've hit the same issue with /proc/X/stat on a kind-based setup when attempting to start the spire-agent.

amandilpp commented 6 days ago

I am getting the same error, where spire-agent fails at the health check:

level=info msg="Serving health checks" address="0.0.0.0:8080" subsystem_name=health level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/x/stat: no such file or directory" subsystem_name=endpoints level=error msg="Health check has failed" check=agent error="subsystem is not live or ready" subsystem_name=health level=warning msg="Health check failed" check=agent details="{false false {workload api is unavailable} {workload api is unavailable}}" error="subsystem is not live or ready" subsystem_name=health level=warning msg="Connection failed during accept" error="could not read caller stat: open /proc/x/stat: no such file or directory" subsystem_name=endpoints @ksnavely - Have you resolved your issue?

ksnavely commented 6 days ago

@amandilpp I was able to resolve my issue. I don't have specific guidance to offer, but I had success after upgrading to macOS Sonoma 14.1.1, Docker Desktop 4.34.2, and kind 0.22.0 (Apple M1). We have a complex Kubernetes stack, and it started working after the upgrades; I'm not sure which of these specifically did the trick.
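
As a rough sketch of verifying such an environment after tool upgrades (the demo-cluster value in the agent SPIFFE ID above comes from the k8s_sat attestor configuration, not from the kind cluster name, so the name below is arbitrary):

# Confirm the upgraded tool versions, then recreate a test cluster
# (the cluster name "spire-test" is hypothetical).
kind version
docker version --format '{{.Server.Version}}'
kind create cluster --name spire-test

# After re-applying the quickstart manifests, the agent pod should become Ready.
kubectl get pods -n spire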