spiffe / spire

The SPIFFE Runtime Environment
https://spiffe.io
Apache License 2.0

Support kata-containers workload attestation #4531

Open Joffref opened 9 months ago

Joffref commented 9 months ago

As discussed in #4522, SPIRE cannot attest workloads inside kata-containers (microVMs). SPIRE relies on the host kernel to attest workloads, but these containers run on separate guest kernels, so it cannot retrieve the associated selectors.

Feature Request: Be able to attest workloads inside kata-containers reliably.

Options considered: As the PID and ContainerID are known by Kata-containers during the workload execution, we might leverage this to map the ContainerID (and thus the microVM) to the PID. One risk is PID collisions, since the PIDs are not managed by the same kernel.
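To illustrate the collision risk, here is a minimal sketch, assuming PIDs are reported per guest kernel. It keys the lookup on a (sandbox ID, PID) pair rather than on the PID alone. The class, method names, and IDs are hypothetical and are not part of SPIRE or Kata.

```python
from typing import Dict, Optional, Tuple

# Hypothetical composite key: the same PID may exist in several guest kernels,
# so the sandbox (microVM) ID is needed to disambiguate it.
SandboxPid = Tuple[str, int]


class WorkloadIndex:
    """Maps (sandbox_id, pid) pairs to container IDs."""

    def __init__(self) -> None:
        self._containers: Dict[SandboxPid, str] = {}

    def record(self, sandbox_id: str, pid: int, container_id: str) -> None:
        self._containers[(sandbox_id, pid)] = container_id

    def container_for(self, sandbox_id: str, pid: int) -> Optional[str]:
        return self._containers.get((sandbox_id, pid))


index = WorkloadIndex()
# Two microVMs both run a process with PID 42; the entries do not overwrite each other.
index.record("sandbox-a", 42, "container-1")
index.record("sandbox-b", 42, "container-2")
```

If the PIDs turn out to be host-side PIDs, the sandbox part of the key is redundant but harmless.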

That's my quick overview of the situation. If anyone has thought about this question, feel free to share your insights.

Cc: @evan2645

evan2645 commented 8 months ago

Thank you for opening this @Joffref

> As the PID and ContainerID are known by Kata-containers during the workload execution

Awesome ... looks like this is exposed over a host-accessible API? The notes say that the PID is as seen by the host, so I don't think we have a PID namespace issue between host/kata-container ... but we would not have much visibility inside the kata-container beyond what is provided by the API. It sounds like a possible path forward.

I think the next blocker will be how to physically mount the socket into the kata-container? Do you have any thoughts on how to accomplish that? We'd want the peercred on that socket read to match the PID of the kata-container for the agent to make sense of it
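For context on the peercred point, here is a minimal, Linux-only sketch of reading peer credentials from a Unix domain socket with `SO_PEERCRED`. It is not SPIRE's implementation (the agent is written in Go); the function name is hypothetical, and a `socketpair` stands in for a real Workload API connection.

```python
import os
import socket
import struct


def peer_credentials(conn: socket.socket) -> tuple:
    """Return (pid, uid, gid) of the peer on a connected Unix socket (Linux only)."""
    # The kernel returns a struct ucred: three native ints (pid, uid, gid).
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


# Stand-in for a workload connection: both ends belong to this process,
# so the reported PID is our own.
agent_side, workload_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(agent_side)
agent_side.close()
workload_side.close()
```

The kernel fills in the PID of the process that created the connection, which is why a proxy in front of the socket changes which process the agent can see.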

evan2645 commented 8 months ago

> I think the next blocker will be how to physically mount the socket into the kata-container? Do you have any thoughts on how to accomplish that? We'd want the peercred on that socket read to match the PID of the kata-container for the agent to make sense of it

Hey @Joffref , quick ping back to see if you have any ideas on the above ... we'll close this issue out next week if we can't find a plausible path forward. Thanks in advance for any and all feedback 🙏

Joffref commented 8 months ago

Hey @evan2645! Sorry for the late reply. After digging a bit into the Kata spec definitions, it seems that it supports named pipes (aka UDS) inside the Sandbox definition. Thus, I assume mounting a socket is a straightforward operation that is natively supported by the runtime.

Let me create a design soon that shows what we have discussed so far! Then we'll be able to iterate on it :)

evan2645 commented 7 months ago

That would be great, thank you @Joffref! I'll go ahead and move this into the backlog as unscoped

garygan89 commented 3 months ago

Thanks for this issue! I also encountered the same connection refused issue when trying to run spiffe-helper to access the host SPIRE socket mounted through a hostPath volume.

Right now I am able to use an SSH binding approach to bind the host UNIX socket to the kata-runtime pod, then run spiffe-helper to initiate workload attestation. So far I can see the request reaching the SPIRE agent pod, which was not possible before due to the connection refused error.

The following are the logs received at the SPIRE agent:

time="2024-03-05T10:32:28Z" level=debug msg="PID attested to have selectors" pid=3349020 selectors="[type:\"unix\" value:\"uid:1000\" type:\"unix\" value:\"gid:1000\" type:\"unix\" value:\"supplementary_gid:27\" type:\"unix\" value:\"supplementary_gid:1000\"]" subsystem_name=workload_attestor
time="2024-03-05T10:32:28Z" level=error msg="No identity issued" method=FetchX509SVID pid=3349020 registered=false service=WorkloadAPI subsystem_name=endpoints

This approach only allows kata-runtime to send requests to the UNIX socket; however, attestation uses the PID of the sshd process rather than the PID of that kata-runtime pod's process on the host.
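One way to check which process the agent actually attested is to look up the logged PID on the agent's host. This is a minimal sketch assuming a Linux host with `/proc` mounted; the helper name is hypothetical.

```python
import os
from typing import Optional


def process_name(pid: int) -> Optional[str]:
    """Return the command name for a PID from /proc, or None if it is unavailable."""
    try:
        with open(f"/proc/{pid}/comm", "r", encoding="utf-8") as fh:
            return fh.read().strip()
    except OSError:
        # The process has exited, or /proc is not readable here.
        return None


# Demonstrated on the current process; on the agent host one would pass
# the pid value from the log line instead.
name = process_name(os.getpid())
```

Calling `process_name` with the `pid` from the agent log on the node where the agent runs should show whether the caller is sshd or the kata runtime process.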

Not sure if there is any plan to prioritize this, given that SPIFFE/SPIRE is gaining traction and we see a lot of advantages if it can support virtualized workloads as well! Thanks!

evan2645 commented 3 months ago

Thanks for the data point @garygan89 .. is the sshd process running inside the same pod as the kata container you're targeting?

My feeling here is that what we really want is first-class support from the kata runtime. If it can attach to the Workload API and reflect that socket into the workload's sandbox, then we'd see the runtime shim and be able to properly attest it as such. Then on the SPIRE side, we might choose to pair that with a (new) kata workload attestor.