cncf-tags / container-device-interface

Apache License 2.0
216 stars 39 forks source link

[Spec extension] Allow for IPC, NET, and CAP modifications to be specified in the CDI spec #55

Open elezar opened 2 years ago

elezar commented 2 years ago

We have a use case that would require the following (docker) arguments to work out the box:

--ipc host
--net host
--cap-add=IPC_LOCK

Does it make sense to extend the CDI spec to allow for IPC, NET, and CAP modifications to the OCI spec?

github-actions[bot] commented 6 months ago

This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.

github-actions[bot] commented 5 months ago

This issue was automatically closed due to inactivity.

mythi commented 1 month ago

Does it make sense to extend the CDI spec to allow for IPC, NET, and CAP modifications to the OCI spec?

we would have a use-case with --cap-add=IPC_LOCK too so +1

elezar commented 1 month ago

@mythi let's reopen this then. Would you have time to expand on this / create a PR with the additions?

Note that one concern is that these changes would happen without the user knowing about them, so it may be good to limit the scope to use cases that don't provide too many additional privileges.

zvonkok commented 1 month ago

@elezar @mythi Just for understanding, CDI would be able to override what the Pod spec is configuring? What would happen in this case when run with docker,podman --net=host and CDI says something else?

mythi commented 1 month ago

I did not think of the details yet. My use case is with QAT devices that use VFIO framework. Pods requesting resources (DRA/device plugins) would also need to know to add IPC_LOCK to get the workload to work. The idea is that users would not need to know those details but the HW dependent settings would come through CDI.

zvonkok commented 1 month ago

I will throw in rlimit settings, since k8s is not yet able to set them properly, where one would need CAP_SYS_RESOURCE as well so it kinda fits in here as well. https://github.com/kubernetes/kubernetes/issues/3595

elezar commented 1 month ago

@elezar @mythi Just for understanding, CDI would be able to override what the Pod spec is configuring? What would happen in this case when run with docker,podman --net=host and CDI says something else?

That is what we would need to discuss / define for this issue. We would need to define the modifications that are to be made to the OCI runtime spec as well as determine whether there could be contradicting configurations.

elezar commented 1 month ago

Maybe splitting this into adding CAP_ADD support and the "host"-specifics would be useful. One option woudl be to add a REQUIRES_CAP check that would alert the user if the required caps are not specified?

klihub commented 1 month ago

Hmm... changing the IPC and network namespace seems a bit extreme. A failure to inject with an appropriate error message in a case where a device requires host IPC or network namespace but runs in a different one sounds more appropriate... as a last resort. Now I have not followed the latest directions DRA has taken, but when a CDI device reference is added by a DRA driver, it would sound more appropriate to already fail as early as possible, in the DRA driver.

kad commented 1 month ago

I will throw in rlimit settings, since k8s is not yet able to set them properly, where one would need CAP_SYS_RESOURCE as well so it kinda fits in here as well. kubernetes/kubernetes#3595

@zvonkok for rlimit, CDI probably not the best way to do it. you can use NRI plugin to inject it based on the need. See https://github.com/containerd/nri/pull/48 for details.