kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

[feature requirements] specify hw devices in container #60748

Closed resouer closed 1 year ago

resouer commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?: /kind feature

What happened: The previous discussion took place in #5607, which is quite old and predates CRI, not to mention Device Plugin. Also, I believe the original requirement in #5607 should already be covered by Device Plugin.

The new requirement that keeps showing up is: how can devices be specified for a container, either by the user or per Device Plugin (DP) requirement? And how can we make this work with the current DP design?

containers:
  - name: demo2
    image: sfc-dev-plugin:latest
    # I am not proposing this API, just to clarify the use case
    hostDevices:
    - /dev/foo1
    - /dev/foo2

This is actually needed when implementing many Device Plugins. One example is RDMA: https://github.com/hustcat/k8s-rdma-device-plugin. In that case, /dev/infiniband/rdma_cm has to be passed into every container that uses an RDMA device, so that RDMA applications can run in the container.

@hustcat Please correct me if I misunderstood something.

Other devices include /dev/dri/renderD128, /dev/infiniband, etc.

What you expected to happen:

We may need to collect user requirements first, and re-visit DP to see how to support this.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

cc @RenaudWasTaken @vikaschoudhary16 @derekwaynecarr @vishh @jiayingz

Environment:

resouer commented 6 years ago

Here's one solution (workaround), which we have used to handle https://github.com/kubernetes/kubernetes/issues/49964

Return whatever the container needs in AllocateResponse, so those devices will be attached to the container, while kubelet is not (and does not need to be) aware of that.

rpc Allocate(AllocateRequest) returns (AllocateResponse) {}

We can do this in a separate dummy DP.
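A minimal sketch of that workaround, in Go. The two struct types are simplified local stand-ins for the DeviceSpec and ContainerAllocateResponse messages of the kubelet device plugin API, declared here only so the example compiles on its own. The function name allocateDummy and the choice to ignore the requested IDs are assumptions made for illustration, not something specified in this thread.

```go
package main

import "fmt"

// DeviceSpec is a simplified local stand-in for the device plugin API
// message of the same name. It is not the real generated type.
type DeviceSpec struct {
	HostPath      string // device node on the host
	ContainerPath string // path the device gets inside the container
	Permissions   string // cgroup device permissions, e.g. "rw"
}

// ContainerAllocateResponse is likewise a reduced stand-in that keeps
// only the Devices field used by this sketch.
type ContainerAllocateResponse struct {
	Devices []DeviceSpec
}

// allocateDummy (hypothetical name) builds the per-container response a
// "dummy" plugin could return from Allocate. The requested device IDs are
// accepted but not used: every container gets the same extra host devices.
func allocateDummy(requestedIDs []string, extraHostDevices []string) ContainerAllocateResponse {
	var resp ContainerAllocateResponse
	for _, path := range extraHostDevices {
		resp.Devices = append(resp.Devices, DeviceSpec{
			HostPath:      path,
			ContainerPath: path, // same path inside the container
			Permissions:   "rw",
		})
	}
	return resp
}

func main() {
	resp := allocateDummy([]string{"dummy-0"}, []string{"/dev/infiniband/rdma_cm"})
	for _, d := range resp.Devices {
		fmt.Printf("%s -> %s (%s)\n", d.HostPath, d.ContainerPath, d.Permissions)
	}
}
```

In a real plugin the response would be returned from the Allocate RPC shown above, and the container runtime, not kubelet, would act on the device list.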

vikaschoudhary16 commented 6 years ago

seems related https://github.com/kubernetes/kubernetes/issues/59380#issuecomment-366171312

As we discussed offline, this seems to be another example of an unlimited resource, like /dev/kvm.

vikaschoudhary16 commented 6 years ago

while kubelet is not (and does not need to be) aware of that.

In the normal case, too, kubelet is not aware of what is being passed to CRI. :)

resouer commented 6 years ago

Yes, I believe the /dev/infiniband/rdma_cm case can be folded into #59380. We will move that discussion there.

I will still keep this issue open to track other use cases as well.

hustcat commented 6 years ago

@resouer Great!! It's wonderful, I like this simple way to support passing host devices to container :)

hustcat commented 6 years ago

If Device Plugin supported sharing a resource among pods, then DP could cover this problem. https://docs.google.com/document/d/1ZgKH_K4SEfdiE_OfxQ836s4yQWxZfSjS288Tq9YIWCA/edit?disco=AAAAB1hQAk8
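One way such sharing can be approximated with the existing API is for a plugin to advertise several device IDs that all resolve to the same host device node, so that several pods can each be allocated one "slot". The Go sketch below only shows that ID bookkeeping. The function name sharedDeviceIDs, the ID format, and the slot count are all assumptions made for illustration, not part of the Device Plugin API or of the linked proposal.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sharedDeviceIDs (hypothetical helper) returns a map from advertised
// device ID to host path. Every ID points at the same host device, so
// up to `slots` containers could be handed that device at once.
func sharedDeviceIDs(hostPath string, slots int) map[string]string {
	ids := make(map[string]string, slots)
	base := filepath.Base(hostPath) // e.g. "kvm" for "/dev/kvm"
	for i := 0; i < slots; i++ {
		ids[fmt.Sprintf("%s-%d", base, i)] = hostPath
	}
	return ids
}

func main() {
	for id, path := range sharedDeviceIDs("/dev/kvm", 3) {
		fmt.Printf("%s => %s\n", id, path)
	}
}
```

The slot count acts as an artificial cap on how many containers per node can use the device, which is a limitation of this approach for resources that are effectively unlimited.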

balboah commented 6 years ago

Another use case is accessing /dev/video0 to process webcam input, and other IoT things, without adding a privileged security context.

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

nikhita commented 6 years ago

/remove-lifecycle stale

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

resouer commented 5 years ago

/remove-lifecycle rotten

runningman84 commented 5 years ago

/remove-lifecycle stale

nobleXu commented 5 years ago

Another use case is /dev/pci_xxx, for attaching PCI devices on the host.

When can this issue be solved?

paulhodson commented 5 years ago

I'm working with device plugins and also wish for this feature. I have a varying number of devices per node, so the ability to map all devices of a type, /dev/abc*, would be ideal. Currently the device plugin container (and the other init containers it depends on) has to map /dev and requires privileged: true, which is not a good fit for secure systems.

OJFord commented 5 years ago

Another one is /dev/fuse, which with docker requires --device=/dev/fuse --cap-add=SYS_ADMIN, but in k8s today needs hostPath: { path: /dev/fuse } ; securityContext: { privileged: true }.

OJFord commented 5 years ago

/remove-lifecycle stale

wilmardo commented 4 years ago

/remove-lifecycle stale

ayanamist commented 4 years ago

/remove-lifecycle stale

OJFord commented 4 years ago

/remove-lifecycle stale

runningman84 commented 4 years ago

/remove-lifecycle stale

wilderbridge commented 3 years ago

See https://gitlab.com/arm-research/smarter/smarter-device-manager from ARM that probably solves most of the use cases mentioned in this issue.

pre commented 3 years ago

Thank you A LOT @tanskann !! This was the missing clue I was really looking for.

I wrote down how to mount /dev/fuse without privileged: true here: https://github.com/kubernetes/kubernetes/issues/7890#issuecomment-766088805

fejta-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten

instantlinux commented 3 years ago

My use case: a weather station that needs to talk to /dev/ttyUSB0.

/remove-lifecycle rotten

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

matthyx commented 3 years ago

/kind feature

k8s-triage-robot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten

asinitson commented 3 years ago

/remove-lifecycle rotten

This is still very applicable, especially in IoT use cases.

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

wilmardo commented 2 years ago

/remove-lifecycle rotten

This still has a use case and is not resolved.

leogr commented 2 years ago

Another compelling use case for this feature is Falco.

The official documentation has an example of running it with the principle of least privilege in Docker: https://falco.org/docs/getting-started/running/#docker-least-privileged

But it is not possible in K8s because of this missing feature.

benlongo commented 2 years ago

I also need access to USB devices.

/remove-lifecycle stale

leogr commented 2 years ago

/remove-lifecycle stale

j-fuentes commented 2 years ago

I believe this is still fresh. I think it is still only possible to access something like /dev/ttyUSB0 using privileged: true

/remove-lifecycle stale

pre commented 2 years ago

I believe this is still fresh. I think it is still only possible to access something like /dev/ttyUSB0 using privileged: true

The device-manager api allows mounting special devices without privileged:true.

At least this approach worked for /dev/fuse: https://github.com/kubernetes/kubernetes/issues/60748#issuecomment-766089063

zvonkok commented 1 year ago

FYI: https://github.com/container-orchestrated-devices/container-device-interface

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned