Closed johnbelamaric closed 1 month ago
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: johnbelamaric
I think I get it. Since there's only one external claim, you can just refer to the named stanzas within, right? I'm a little concerned about the idea that a pod can only reference one external claim, though. Is that realistic? It's different than volumes, for sure.
I'm OK to move ahead. If this is the only smell left, I am probably willing to hold my nose. We have enough other things to pin down in the meantime.
Ok, thanks @thockin. I am going to merge so we can iterate more. That will let others submit PRs against this more easily if they want.
I will add this as an open question, but I am gathering some others too which I will add in subsequent PRs.
@thockin @johnbelamaric I think we still need a list of deviceClaims per pod. Otherwise we have no way to include some set of devices that match on an attribute and some that don't. That said, I think this would be made clearer if the objects included here weren't called claims, but instead something like 'ClaimGroup', and then what we currently call devices inside these objects were the actual Claims themselves.
So translating / expanding on @johnbelamaric's example here: https://github.com/kubernetes-sigs/wg-device-management/blob/main/k8srm-prototype/testdata/pod-two-containers-two-gpus-1.yaml
---
apiVersion: devmgmtproto.k8s.io/v1alpha1
kind: DeviceClaimGroup
metadata:
  name: example.com-foozer-two-separate-gpus-same-model
  namespace: default
spec:
  matchAttributes:
  - model
  claims:
  - name: foozer-gpu
    allOf:
    - class: example.com-foozer
  - name: other-foozer-gpu
    allOf:
    - class: example.com-foozer
---
apiVersion: devmgmtproto.k8s.io/v1alpha1
kind: DeviceClaimGroup
metadata:
  name: example.com-foozer-third-gpu-specific-model
  namespace: default
spec:
  claims:
  - name: third-foozer-gpu
    allOf:
    - class: example.com-foozer
      selector: "device.model == 'latest'"
---
apiVersion: v1
kind: Pod
metadata:
  name: foozer
  namespace: default
spec:
  containers:
  - image: registry.k8s.io/pause:3.6
    name: my-container
    resources:
      requests:
        cpu: 10m
        memory: 10Mi
    devices:
    - name: foozer-gpu
  - image: registry.k8s.io/pause:3.6
    name: my-other-container
    resources:
      requests:
        cpu: 10m
        memory: 10Mi
    devices:
    - name: other-foozer-gpu
  deviceClaims:
  - template:
      claimName: example.com-foozer-two-separate-gpus-same-model
  - template:
      claimName: example.com-foozer-third-gpu-specific-model
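To make the intended semantics of matchAttributes within one group concrete, here is a minimal Go sketch of how a candidate allocation could be checked. The Device type and the satisfiesMatchAttributes function are hypothetical names invented for this illustration; they are not taken from the prototype, and treating a missing attribute as a failed match is an assumption.

```go
package main

import "fmt"

// Device is a hypothetical stand-in for an allocatable device and its attributes.
type Device struct {
	Name       string
	Attributes map[string]string
}

// satisfiesMatchAttributes reports whether every device in the candidate
// allocation carries the same value for each attribute in matchAttributes.
// Assumption: a device that lacks a listed attribute fails the check.
func satisfiesMatchAttributes(matchAttributes []string, allocated []Device) bool {
	for _, attr := range matchAttributes {
		var want string
		for i, d := range allocated {
			got, ok := d.Attributes[attr]
			if !ok {
				return false
			}
			if i == 0 {
				want = got // first device sets the value the rest must match
				continue
			}
			if got != want {
				return false
			}
		}
	}
	return true
}

func main() {
	a := Device{Name: "gpu-0", Attributes: map[string]string{"model": "latest"}}
	b := Device{Name: "gpu-1", Attributes: map[string]string{"model": "latest"}}
	c := Device{Name: "gpu-2", Attributes: map[string]string{"model": "older"}}

	fmt.Println(satisfiesMatchAttributes([]string{"model"}, []Device{a, b})) // true
	fmt.Println(satisfiesMatchAttributes([]string{"model"}, []Device{a, c})) // false
	fmt.Println(satisfiesMatchAttributes(nil, []Device{a, c}))               // true: no constraint
}
```

Under this reading, the first group above constrains foozer-gpu and other-foozer-gpu to the same model, while the second group has no matchAttributes and so places no cross-claim constraint on third-foozer-gpu.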
This PR proposes how we can resolve the issue of ensuring matchAttributes across all claims in a pod spec. See testdata/podspec.go for what the PodSpec might look like. But here's an explanation:
In 1.30, we have a list of named sources. The sources are oneOfs that could be either a claim name or a template name. The names are used to associate individual claims with containers.
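As a rough illustration of the 1.30 shape described above, the following Go sketch uses simplified local mirrors of the pod-level types and shows how a name used by a container resolves to a source. The field names follow my reading of the 1.30 API and should be checked against it; resolveSource and strPtr are hypothetical helpers written for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ClaimSource is a simplified mirror of the 1.30 oneOf:
// exactly one of the two fields is expected to be set.
type ClaimSource struct {
	ResourceClaimName         *string
	ResourceClaimTemplateName *string
}

// PodResourceClaim is one named entry in the pod-level list.
type PodResourceClaim struct {
	Name   string
	Source ClaimSource
}

// resolveSource (hypothetical) looks up the pod-level entry that a container
// refers to by name and reports which kind of source backs it.
func resolveSource(claims []PodResourceClaim, name string) (string, error) {
	for _, c := range claims {
		if c.Name != name {
			continue
		}
		switch {
		case c.Source.ResourceClaimName != nil && c.Source.ResourceClaimTemplateName != nil:
			return "", errors.New("source sets both a claim name and a template name")
		case c.Source.ResourceClaimName != nil:
			return "claim/" + *c.Source.ResourceClaimName, nil
		case c.Source.ResourceClaimTemplateName != nil:
			return "template/" + *c.Source.ResourceClaimTemplateName, nil
		default:
			return "", errors.New("source sets neither a claim name nor a template name")
		}
	}
	return "", fmt.Errorf("no pod-level claim named %q", name)
}

func strPtr(s string) *string { return &s }

func main() {
	claims := []PodResourceClaim{
		{Name: "gpu", Source: ClaimSource{ResourceClaimTemplateName: strPtr("gpu-template")}},
		{Name: "shared", Source: ClaimSource{ResourceClaimName: strPtr("shared-gpu")}},
	}
	for _, n := range []string{"gpu", "shared", "missing"} {
		src, err := resolveSource(claims, n)
		fmt.Println(n, src, err)
	}
}
```

Each list entry resolves independently of the others, which is why a constraint spanning two entries has no natural home in this shape.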
In the prototype model, we are adding matchAttributes constraints to control consistency within a selection of devices. In particular, we want to be able to specify a matchAttributes constraint across two separate named sources, so that we can ensure, for example, that a GPU chosen for one container is the same model as one chosen for another container. This would imply we need matchAttributes that apply across the list present in PodSpec. However, we don't want to put things like matchAttributes into PodSpec, since it is already v1.
In this PR, instead of a list of named sources, with each source being a oneOf, we have a single oneOf in the PodSpec. This oneOf has three options, described in turn below.
The first of these keeps our simplest use cases very simple to express, without creating a secondary object to which we must then refer.
The second of these (template) allows claims which follow the lifecycle of the pod. Since a top-level API claim spec can contain multiple claim instances, this should be just as expressive as "unrolling" the claimspec. A name has been added to those instances to allow them to be referred to in the containerspec.
The third (claimName) allows the user to share a pre-provisioned claim between pods.
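A minimal Go sketch of the single oneOf might look like the following. This is not the content of testdata/podspec.go: every type and field name here is an assumption made for illustration. In particular, the first option is not named in the description above, so the sketch calls it Inline purely as a placeholder.

```go
package main

import "fmt"

// ClaimInstance is a hypothetical named claim instance; the name is what
// a containerspec would use to refer to it.
type ClaimInstance struct {
	Name string
}

// ClaimSpec is a hypothetical claim spec holding several named instances
// plus constraints that apply across them.
type ClaimSpec struct {
	MatchAttributes []string
	Claims          []ClaimInstance
}

// PodDeviceClaims is a hypothetical single oneOf at the PodSpec level.
type PodDeviceClaims struct {
	Inline    *ClaimInstance // placeholder name for the first option
	Template  *ClaimSpec     // claims that follow the pod's lifecycle
	ClaimName *string        // pre-provisioned claim shared between pods
}

// selectedOption returns which option is set, or an error unless exactly one is.
func selectedOption(dc PodDeviceClaims) (string, error) {
	var set []string
	if dc.Inline != nil {
		set = append(set, "inline")
	}
	if dc.Template != nil {
		set = append(set, "template")
	}
	if dc.ClaimName != nil {
		set = append(set, "claimName")
	}
	if len(set) != 1 {
		return "", fmt.Errorf("exactly one option must be set, got %d", len(set))
	}
	return set[0], nil
}

// hasInstance reports whether a container may refer to name within spec.
func hasInstance(spec ClaimSpec, name string) bool {
	for _, c := range spec.Claims {
		if c.Name == name {
			return true
		}
	}
	return false
}

func main() {
	tmpl := ClaimSpec{
		MatchAttributes: []string{"model"},
		Claims:          []ClaimInstance{{Name: "foozer-gpu"}, {Name: "other-foozer-gpu"}},
	}
	dc := PodDeviceClaims{Template: &tmpl}

	opt, err := selectedOption(dc)
	fmt.Println(opt, err)
	fmt.Println(hasInstance(tmpl, "other-foozer-gpu"))
}
```

Because matchAttributes lives on the claim spec rather than on PodSpec, the cross-container constraint can be expressed without adding new constraint fields to the v1 type. The trade-off raised earlier in the thread still applies: with a single oneOf, a pod refers to only one external claim.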