kubernetes-sigs / wg-device-management

Prototypes and experiments for WG Device Management.
Apache License 2.0

PodSpec and more examples #11

Closed johnbelamaric closed 1 month ago

johnbelamaric commented 1 month ago

This PR proposes how we can resolve the issue of ensuring matchAttributes across all claims in a pod spec. See testdata/podspec.go for what the PodSpec might look like. But here's an explanation:

In 1.30, we have a list of named sources. The sources are oneOfs that could be either a claim name or a template name. The names are used to associate individual claims with containers.

In the prototype model, we are adding matchAttributes constraints to control consistency within a selection of devices. In particular, we want to be able to specify a matchAttributes constraint across two separate named sources, so that we can ensure, for example, that a GPU chosen for one container is the same model as one chosen for another container. This would imply we need matchAttributes that apply across the list present in PodSpec. However, we don't want to put things like matchAttributes into PodSpec, since it is already v1.
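
To make the constraint concrete, here is a minimal sketch of what checking matchAttributes over a set of chosen devices could look like. The `Device` type and the `matchAttributes` function are hypothetical and not taken from the prototype; treating a missing attribute as a failed match is also an assumption.

```go
package main

import "fmt"

// Device is a hypothetical stand-in for an allocatable device; the
// prototype's real types are not reproduced here.
type Device struct {
	Name       string
	Attributes map[string]string
}

// matchAttributes reports whether every device carries the same value for
// each listed attribute. Assumption: a device that lacks a listed attribute
// fails the match.
func matchAttributes(attrs []string, devices []Device) bool {
	for _, attr := range attrs {
		var want string
		for i, d := range devices {
			got, ok := d.Attributes[attr]
			if !ok {
				return false
			}
			if i == 0 {
				want = got
			} else if got != want {
				return false
			}
		}
	}
	return true
}

func main() {
	a := Device{Name: "gpu-0", Attributes: map[string]string{"model": "foozer-1000"}}
	b := Device{Name: "gpu-1", Attributes: map[string]string{"model": "foozer-1000"}}
	c := Device{Name: "gpu-2", Attributes: map[string]string{"model": "foozer-2000"}}

	fmt.Println(matchAttributes([]string{"model"}, []Device{a, b})) // true
	fmt.Println(matchAttributes([]string{"model"}, []Device{a, c})) // false
}
```

The check only makes sense over the full set of devices it constrains, which is why a constraint spanning two named sources cannot live inside either source alone.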

In this PR, instead of a list of named sources with each source being a oneOf, we have a single oneOf in the PodSpec. This oneOf could be:

The first of these allows our simplest use cases to be expressed very simply, without creating a secondary object to which we must then refer.

The second of these (template) allows claims which follow the lifecycle of the pod. Since a top-level API claim spec can contain multiple claim instances, this should be equally as expressive as "unrolling" the claimspec. A name has been added to those instances to allow them to be referred to in the containerspec.

The third (claimName) allows the user to share a pre-provisioned claim between pods.
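
As a rough illustration of the single oneOf described above, the sketch below models it as a Go struct with three optional fields and a check that exactly one is set. Everything here is hypothetical: the type names, the `Inline` field name for the first option, and the choice to treat `Template` as a name rather than an embedded spec are assumptions, not the contents of testdata/podspec.go.

```go
package main

import "fmt"

// ClaimSpec is a hypothetical placeholder for the body of a claim.
type ClaimSpec struct {
	Class string
}

// PodDeviceClaim sketches the single oneOf in the PodSpec.
// Exactly one of the three fields may be set.
type PodDeviceClaim struct {
	// Inline is a hypothetical name for the first option: a claim written
	// directly in the pod, with no secondary object to refer to.
	Inline *ClaimSpec
	// Template names a template (assumption); the resulting claim follows
	// the lifecycle of the pod.
	Template *string
	// ClaimName names a pre-provisioned claim that pods can share.
	ClaimName *string
}

// Validate enforces the oneOf rule: exactly one option must be chosen.
func (c PodDeviceClaim) Validate() error {
	set := 0
	if c.Inline != nil {
		set++
	}
	if c.Template != nil {
		set++
	}
	if c.ClaimName != nil {
		set++
	}
	if set != 1 {
		return fmt.Errorf("exactly one of inline, template, claimName must be set, got %d", set)
	}
	return nil
}

func main() {
	shared := "shared-gpus"
	tmpl := "gpu-template"

	ok := PodDeviceClaim{ClaimName: &shared}
	both := PodDeviceClaim{ClaimName: &shared, Template: &tmpl}
	none := PodDeviceClaim{}

	fmt.Println(ok.Validate() == nil)   // true
	fmt.Println(both.Validate() == nil) // false
	fmt.Println(none.Validate() == nil) // false
}
```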

k8s-ci-robot commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnbelamaric

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/wg-device-management/blob/main/OWNERS)~~ [johnbelamaric]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

thockin commented 1 month ago

I think I get it. Since there's only one external claim, you can just refer to the named stanzas within, right? I'm a little concerned about the idea that a pod can only reference one external claim, though. Is that realistic? It's different than volumes, for sure.

I'm OK to move ahead. If this is the only smell left, I am probably willing to hold my nose. We have enough other things to pin down in the meantime.

johnbelamaric commented 1 month ago

Ok, thanks @thockin. I am going to merge so we can iterate more. That will let others submit PRs against this more easily if they want.

I will add this as an open question, but I am gathering some others too which I will add in subsequent PRs.

klueska commented 1 month ago

@thockin @johnbelamaric I think we still need a list of deviceClaims per pod. Otherwise we have no way to include some set of devices that match on an attribute and some that don't. That said, I think this would be made clearer if the objects included here weren't called claims, but instead something like 'ClaimGroup' and then what we currently call devices inside these objects were the actual Claims themselves.

So translating / expanding on @johnbelamaric's example here: https://github.com/kubernetes-sigs/wg-device-management/blob/main/k8srm-prototype/testdata/pod-two-containers-two-gpus-1.yaml

```yaml
---
# Group 1: two GPUs that must share the same model.
apiVersion: devmgmtproto.k8s.io/v1alpha1
kind: DeviceClaimGroup
metadata:
  name: example.com-foozer-two-separate-gpus-same-model
  namespace: default
spec:
  matchAttributes:
  - model
  claims:
  - name: foozer-gpu
    allOf:
    - class: example.com-foozer
  - name: other-foozer-gpu
    allOf:
    - class: example.com-foozer

---
# Group 2: a third GPU with its own selector and no matchAttributes constraint.
apiVersion: devmgmtproto.k8s.io/v1alpha1
kind: DeviceClaimGroup
metadata:
  name: example.com-foozer-third-gpu-specific-model
  namespace: default
spec:
  claims:
  - name: third-foozer-gpu
    allOf:
    - class: example.com-foozer
      selector: "device.model == 'latest'"

---
apiVersion: v1
kind: Pod
metadata:
  name: foozer
  namespace: default
spec:
  containers:
  - image: registry.k8s.io/pause:3.6
    name: my-container
    resources:
      requests:
        cpu: 10m
        memory: 10Mi
    # Containers refer to claims by the names defined inside the groups.
    devices:
    - name: foozer-gpu
  - image: registry.k8s.io/pause:3.6
    name: my-other-container
    resources:
      requests:
        cpu: 10m
        memory: 10Mi
    devices:
    - name: other-foozer-gpu
  # One list entry per claim group.
  deviceClaims:
  - template:
      claimName: example.com-foozer-two-separate-gpus-same-model
  - template:
      claimName: example.com-foozer-third-gpu-specific-model
```
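
With a list of groups per pod, a container's device name has to be resolved across all referenced groups rather than within a single claim. The sketch below shows one way that lookup might work. The `Claim` and `ClaimGroup` types and the `resolve` function are hypothetical, and the rule that a claim name must be unique across the pod's groups is an assumption this thread does not settle.

```go
package main

import "fmt"

// Claim and ClaimGroup are hypothetical mirrors of the YAML example above.
type Claim struct {
	Name  string
	Class string
}

type ClaimGroup struct {
	Name            string
	MatchAttributes []string
	Claims          []Claim
}

// resolve returns the name of the group that owns the given claim name.
// Assumption: a name that is missing, or that appears more than once across
// the pod's groups, is an error.
func resolve(groups []ClaimGroup, name string) (string, error) {
	owner := ""
	found := false
	for _, g := range groups {
		for _, c := range g.Claims {
			if c.Name != name {
				continue
			}
			if found {
				return "", fmt.Errorf("claim %q is ambiguous: found in %q and %q", name, owner, g.Name)
			}
			owner, found = g.Name, true
		}
	}
	if !found {
		return "", fmt.Errorf("claim %q not found in any group", name)
	}
	return owner, nil
}

func main() {
	groups := []ClaimGroup{
		{
			Name:            "example.com-foozer-two-separate-gpus-same-model",
			MatchAttributes: []string{"model"},
			Claims: []Claim{
				{Name: "foozer-gpu", Class: "example.com-foozer"},
				{Name: "other-foozer-gpu", Class: "example.com-foozer"},
			},
		},
		{
			Name: "example.com-foozer-third-gpu-specific-model",
			Claims: []Claim{
				{Name: "third-foozer-gpu", Class: "example.com-foozer"},
			},
		},
	}

	owner, err := resolve(groups, "other-foozer-gpu")
	fmt.Println(owner, err)

	_, err = resolve(groups, "missing-gpu")
	fmt.Println(err != nil) // true
}
```

Under the single-claim design discussed earlier in the thread, this lookup collapses to searching the named stanzas of one claim, so the ambiguity case cannot arise.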