**johnbelamaric** opened this pull request 3 months ago (status: Open)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: johnbelamaric
> to reduce the size of the objects

That helps reduce the size on average, but for the worst-case analysis, which determines the limits of the slices, we have to assume that the new `Attributes` field is fully populated and that all devices have the maximum number of attributes.
Yes. True.
I am putting together several options; see the comment in #20.
Option 4 has some nesting. That's #31. It is much more efficient than this one.
We could do more levels, but it's not clear the payoff is there.
Option 3 makes common attributes. Option 4 makes common attributes AND common partition "map".
I 100% agree we should have shared attributes. My first prototype had it, but at the time you said "let's not prematurely optimize for size" so we dropped it.
I'm still not sold on nesting though. I also had it originally (in my "recursive" device model), but you all (rightly) convinced me to drop it. And now, after working with flat devices and updating both the example driver and the NVIDIA GPU driver to adhere to them, I'm really happy with the flexibility that a flat model brings us.
I really don't think it buys us much to have nesting, and I have a strong feeling it will come back to bite us fairly quickly.
I was continuing on the KEP PR, but detoured to these options, so I will copy a comment:
I just find it super weird to have gpu1's shared items consumable from gpu0. This is the thing that is setting me on edge. That implies to me a level of fungibility which doesn't exist. There is a grouping that is smaller than a ResourceSlice but bigger than a Device, and we are not modelling it. Call it a "card" for the moment. A slice doesn't have shared resources, a "card" does. Now you'll probably tell me "actually, one card can borrow resources on another card". In fact, I can already see the (hypothetical) use-case for a channelized <something>
which can be effectively RAID'ed into a larger logical device. But that's not this, and (TTBOMK) that doesn't exist yet.
I really can see both sides, and I don't mean to be dogmatic. It just smells funny. Let's keep the conversation overall moving forward, and if this is all that's left, we can hash it out.
> My first prototype had it, but at the time you said "let's not prematurely optimize for size" so we dropped it.
Yeah, we dropped a LOT to get the baseline, and bringing partitions back makes it clear to me that this is one piece that really does make sense.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:

- After a period of inactivity, `lifecycle/stale` is applied
- After a further period of inactivity once `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After a further period of inactivity once `lifecycle/rotten` was applied, the PR is closed

You can:

- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
This is an evolution of the partitionable model defined in #27, which moves common attributes up to the pool level to reduce the size of the objects.