Closed: sanposhiho closed this issue 5 months ago
I’d like to focus on one extension point for a while, probably Filter or Score, and then iterate: performance test -> improve -> ...
The performance overhead is actually what I mind the most. The wasm communication between host and plugin had a pretty bad impact on my first PoC with go-plugins.
According to the wazero maintainers (ref1 ref2), the performance really depends on how we design the ABI. Also, I don't believe we can get the ABI perfect at once; we will need multiple iterations to build a better one.
Once one extension point is completed, we can proceed with other extension points then.
That's my thought on how to proceed.
Apart from how we proceed, I'd like to have a clear goal for this project.
From the functionality PoV,
And we should also have a goal for "the acceptable overhead". I know we cannot make it as fast as a native Golang scheduler plugin, but, for example, if the wasm extension is slower than the extender (= the webhook-based scheduler extension), that's definitely not good.
/cc @codefromthecrypt @salaboy @mathetake
who may be interested in this. Feel free to ping other people from the wasm side.
I'll create an initial one based on ^, let's discuss there.
/kind document
@sanposhiho: The label(s) kind/document
cannot be applied, because the repository doesn't have them.
/kind documentation
> I’d like to focus on one extension point for a while, probably Filter or Score.
I think we can start with one extension point, maybe Filter. If we can achieve the goal with the Filter extension point, we can carry that experience over to the other points.
@sanposhiho so the performance has improved pretty dramatically since your PoC. Do you feel that, as a first step, this should be rebased, or started from new?
I looked at this, from Filter https://github.com/kubernetes/kubernetes/compare/master...sanposhiho:kubernetes:sanposhiho/poc-wasm-scheduler-plugin#diff-5dd7d68066cc05247e43732c9754ed92ca3a7efc7a44f0c6ac3ee0fa05d97f91R46
How much information do we want to expose for Pod and NodeInfo? I think the demo was filtering on name only.

> How much information do we want to expose for Pod and NodeInfo? I think the demo was filtering on name only.
I don't know whether I understood you correctly, but the scheduler is a centralized component, so it should know everything about the cluster.

> I don't know whether I understood you correctly, but the scheduler is a centralized component, so it should know everything about the cluster.
I mean: which fields, and at what depth, should be exposed to the third-party code compiled to wasm that implements the filter.
@kerthcet so I think if everything could be required, that's OK; it's just that there's a lot of overhead in serializing the whole set of k8s model types for the plugin. It would be nice to hear folks comment on the commonly needed fields, since knowing them would let pulling the whole model into memory be the lazy worst case rather than the cost paid on every call.
Do you know common fields you use for filtering besides node and pod name which were in the POC?
Yes, @kerthcet is right. The upstream Kubernetes scheduler requires the data of
But, given that users may want to access other resources, ideally we eventually need to make this extension able to access all fields of any resource in the cluster. (Of course, at first, we can start by exposing very small stuff.)
(I'd like to move the discussion from the Slack thread for better visibility.)
My assumption is that serializing the whole pod spec on every request is basically not going to be OK: https://kubernetes.slack.com/archives/C09TP78DV/p1684495135570359?thread_ts=1684244179.981539&cid=C09TP78DV
I agree with this. As I said, a custom plugin may want to access not only the pod spec but also any field of any resource in the cluster. Passing all of that object data from host to wasm on every request is pretty expensive.
What I'm imagining is keeping the resource data (in memory, or in the virtual file system that @codefromthecrypt mentioned?) somewhere accessible from both host and guest, and keeping it up to date by watching resource changes in the cluster via an event handler. Then, instead of passing tons of resources to the guest every time, the host only needs to update an object when it changes. If we provide an interface for declaring which kinds of resources the guest needs, we can also reduce the amount of resources we need to manage.
> @kerthcet so I think if everything could be required, that's ok just there's a lot of overhead serializing the whole k8s model types available to the plugin. It could be nice to hear folks comment on the common fields needed as this could allow lazy worst case performance (pulling the whole model into memory) vs always worst case.
> Do you know common fields you use for filtering besides node and pod name which were in the POC?
It depends, but usually custom plugins depend heavily on pod annotations for configuration, and storage-specific plugins also need the pod volumes. But people may have different usages.
/priority important-soon
I couldn't find time last weekend, but I should be able to this weekend.
/close
@sanposhiho: Closing this issue.
Let's clarify the goal and how we proceed toward it. Eventually, I'd like to summarize both in a doc.
/assign @sanposhiho @kerthcet