kubernetes-sigs / kube-scheduler-wasm-extension

All the things to make the scheduler extendable with wasm.
Apache License 2.0

Initial proof-of-concept plugin implementation #6

Closed codefromthecrypt closed 1 year ago

codefromthecrypt commented 1 year ago

What type of PR is this?

/kind feature

What this PR does / why we need it:

This adds an implementation that requires an SDK. For example, the guest must export the function "filter".

Note: Go won't work with this until v1.22, because it doesn't yet support //go:wasmexport, even though TinyGo works today.
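
For illustration, a minimal TinyGo guest could look like the sketch below. Only the export mechanism is the point; the zero-argument signature and the status-code convention are assumptions, since the real SDK/ABI defines the actual contract.

package main

// filter is looked up by the host via its exported name. In TinyGo, the
// //export directive makes it visible. Returning 0 stands in for "Success"
// here; the real ABI defines the actual parameters and return convention.
//
//export filter
func filter() uint32 {
    return 0
}

// main is required to compile a TinyGo wasm module, even though the host
// only invokes the exported filter function.
func main() {}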

Which issue(s) this PR fixes:

n/a

Special notes for your reviewer:

This code is in the proof-of-concept phase, as merging will allow more people to participate. Shortly after, we should set up CI to ensure it continues to compile and run.

Conversion is still required. For example, we are manually choosing fields to convert from v1.PodSpec, because v1.PodSpec.Marshal isn't wire-compatible with protoapi.IoK8SApiCoreV1PodSpec.UnmarshalVT.
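
A hedged sketch of that manual conversion (the protoapi import path and the choice of fields are assumptions; the real code may copy more):

import (
    v1 "k8s.io/api/core/v1"

    // assumed import path for the generated protos
    protoapi "sigs.k8s.io/kube-scheduler-wasm-extension/kubernetes/proto/api"
)

// toProto copies only the fields the proof-of-concept needs from the
// k8s.io/api value type into the generated proto type, because the two
// cannot round-trip through Marshal/UnmarshalVT.
func toProto(spec *v1.PodSpec) *protoapi.IoK8SApiCoreV1PodSpec {
    return &protoapi.IoK8SApiCoreV1PodSpec{NodeName: spec.NodeName}
}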

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilter
BenchmarkFilter/success:_node_is_match_with_spec.NodeName
BenchmarkFilter/success:_node_is_match_with_spec.NodeName-12               92158         12846 ns/op
PASS
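
For context, a benchmark along the lines of the sketch below drives one Filter call per iteration; newTestPlugin and the fixtures are hypothetical stand-ins for the real test setup:

import (
    "context"
    "testing"
)

func BenchmarkFilter(b *testing.B) {
    // Hypothetical setup: compile and instantiate the wasm guest once, then
    // reuse it across iterations with a pod whose spec.NodeName matches.
    plugin, pod, nodeInfo := newTestPlugin(b)

    b.Run("success: node is match with spec.NodeName", func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            status := plugin.Filter(context.Background(), nil, pod, nodeInfo)
            if !status.IsSuccess() {
                b.Fatal(status.AsError())
            }
        }
    })
}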

Does this PR introduce a user-facing change?

n/a as unreleased

linux-foundation-easycla[bot] commented 1 year ago

CLA Signed

The committers listed above are authorized under a signed CLA.

k8s-ci-robot commented 1 year ago

Welcome @codefromthecrypt!

It looks like this is your first PR to kubernetes-sigs/kube-scheduler-wasm-extension 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/kube-scheduler-wasm-extension has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

codefromthecrypt commented 1 year ago

PS: while this doesn't actually do any protobuf decoding (for the help-wanted reasons mentioned in the PR description), it does work, with the base overhead below, which includes the time to initialize the module in TinyGo.

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilter
BenchmarkFilter/success:_node_is_match_with_spec.NodeName
BenchmarkFilter/success:_node_is_match_with_spec.NodeName-12               29073         45341 ns/op
PASS

In this use case, TinyGo is an order of magnitude faster than the work-in-progress Go 1.21 (below). This is expected, as Go's wasip1 support is a first cut; I'm only testing it to show that it works.

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilter
BenchmarkFilter/success:_node_is_match_with_spec.NodeName
BenchmarkFilter/success:_node_is_match_with_spec.NodeName-12                 294       4098859 ns/op
PASS
codefromthecrypt commented 1 year ago

I added a commit to show protobuf on each side, but it uses the unreleased GOOS=wasip1, as TinyGo (even dev) doesn't currently compile this code. If anyone knows how to avoid the issues, or can help raise them with TinyGo, that would be much appreciated, since performance with GOOS=wasip1 isn't good.
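
For reference, the Go 1.21 variant builds with the then-unreleased wasip1 port (via gotip), roughly like this; the package path is illustrative:

GOOS=wasip1 GOARCH=wasm go build -o guest.wasm ./guest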

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilter
BenchmarkFilter/success:_node_is_match_with_spec.NodeName
BenchmarkFilter/success:_node_is_match_with_spec.NodeName-12                 103      10949407 ns/op
PASS
sanposhiho commented 1 year ago

PTAL at https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/pull/7 to see whether it helps you or not. (I'll push the commit to this branch if you want, instead of keeping the separate PR.)

codefromthecrypt commented 1 year ago

I added commits from #7, thanks. What's left before we can proceed to the ABI:

  1. Running make in plugin/testdata never responds on my host. I wonder if it is taking a very long time to compile, or something else. If someone can try it, that would verify whether TinyGo actually works now.
  2. In plugin/vfs/plugin.go I have the code below, though I don't know whether the v1.PodSpec type is already in a format decodable by the protos generated for TinyGo. The problem is that since I can't complete the guest, I can't test it. If we do need to convert first, I hope we can do so automatically, because the types are very big.
    case "pod/spec":
        // TODO msg := convert(*v1.PodSpec)
        var msg protoapi.IoK8SApiCoreV1PodSpec
        marshaller = func() ([]byte, error) {
            return proto.Marshal(&msg)
        }

I'd appreciate a hand if anyone can confirm the above.

codefromthecrypt commented 1 year ago

OK, current status: on @evacchi's advice, I added github.com/planetscale/vtprotobuf/cmd/protoc-gen-go-vtproto to generate the UnmarshalVT functions. Sadly, things still don't compile in TinyGo.

TinyGo hangs while compiling, possibly due to https://github.com/tinygo-org/tinygo/issues/3653#issuecomment-1529847344

codefromthecrypt commented 1 year ago

I switched from planetscale/vtprotobuf to @knqyf263's https://github.com/knqyf263/go-plugin to generate the protos (despite no service being defined). This gets around the TinyGo hang during compilation, so we can proceed!
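
For reference, generation with go-plugin happens through its protoc plugin, along these lines (an assumption following go-plugin's documented pattern; the Makefile's exact flags may differ):

protoc --go-plugin_out=. --go-plugin_opt=paths=source_relative kubernetes.proto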

codefromthecrypt commented 1 year ago

I have the proof-of-concept working now. It has two approaches, VFS and ABI, and I updated the description to explain both. Here is an example run of benchmarks comparing the two:

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilterABI
BenchmarkFilterABI/success:_node_is_match_with_spec.NodeName
BenchmarkFilterABI/success:_node_is_match_with_spec.NodeName-12               176182          6659 ns/op         184 B/op          6 allocs/op
BenchmarkFilterVFS
BenchmarkFilterVFS/success:_node_is_match_with_spec.NodeName
BenchmarkFilterVFS/success:_node_is_match_with_spec.NodeName-12                27487         43850 ns/op      156675 B/op         88 allocs/op
PASS
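
To make the "abi" side concrete: the guest calls a host-provided import that writes the serialized pod spec into guest memory, then decodes it. A minimal hedged sketch, where the module and function names are hypothetical rather than the PR's actual ABI:

package imports

import "unsafe"

// podSpec is implemented by the host: it writes up to limit bytes of the
// serialized PodSpec at ptr and returns the number of bytes written.
//
//go:wasm-module scheduler
//export pod_spec
func podSpec(ptr, limit uint32) uint32

// PodSpec returns the serialized pod spec for the guest to decode.
func PodSpec() []byte {
    buf := make([]byte, 4096) // illustrative fixed size; real code would grow and retry
    size := podSpec(uint32(uintptr(unsafe.Pointer(&buf[0]))), uint32(len(buf)))
    return buf[:size]
}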
kerthcet commented 1 year ago

Thanks @codefromthecrypt I'll take a look today.

sanposhiho commented 1 year ago

@codefromthecrypt I haven't looked into the impl, but has this been solved by something then? https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/pull/6#discussion_r1199854277

codefromthecrypt commented 1 year ago

@sanposhiho

I haven't looked into the impl, but has this been solved by something then? https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/pull/6#discussion_r1199854277

Nothing has changed on this point. I still want to know why the types aren't compatible. I'm using a similar approach to your wasm POC, converting just an arbitrary field to make it pass. I hope conversion is properly addressed at some point, even after merge.

codefromthecrypt commented 1 year ago

Also, I want to note that I will improve the unit test coverage, but I don't want to do that until the PR's main code is accepted; otherwise it could be a lot of work just to throw away.

codefromthecrypt commented 1 year ago

Update: I spent even more hours trying to find any approach to externally generating Go source that matches what's done in k8s.io/api/core/v1.

TL;DR: this should be raised as its own issue, as data type conversion is a small part of the work, though it is required for any serious use.

The problem is that our Makefile generates wire-incompatible source, so the types are only useful as value types, not as types for marshaling. For example, the k8s library's ServiceAccount has different field numbers on the wire than our IoK8sApiCoreV1ServiceAccount.
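
To illustrate the mismatch (the field numbers below are made up), two schemas that disagree on a field's number parse each other's bytes without error, but the data lands in an unknown field:

package main

import (
    "fmt"

    "google.golang.org/protobuf/encoding/protowire"
)

func main() {
    // The writer's schema encodes NodeName as field number 10.
    var b []byte
    b = protowire.AppendTag(b, 10, protowire.BytesType)
    b = protowire.AppendString(b, "node-a")

    // A reader whose schema puts NodeName at field number 3 sees field 10
    // as unknown, so its NodeName silently stays empty.
    num, typ, _ := protowire.ConsumeTag(b)
    fmt.Printf("got field %d (wire type %d), wanted field 3\n", num, typ)
}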

I also looked at using JSON instead, but that brings in a different set of compilation problems, as well as the performance cost of converting back and forth from JSON.

The above is a more elaborate version of this comment, after hours spent trying to get it to work, or to find any project that has succeeded without re-using the k8s libraries or JSON.

codefromthecrypt commented 1 year ago

PS: if we have to manually convert anyway, I would strongly suggest moving to Karmem, which was designed for fast wasm.

Doing exactly the same as this PR, the "abi" guest wasm shrinks to 8KB (from 330KB), and the request time, even without pooling, is <1us vs. 6.67us.

I suspect even if we flesh out the model, it will still be a lot more efficient.

$ cat kubernetes/karmem/kubernetes.km 
karmem api;

struct Node table {
    Name []char;
}

struct NodeInfo table {
    Node Node;
}

struct PodSpec table {
    NodeName []char;
}

struct Pod table {
    Spec PodSpec;
}

Then using the objects built with karmem build --golang -o api kubernetes.km

guest decoding change

-func (pod) Spec() *protoapi.IoK8SApiCoreV1PodSpec {
+func (pod) Spec() *karmemapi.PodSpec {
        b := imports.PodSpec()
-       var msg protoapi.IoK8SApiCoreV1PodSpec
-       if err := msg.UnmarshalVT(b); err != nil {
-               panic(err)
-       }
-       return &msg
+       podSpec := new(karmemapi.PodSpec)
+       podSpec.ReadAsRoot(karmem.NewReader(b))
+       return podSpec
 }

host encoding change

-               // TODO v.pod.Spec.Marshal is incompatible, find a way to automatically
-               // convert *v1.PodSpec to protoapi.IoK8SApiCoreV1PodSpec
-               var msg protoapi.IoK8SApiCoreV1PodSpec
-               msg.NodeName = v.pod.Spec.NodeName
-               marshaller = msg.MarshalVT
+               marshaller = func() ([]byte, error) {
+                       writer := karmem.NewWriter(1024)
+
+                       content := karmemapi.PodSpec{
+                               NodeName: v.pod.Spec.NodeName,
+                       }
+
+                       if _, err := content.WriteAsRoot(writer); err != nil {
+                               return nil, err
+                       }
+                       return writer.Bytes(), nil
+               }

Again, this doesn't even pool readers or writers.
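
For what pooling could look like, a sketch using sync.Pool, assuming karmem's Writer is reusable via Reset and that karmemapi is the package generated from the schema above:

import (
    "sync"

    karmem "karmem.org/golang"

    // assumed path for the code generated from the schema above
    karmemapi "sigs.k8s.io/kube-scheduler-wasm-extension/kubernetes/karmem/api"
)

var writerPool = sync.Pool{
    New: func() any { return karmem.NewWriter(1024) },
}

func marshalPodSpec(nodeName string) ([]byte, error) {
    writer := writerPool.Get().(*karmem.Writer)
    defer func() {
        writer.Reset() // assumed reuse API; otherwise allocate per call
        writerPool.Put(writer)
    }()

    content := karmemapi.PodSpec{NodeName: nodeName}
    if _, err := content.WriteAsRoot(writer); err != nil {
        return nil, err
    }
    // Copy out: the writer's buffer is reused once returned to the pool.
    return append([]byte(nil), writer.Bytes()...), nil
}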

sanposhiho commented 1 year ago

One thing that concerns me about Karmem is its limited language support. If we decide to use Karmem, doesn't it mean the guest has to be implemented in a language Karmem supports? (If we consider Golang support only for a while, then it's not a problem, though.)

codefromthecrypt commented 1 year ago

@sanposhiho TL;DR: don't worry about Karmem for the first round. The real problem is the size of the model: even in Karmem, once I make the model complete, each request takes a long time.

I have an update, as I manually converted the entire model to Karmem locally, to make a fair comparison. Basically, there is not much time difference for what we are doing now. The huge model is the problem in the current benchmark, much more than the specific marshaller. Compared to the results in the PR description, you can see the baseline overhead is less, but not a huge amount less than proto. The wasm is smaller (96KB), though that may also be due to stripping all the RPC code out of it.

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/plugin
BenchmarkFilterABI
BenchmarkFilterABI/success:_node_is_match_with_spec.NodeName
BenchmarkFilterABI/success:_node_is_match_with_spec.NodeName-12               183386          6450 ns/op
BenchmarkFilterVFS
BenchmarkFilterVFS/success:_node_is_match_with_spec.NodeName
BenchmarkFilterVFS/success:_node_is_match_with_spec.NodeName-12                26743         44040 ns/op
PASS
codefromthecrypt commented 1 year ago

Some suggestions after spending many hours on the pros and cons of serialization: the main thing is that I think we should move ahead and not get too caught in the weeds for the first code. In particular, there are many areas of optimization (host side, guest side, etc.), as well as conversion and modeling work, but these only become viable once an initial implementation is in.

codefromthecrypt commented 1 year ago

PS: the reason I feel it is OK to remove "vfs" and simplify is that it carries an order of magnitude more base overhead in TinyGo, for little purpose. VFS existed for Go 1.21, which doesn't yet support function exports. I think GOOS=wasip1 (Go 1.21's compilation target) is too slow to use (millisecond scale vs. tens of microseconds in TinyGo); its wasm is also huge, and there are fewer settings to shrink it. I think this will all get better in 1.22, and that version should also support the ABI approach. So it was good for learning, but not viable IMHO.

Basically, I think we can take Go 1.21 out of scope and use TinyGo with only the "abi" approach. Less to document and test.
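
For comparison, TinyGo has several knobs to shrink and speed up the guest that GOOS=wasip1 currently lacks; the combination below is illustrative, not the project's actual build line:

tinygo build -o plugin.wasm -target=wasi -scheduler=none -no-debug ./guest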

kerthcet commented 1 year ago

Hey guys, I have to confess I'm not quite familiar with wasm tech right now, but I'm learning hard and trying my best to understand it by reading the docs and designs and practicing locally. My question is: is this PR the minimal one? It's quite large and hard to review; or is that just how wasm works? And can we have some docs, or a plan for them? Future readers will benefit from this as well.

My thought is that we can have a minimal version and merge it ASAP; then we can build more things upon it. Does this make sense?

Anyway, really impressive work from @codefromthecrypt. I'm actually learning from your YouTube videos.

sanposhiho commented 1 year ago

@codefromthecrypt

(I forgot to put the comment here as well.) I'm OK with removing the VFS stuff from here. We can revisit the VFS idea in the future if we need to.

I think we should move ahead and not get too caught in the weeds for the first code.

Also, I agree with this. Instead of stopping here, it's better to proceed with the current only option, Karmem, and focus on the main stuff. We can think more about this after some iteration, since time will hopefully solve some of it (via a new TinyGo version, official Golang wasi support, etc.).

I have to confess I'm not quite familiar with wasm tech right now

@kerthcet

No worries, me neither 🙃 For me, reading the http-wasm implementation helped a lot in getting an overview of what we want to do in this project, as an example: https://http-wasm.io/http-handler-abi/

My question is is this PR the minimalist one, it's quite large and hard to review, or that's how wasm works? And can we have some docs or do we have any plan, future readers will benefit from this as well.

I agree about the docs. It'd be great if this PR got some brief docs plus an example guest implementation, @codefromthecrypt, but I'd say they can be brief for now; the implementation is, either way, kind of a PoC/draft even after merging, and we can grow them as we grow the implementation.

Regarding the amount of code, this PR will get smaller after the VFS removal (and we can probably remove the proto stuff as well?). After that, it looks reasonable to me to review/merge it without splitting into smaller PRs; splitting into separate guest/host PRs wouldn't make it easier to read, because we want to read both implementations at the same time anyway.

My thought is we can have a minimalism version and let's merge it ASAP, then we can build more things upon it. Does this make sense?

👍 We may come up with further improvements like my rough idea above; we can pile up such optimizations later.

kerthcet commented 1 year ago

And about kubernetes.proto: it seems we contain the whole set of native API objects, am I right? We also hit the ServiceAccount problem @codefromthecrypt mentioned above (I have no idea whether this is resolved or not), but we can start with a subset, because for the scheduler the full set is far too much. I can think of the necessary API objects for a start.

Can we trim the proto file? Where did we get it? @sanposhiho

codefromthecrypt commented 1 year ago

I will pare this down to the minimum ABI. PS I plan to continue to use proto, not Karmem, for the time being. Both work, and we can swap proto for Karmem later if we like. I think the best way to start though is with familiar tooling, even if a little less efficient.

kerthcet commented 1 year ago

I will pare this down to the minimum ABI. PS I plan to continue to use proto, not Karmem, for the time being. Both work, and we can swap proto for Karmem later if we like. I think the best way to start though is with familiar tooling, even if a little less efficient.

Agreed. We should also consider codebase dependency problems, just in case some of these projects stop being maintained in the future.

codefromthecrypt commented 1 year ago

thanks @kerthcet. I assume we are on the same page, so I'll backfill tests etc and ping back for final review.

k8s-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: codefromthecrypt, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/blob/main/OWNERS)~~ [sanposhiho]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.
sanposhiho commented 1 year ago

/hold

sanposhiho commented 1 year ago

/unhold /remove-approval

OK, I learned that an approver's approval is regarded as /approve. 😅

sanposhiho commented 1 year ago

/hold

kerthcet commented 1 year ago

It's merged.... But at least we have a base codebase to build on.

sanposhiho commented 1 year ago

@kerthcet Sorry, my /hold couldn't stop the merge. (Seems I should have run /remove-approve.) Please take a look from your side as well and give some feedback!

@codefromthecrypt please check my small comments above!

codefromthecrypt commented 1 year ago

Don't worry, I can help address any comments post-merge. I will focus on them soon. Thanks for moving forward together!

codefromthecrypt commented 1 year ago

I raised https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/issues/13 to track switching to the same protos the scheduler uses. We'd still use go-plugin to compile them, just from a different source, so that the fields and wire types match.