Add internal E2E module to benchmark realistic data

codefromthecrypt commented 11 months ago

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Our benchmarks still aren't realistic even in simple case, because the inputs are not realistic.

This adds internal/e2e to allow us to benchmark with realistic data. This starts with real node and pod data contributed by @sanposhiho.

Which issue(s) this PR fixes:

NONE

Special notes for your reviewer:

Example run from my laptop.

goos: darwin
goarch: arm64
pkg: sigs.k8s.io/kube-scheduler-wasm-extension/internal/e2e
BenchmarkUnmarshalVT
BenchmarkUnmarshalVT/node:_small
BenchmarkUnmarshalVT/node:_small-12              1425752           839.0 ns/op
BenchmarkUnmarshalVT/node:_real
BenchmarkUnmarshalVT/node:_real-12                 94165         12860 ns/op
BenchmarkUnmarshalVT/pod:_small
BenchmarkUnmarshalVT/pod:_small-12               1375509           865.8 ns/op
BenchmarkUnmarshalVT/pod:_real
BenchmarkUnmarshalVT/pod:_real-12                 101176         11756 ns/op
BenchmarkPluginFilter
BenchmarkPluginFilter/noop
BenchmarkPluginFilter/noop/params:_small
BenchmarkPluginFilter/noop/params:_small-12      7567718           154.5 ns/op
BenchmarkPluginFilter/noop/params:_real
BenchmarkPluginFilter/noop/params:_real-12       7722583           151.4 ns/op
BenchmarkPluginFilter/filter-simple
BenchmarkPluginFilter/filter-simple/params:_small
BenchmarkPluginFilter/filter-simple/params:_small-12               79257         13844 ns/op
BenchmarkPluginFilter/filter-simple/params:_real
BenchmarkPluginFilter/filter-simple/params:_real-12                 4758        258571 ns/op
PASS

Does this PR introduce a user-facing change?

NONE

codefromthecrypt commented 11 months ago

sorry about the earlier bad commit message. It seems you cannot cite someone like @sanposhito in the body. I know now!

codefromthecrypt commented 11 months ago

I will rebase things and add copywrite headers when the other PRs are in!

codefromthecrypt commented 11 months ago

@kerthcet @sanposhiho I would like to rebase this and have folks run bench before/after API affecting changes. Can we agree on this process? When changing the design we must do a performance before/after vs assuming changes are free?

codefromthecrypt commented 11 months ago

ok ready. I added no-op which shows raw speed when nothing happens. Certain lifecycle that never read data can be very fast, but the main point here is to have something that shows we don't accidentally decode protos when they aren't used.

codefromthecrypt commented 11 months ago

Updated with real node data contributed by @sanposhiho!

So, on my laptop, you can see the overhead of plugin execution (which reads this), is 1/4 millisecond, but almost nothing if the pod and node data isn't asked for. This is an important infrastructural step, as we can run these tests to make sure we don't accidentally eagerly fetch data.

kerthcet commented 11 months ago

I'll take a look later today or early tomorrow morning, sorry for the limited bandwidth.

sanposhiho commented 11 months ago

@codefromthecrypt

I would like to rebase this and have folks run bench before/after API affecting changes. Can we agree on this process? When changing the design we must do a performance before/after vs assuming changes are free?

Agree with that. Since the performance is the critical part of this project, let's force people to run the bench on every major changes. Can you add a mention on the PR template?

codefromthecrypt commented 11 months ago

ok @sanposhiho I think I got everything. Let me know if this needs to be squashed.

k8s-ci-robot commented 11 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: codefromthecrypt, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/blob/main/OWNERS)~~ [sanposhiho] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment

sanposhiho commented 11 months ago

@codefromthecrypt Yep, please squash.

sanposhiho commented 11 months ago

It's a follow up topic after https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension/issues/8. It'd be nice if this perf test is automatically run in CI so that people don't need to run by themselves

codefromthecrypt commented 11 months ago

so @kerthcet once this is merged, I'll help on two things:

add https://github.com/stealthrocket/wzprof to the e2e, so we can get a profiler point of view on why so slow
rework the host functions so that the pod and node can be cached until they change (etag approach)

I have work own the latter staged locally.

kubernetes-sigs / kube-scheduler-wasm-extension