neondatabase / autoscaling

Postgres vertical autoscaling in k8s
Apache License 2.0
164 stars 21 forks source link

neonvm: introduce CPU sysfs state scaling flow based on the vmSpec.cpuScalingMode #1111

Open mikhail-sakhnov opened 1 month ago

mikhail-sakhnov commented 1 month ago

Introduce separate CPU scaling flow based on the vmSpec.cpuScalingMode

If vmSpec.cpuScalingMode is equal to qmpScaling the logic of the scaling is preserved as before:

if vmSpec.cpuScalingMode is equal to sysfsScaling all cpu scaling requests go directly to the vm-runner /cpu_change, which in that configuration goes to the neonvm-daemon to reconcile required amount of online CPUs.

Value cpuSysfsStateScaling also modifies the qemu and the kernel arguments to enable plug all CPUs but mark as online only first one.

mikhail-sakhnov commented 1 month ago

FYI, @sharnoff We discussed with @Omrigan that it would be great to reflect the currently used cpuScalingMode in status or annotations for better observability plus probably we want to restart pod if the cpuScalingMode was changed - that actually would allow us to perform gradual testing manually without switching default mode.

github-actions[bot] commented 1 month ago

Merging this branch will decrease overall coverage

Impacted Packages Coverage Δ :robot:
github.com/neondatabase/autoscaling/neonvm-controller/cmd 0.00% (ø)
github.com/neondatabase/autoscaling/neonvm-daemon/cmd 0.00% (ø)
github.com/neondatabase/autoscaling/neonvm-runner/cmd 0.00% (ø)
github.com/neondatabase/autoscaling/neonvm/apis/neonvm/v1 0.56% (-0.00%) :thumbsdown:
github.com/neondatabase/autoscaling/pkg/neonvm/controllers 11.74% (-0.05%) :thumbsdown:
github.com/neondatabase/autoscaling/pkg/neonvm/controllers/functests 0.00% (ø)
github.com/neondatabase/autoscaling/pkg/neonvm/cpuscaling 0.00% (ø)

Coverage by file ### Changed files (no unit tests) | Changed File | Coverage Δ | Total | Covered | Missed | :robot: | |--------------|------------|-------|---------|--------|---------| | github.com/neondatabase/autoscaling/neonvm-controller/cmd/main.go | 0.00% (ø) | 134 (+2) | 0 | 134 (+2) | | | github.com/neondatabase/autoscaling/neonvm-daemon/cmd/main.go | 0.00% (ø) | 57 (+28) | 0 | 57 (+28) | | | github.com/neondatabase/autoscaling/neonvm-runner/cmd/main.go | 0.00% (ø) | 794 (+7) | 0 | 794 (+7) | | | github.com/neondatabase/autoscaling/neonvm/apis/neonvm/v1/virtualmachine_types.go | 1.92% (ø) | 52 | 1 | 51 | | | github.com/neondatabase/autoscaling/neonvm/apis/neonvm/v1/zz_generated.deepcopy.go | 0.00% (ø) | 427 (+4) | 0 | 427 (+4) | | | github.com/neondatabase/autoscaling/pkg/neonvm/controllers/config.go | 0.00% (ø) | 0 | 0 | 0 | | | github.com/neondatabase/autoscaling/pkg/neonvm/controllers/vm_controller.go | 27.11% (**+1.24%**) | 653 (-8) | 177 (+6) | 476 (-14) | :thumbsup: | | github.com/neondatabase/autoscaling/pkg/neonvm/controllers/vm_controller_cpu_scaling.go | 0.00% (ø) | 65 (+65) | 0 | 65 (+65) | | | github.com/neondatabase/autoscaling/pkg/neonvm/cpuscaling/sysfsscaling.go | 0.00% (ø) | 91 (+91) | 0 | 91 (+91) | | _Please note that the "Total", "Covered", and "Missed" counts above refer to ***code statements*** instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code._ ### Changed unit test files - github.com/neondatabase/autoscaling/pkg/neonvm/controllers/functests/vm_controller_test.go - github.com/neondatabase/autoscaling/pkg/neonvm/controllers/vm_controller_unit_test.go - github.com/neondatabase/autoscaling/pkg/neonvm/controllers/vm_qmp_test.go

HTML Report

Click to open

mikhail-sakhnov commented 3 weeks ago

@sharnoff @Omrigan I noticed we don't have readiness probe for the vm pod. Wdyt about adding readiness probe to check neonvm-daemon socket availability?

mikhail-sakhnov commented 1 week ago

@Omrigan

Didn't do another full pass yet, but a couple of notes: There are couple of unaddressed comments, maybe they slipped from you because they are on now-outdated code. They are visible in "Conversation" tab.

As I mentined, I haven't found any more unaddressed comments.

Do you plan to merge it with rebase or with squash?

I don't have any specific preference here, what do you suggest?