sharnoff closed this PR 7 months ago.
Thanks for opening this PR! I looked over the changes and I need to spend more time to understand them better.
Our watch and relist logic is already complex, and these changes make it even more complicated. We should be careful about complexity here.
I think this actually may not be necessary; there are other, simpler ways around the issue. Will open an issue when I get a chance.
edit: relevant issue: #863
Converted to draft because we'll probably bypass via #863.
Resolved via #863
Fixes #851; refer there for more context.

This PR adds a new feature to `util/watch`: the ability to fetch the (approximate) upper bound on the number of `Relist`s that have taken place so far, and to trigger relisting only if no more have happened since. More symbolically, it adds the ability to fetch a "timestamp"-like token from `watch.Store` and say "please relist if you haven't yet, since this timestamp".

This allows us to fix the duplicate `Relist()`s issue by attaching this token to each pod start event as it comes in, so that only pod events received after the most recent `Relist()` started will trigger another one.
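To make the dedup mechanism concrete, here is a minimal, self-contained Go sketch of the idea. This is not the actual `util/watch` API: only `watch.RelistCount` is a real name from this PR, while `store`, `token`, and `relistIfNotSince` below are illustrative stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// relistToken is a "timestamp"-like token: the number of relists observed when
// the token was taken. (In the PR, this role is played by watch.RelistCount.)
type relistToken uint64

// store is a toy stand-in for watch.Store, showing only the dedup mechanism:
// a relist bumps the counter, and a relist request is skipped if a newer
// relist already happened after the caller's token was taken.
type store struct {
	mu      sync.Mutex
	relists uint64
}

// token returns the current relist count, to be attached to an incoming event.
func (s *store) token() relistToken {
	s.mu.Lock()
	defer s.mu.Unlock()
	return relistToken(s.relists)
}

// relistIfNotSince performs a relist only if none has happened since tok was
// taken. It reports whether a relist was actually performed.
func (s *store) relistIfNotSince(tok relistToken) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if uint64(tok) < s.relists {
		return false // a relist already happened after the token was taken
	}
	s.relists++ // pretend we re-fetched everything from the API server
	return true
}

func main() {
	s := &store{}

	// Two pod start events arrive back-to-back; each grabs a token.
	tokA, tokB := s.token(), s.token()

	// Both handlers later discover their VM is missing and ask to relist.
	fmt.Println(s.relistIfNotSince(tokA)) // true: performs the relist
	fmt.Println(s.relistIfNotSince(tokB)) // false: a relist already happened
}
```

In the PR itself, the scheduler's pod-watch callback receives this token (the `vmRelistCount watch.RelistCount` parameter visible in the testing patch below) and passes it along with the pod start event.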
Quoting one of the commits:

I tested this locally and have confirmed that it fixes #851. Details collapsed below:
Testing details
In general, the testing plan was:

1. Add artificial delay to processing pod start events
2. Create two VMs at the same time, that _don't_ go through the scheduler (so they won't exist in local state)
3. After the VM pod start events are queued, delete the VMs
4. Profit? (jk)

I used this patch to modify the scheduler:

```diff
diff --git a/pkg/plugin/plugin.go b/pkg/plugin/plugin.go
index 7c76c50..d07d833 100644
--- a/pkg/plugin/plugin.go
+++ b/pkg/plugin/plugin.go
@@ -133,7 +133,12 @@ func makeAutoscaleEnforcerPlugin(
 	}
 	pwc := podWatchCallbacks{
 		submitStarted: func(logger *zap.Logger, pod *corev1.Pod, vmRelistCount watch.RelistCount) {
-			pushToQueue(logger, func() { p.handleStarted(hlogger, pod, vmRelistCount) })
+			pushToQueue(logger, func() {
+				hlogger.Info("DEBUG: SLEEPING FOR 10 SECONDS", util.PodNameFields(pod))
+				time.Sleep(time.Second * 10)
+				hlogger.Info("DEBUG: DONE SLEEPING", util.PodNameFields(pod))
+				p.handleStarted(hlogger, pod, vmRelistCount)
+			})
 		},
 		submitDeletion: func(logger *zap.Logger, name util.NamespacedName) {
 			pushToQueue(logger, func() { p.handleDeletion(hlogger, name) })
```

... and this one, to modify `vm-deploy.yaml`:

```diff
diff --git a/vm-deploy.yaml b/vm-deploy.yaml
index e549f6e..bdae573 100644
--- a/vm-deploy.yaml
+++ b/vm-deploy.yaml
@@ -11,7 +11,6 @@ metadata:
     # Set to "true" to continuously migrate the VM (TESTING ONLY)
     autoscaling.neon.tech/testing-only-always-migrate: "false"
 spec:
-  schedulerName: autoscale-scheduler
   enableSSH: true
   guest:
     cpus: { min: 0.25, use: 0.25, max: 1.25 }
```

I also created a copy of `vm-deploy.yaml` with a different name, to act as the second VM. Commands for testing were:

```sh
kubectl apply -f vm-deploy.yaml && kubectl apply -f vm-deploy2.yaml
kubectl delete neonvm postgres16-disk-test postgres16-disk-test-2
```

... and this was the scheduler's log output, showing that it relisted only once, even though both VMs were missing:

```
{"level":"info","ts":1709965015.6497762,"logger":"autoscale-scheduler.vm-watch","caller":"plugin/watch.go:291","msg":"Skipping update for VM because .status.podName is empty","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"}}
{"level":"info","ts":1709965015.7465107,"logger":"autoscale-scheduler.vm-watch","caller":"plugin/watch.go:291","msg":"Skipping update for VM because .status.podName is empty","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"}}
{"level":"info","ts":1709965018.1640103,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:140","msg":"Received update event for Pod now running","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965018.1640441,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:137","msg":"DEBUG: SLEEPING FOR 10 SECONDS","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965018.1794043,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:140","msg":"Received update event for Pod now running","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965024.2002063,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:146","msg":"Received update event for completion of Pod","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965024.2221467,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:173","msg":"Received delete event for completed Pod","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965024.2392888,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:146","msg":"Received update event for completion of Pod","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965024.2548335,"logger":"autoscale-scheduler.pod-watch","caller":"plugin/watch.go:173","msg":"Received delete event for completed Pod","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965028.16438,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:139","msg":"DEBUG: DONE SLEEPING","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965028.1644406,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:587","msg":"Handling Pod start event","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"warn","ts":1709965028.1644697,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:284","msg":"VM is missing from local store. Relisting if it hasn't already been done","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"},"virtualmachine":{"namespace":"default","name":"postgres16-disk-test"}}
{"level":"info","ts":1709965028.1646848,"logger":"autoscale-scheduler.vm-watch.watch","caller":"watch/watch.go:334","msg":"Relisting"}
{"level":"info","ts":1709965028.166872,"logger":"autoscale-scheduler.vm-watch.watch","caller":"watch/watch.go:427","msg":"Relist complete, restarting watcher"}
{"level":"error","ts":1709965028.1670778,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:617","msg":"Error getting VM info for Pod","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"},"error":"Could not find VM for pod, even after relist","stacktrace":"github.com/neondatabase/autoscaling/pkg/plugin.(*AutoscaleEnforcer).reserveResources\n\t/workspace/pkg/plugin/state.go:617\ngithub.com/neondatabase/autoscaling/pkg/plugin.(*AutoscaleEnforcer).handleStarted\n\t/workspace/pkg/plugin/state.go:589\ngithub.com/neondatabase/autoscaling/pkg/plugin.makeAutoscaleEnforcerPlugin.func3.1\n\t/workspace/pkg/plugin/plugin.go:140\ngithub.com/neondatabase/autoscaling/pkg/plugin.makeAutoscaleEnforcerPlugin.func11\n\t/workspace/pkg/plugin/plugin.go:225"}
{"level":"info","ts":1709965028.167148,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:137","msg":"DEBUG: SLEEPING FOR 10 SECONDS","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965038.1678114,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:139","msg":"DEBUG: DONE SLEEPING","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965038.1678467,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:587","msg":"Handling Pod start event","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"warn","ts":1709965038.1678603,"logger":"autoscale-scheduler.handlers","caller":"plugin/plugin.go:284","msg":"VM is missing from local store. Relisting if it hasn't already been done","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"},"virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"}}
{"level":"error","ts":1709965038.167872,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:617","msg":"Error getting VM info for Pod","action":"Pod started","node":"neonvm-root-worker","virtualmachine":{"namespace":"default","name":"postgres16-disk-test-2"},"pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"},"error":"Could not find VM for pod, even after relist","stacktrace":"github.com/neondatabase/autoscaling/pkg/plugin.(*AutoscaleEnforcer).reserveResources\n\t/workspace/pkg/plugin/state.go:617\ngithub.com/neondatabase/autoscaling/pkg/plugin.(*AutoscaleEnforcer).handleStarted\n\t/workspace/pkg/plugin/state.go:589\ngithub.com/neondatabase/autoscaling/pkg/plugin.makeAutoscaleEnforcerPlugin.func3.1\n\t/workspace/pkg/plugin/plugin.go:140\ngithub.com/neondatabase/autoscaling/pkg/plugin.makeAutoscaleEnforcerPlugin.func11\n\t/workspace/pkg/plugin/plugin.go:225"}
{"level":"info","ts":1709965038.1678982,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:767","msg":"Handling deletion of VM pod","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"warn","ts":1709965038.1679075,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:796","msg":"Cannot find Pod in global pods map","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"}}
{"level":"info","ts":1709965038.167914,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:771","msg":"Deleted Pod","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-n6wkf"},"migrating":false,"verdict":{"cpu":"","mem":""}}
{"level":"info","ts":1709965038.1679213,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:767","msg":"Handling deletion of VM pod","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"warn","ts":1709965038.1679294,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:796","msg":"Cannot find Pod in global pods map","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"}}
{"level":"info","ts":1709965038.1679342,"logger":"autoscale-scheduler.handlers","caller":"plugin/state.go:771","msg":"Deleted Pod","action":"VM deletion","pod":{"namespace":"default","name":"postgres16-disk-test-2-2z8j7"},"migrating":false,"verdict":{"cpu":"","mem":""}}
```
NB: I intend to merge this PR via "rebase and merge" to keep the separate commits.
Notes for reviewers:

- It's probably easiest to review each commit one by one.
- Happy to receive suggestions for alternate naming — I'm not a huge fan of `RelistCount`; ideally the name would make the implications a little clearer.
- This PR also ended up much larger than I thought it would, but I think it's about as complex as @shayanh expected :sweat_smile:

Also, I'll probably build locally & deploy to staging for testing, so we can do a hotfix early next week.