hashicorp / nomad-autoscaler

Nomad Autoscaler brings autoscaling to your Nomad workloads.
Mozilla Public License 2.0

IdentifyScaleInNodes scales all nodes down #572

Open schneefux opened 2 years ago

schneefux commented 2 years ago

I'm using the nomad-hcloud-autoscaler plugin by @AndrewChubatiuk (thank you for publishing it!). I noticed that it tries to drain & destroy all nodes on scale in.

I think this is caused by a regression in the autoscaler API. The plugin calls IdentifyScaleInNodes(config, int(count)). I looked at the method definition, both the current one and the one in the latest release, and the second argument, num, is unused. Previously it was read by a loop that selected num nodes to destroy from the list of live nodes. Now the method returns the whole array of live nodes instead.
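To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is not the autoscaler's actual code: the `Node` type and the functions `identifyAll` and `identifyNum` are hypothetical stand-ins, and only the handling of `num` is meant to reflect the report.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the autoscaler's node type.
type Node struct {
	ID string
}

// identifyAll sketches the reported behaviour: num is accepted but never
// read, so every live node comes back as a scale-in candidate.
func identifyAll(live []Node, num int) []Node {
	_ = num // unused, as described in the report
	return live
}

// identifyNum sketches the earlier behaviour: a loop picks at most num
// nodes from the list of live nodes.
func identifyNum(live []Node, num int) []Node {
	var picked []Node
	for _, n := range live {
		if len(picked) >= num {
			break
		}
		picked = append(picked, n)
	}
	return picked
}

func main() {
	live := []Node{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	// With num = 1, the first call selects all three nodes, the second only one.
	fmt.Println(len(identifyAll(live, 1)), len(identifyNum(live, 1)))
}
```

With three live nodes and a requested scale-in of one, the first variant hands all three nodes to the drain and destroy step, which matches the symptom seen with the plugin.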

I'm reporting this here because the method RunPreScaleInTasks is annotated with a comment that says COMPAT(0.4). I assume that you're trying to stay backwards-compatible and that this was not an intentional change.

lgfa29 commented 2 years ago

Thank you for the report @schneefux, this does seem like an unintended breaking change 🤦

We kept the interface to avoid breaking external plugins, but it ended up no longer matching the behaviour of the internal logic.
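Until the interface and the internal logic agree again, an external plugin could guard against over-selection on its own side. The sketch below is only an assumption about what such a guard might look like: `capNodeIDs` is a hypothetical helper, not part of the autoscaler SDK, and it is not necessarily what any plugin actually does.

```go
package main

import "fmt"

// capNodeIDs is a hypothetical plugin-side guard: it trims a list of
// candidate node IDs to at most num entries, so that a helper returning
// too many candidates cannot cause more nodes to be removed than requested.
func capNodeIDs(ids []string, num int) []string {
	if num <= 0 {
		return nil
	}
	if len(ids) <= num {
		return ids
	}
	// The full slice expression limits capacity so later appends by the
	// caller cannot overwrite the untrimmed tail.
	return ids[:num:num]
}

func main() {
	ids := []string{"node-1", "node-2", "node-3"}
	fmt.Println(capNodeIDs(ids, 1))
}
```

A guard like this only limits the count. It does not decide which nodes are the best candidates, so it is a safety net rather than a replacement for the selection logic.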

AndrewChubatiuk commented 2 years ago

@schneefux thank you! I've created a PR with small changes to the code. I'm not able to test it right now, so please try it if you have time: https://github.com/AndrewChubatiuk/nomad-hcloud-autoscaler/pull/2

schneefux commented 2 years ago

@AndrewChubatiuk I've been running it in production for the past few days and it works very well, thank you for the update 🙂

lgfa29 commented 2 years ago

Thank you for the fix @AndrewChubatiuk!

I haven't been able to investigate this further to try and revert the breaking change, but I'm glad you were able to fix it in the plugin.

Sorry for the trouble 😅

karelorigin commented 1 year ago

Hi! 👋

Unfortunately, the issue still exists. I had to use RunPreScaleInTasksWithRemoteCheck to work around the problem.