Hey @ewenlau,
thanks for your bug report. I guess this comes down to my assumption that you mostly want to run more or less equal nodes in a cluster, where CPU and memory are the same on every node. Of course, you're right that this might not be the case everywhere. But let's have a look at this:
The issue here is that the gap between both nodes is too big, so ProxLB tries to place the workload on node01, where it is already placed. You have configured a balanciness of 10, which means the usage gap may only be 10 percentage points, e.g. 88% on node01 versus 78% on node02. Now it tries to rebalance and sees that the best match with the most free resources is node01, but according to the log the containers are already placed there (see the `node_parent` and `node_rebalanced` keys in the JSON).
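Just to illustrate the balanciness idea, here is a minimal sketch; the names and the exact comparison (`>` vs. `>=`) are assumptions, not ProxLB's actual code:

```python
# Illustrative sketch of a balanciness check (made-up names).
BALANCINESS = 10  # maximum tolerated usage gap in percentage points

def rebalancing_needed(usage_percent_by_node):
    """True if the gap between the busiest and the least busy node
    exceeds the configured balanciness."""
    usages = usage_percent_by_node.values()
    return max(usages) - min(usages) > BALANCINESS

print(rebalancing_needed({"node01": 88, "node02": 82}))  # False, gap of 6 is within 10
print(rebalancing_needed({"node01": 88, "node02": 46}))  # True, matches the log excerpt later in this thread
```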
So I guess it's more about which metric should be used for memory and disk balancing: the used space (to keep the overall resource percentages more equal across nodes) or the free space (to make sure the node with the most available resources is used).
I'm happy for some input; the behaviour can be changed or even made a config parameter so it works for everyone.
The second one, with the dry-run mode, does indeed look like a bug, but more like a "logging" bug, because there isn't anything to rebalance (the parent and the rebalancing node are the same). I'll have a look at it tomorrow.
Thanks for your report and the logs!
Cheers, gyptazy
Hello, I understand that the difference between the two nodes is large, but I don't understand why it matters. Doesn't it balance based on the percentage of RAM used rather than the actual gigabytes? Why does it choose to move a container from the node with less memory usage?
No, it's not based on the percentage when rebalancing, it's based on the currently free memory on the node. The node with the most free memory (in absolute size, not as a percentage of the node's own total) will be used. I guess for such setups a config parameter to balance by the nodes' percentage values might make more sense for everyone's needs. Would that fit your needs?
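To make the difference concrete, here is a minimal sketch with made-up numbers for two unevenly sized nodes, as in this thread; the dictionary layout and names are illustrative, not ProxLB's actual data structures:

```python
# Hypothetical node memory stats in GiB (illustrative numbers only).
nodes = {
    "node01": {"memory_total": 100, "memory_used": 60},  # 40 GiB free -> 40% free
    "node02": {"memory_total": 24,  "memory_used": 4},   # 20 GiB free -> ~83% free
}

def memory_free(stats):
    return stats["memory_total"] - stats["memory_used"]

def memory_free_percent(stats):
    return 100 * memory_free(stats) / stats["memory_total"]

# Balancing by absolute free memory favours the larger node ...
print(max(nodes, key=lambda n: memory_free(nodes[n])))          # node01
# ... balancing by free percent favours the node that is emptier relative to its own size.
print(max(nodes, key=lambda n: memory_free_percent(nodes[n])))  # node02
```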
I think it would solve that issue, yes.
Hey @ewenlau,
maybe you can give PR #32 a try. It's currently just a quick dirty hack, since I only want to know whether this is what you want.
Please change https://github.com/gyptazy/ProxLB/blob/main/proxlb.conf#L8 (the `mode` key) to `free_percent`.
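For reference, the change would look roughly like this excerpt; only the keys discussed in this thread are shown, and all other keys and the surrounding layout from the linked example config are omitted:

```
# /etc/proxlb/proxlb.conf (excerpt, other keys omitted)
mode: free_percent
balanciness: 10
```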
If this is what you want, I will create a new dedicated key for this in the options (`mode_node`), which defines whether the free resources of a node in the cluster are evaluated in bytes or in percent.
Hope it helps.
Cheers, gyptazy
Hello, it looks like it did not change anything; the output is exactly the same as before. Looking at the commit, I have the feeling you may have forgotten to implement the change? I might be wrong, but to me it looks like the only change is that it registers the 'free_percent' balancing method as valid and then simply keeps using the existing one. Again, I might be wrong, but since the output of the command barely differs between both options, that's my best guess.
Hey @ewenlau,
the relevant change is in https://github.com/gyptazy/ProxLB/pull/32/files#diff-4d47e7584181ff92b3c3f57588b89e4fb11158ac22f3d50066588c07267e5a86R580-R581, where it now obtains the free-percent value instead of the free-bytes value if that option is set. These metrics are taken from the `node_statistics` dictionary.
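Conceptually the switch looks like the sketch below; the key names inside `node_statistics` and the absolute byte values are hypothetical, only the idea of selecting a different metric per mode comes from the PR:

```python
# Hypothetical structure; ProxLB's real node_statistics keys may differ.
node_statistics = {
    "node01": {"memory_free": 34_000_000_000, "memory_free_percent": 66},
    "node02": {"memory_free": 20_000_000_000, "memory_free_percent": 84},
    "node03": {"memory_free": 27_000_000_000, "memory_free_percent": 79},
}

def pick_target_node(node_statistics, mode="free_percent"):
    # Choose which metric to compare, depending on the configured mode.
    metric = "memory_free_percent" if mode == "free_percent" else "memory_free"
    return max(node_statistics, key=lambda node: node_statistics[node][metric])

print(pick_target_node(node_statistics, mode="free_percent"))  # node02, as in the test output below
```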
In my test it looks like this:

```
# /bin/proxlb -d -c /etc/proxlb/proxlb.conf
node01: 66% free
node02: 84% free
node03: 79% free
Selected node to use: node02
```
Maybe I misunderstood your request?
Hello, it was actually an issue on my part, and the rebalancing now seems to be working fine too; I'd love to see this toggle in the config for Release 1. There's just one slight issue with containers: it seems to want to read their config files from the qemu-server folder, which obviously doesn't work for containers and results in them not migrating.
Here's the log:

```
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/143.conf' does not exist
<6> ProxLB: Info: [rebalancing-executor]: Rebalancing vm plc-panel01 from node pve02 to node pve01.
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/148.conf' does not exist
<6> ProxLB: Info: [rebalancing-executor]: Rebalancing vm authentik01 from node pve02 to node pve01.
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/144.conf' does not exist
<6> ProxLB: Info: [rebalancing-executor]: Rebalancing vm immich01 from node pve02 to node pve01.
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/147.conf' does not exist
<6> ProxLB: Info: [rebalancing-executor]: Rebalancing vm zoraxy01 from node pve02 to node pve01.
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/140.conf' does not exist
<6> ProxLB: Info: [rebalancing-executor]: Rebalancing vm mariadb-main from node pve02 to node pve01.
<2> ProxLB: Error: [rebalancing-executor]: 500 Internal Server Error: Configuration file 'nodes/pve02/qemu-server/150.conf' does not exist
```

There are also these lines, which I think shouldn't show up? Correct me if I'm wrong.

```
<6> ProxLB: Info: [dry-run-output-generator]: Starting dry-run to rebalance vms to their new nodes.
<6> ProxLB: Info: [dry-run-output-generator]: Printing cli output of VM rebalancing.
```

Thanks a lot for your work, I truly think this project is great. Also, sorry for the late replies, I'm on vacation at the moment and haven't got a lot of free time.
Hey @ewenlau,
thanks for your reply. Happy to hear that this finally works for you now, and thanks as well for your feedback on the other things. I'll split it into three parts:
These three things are my primary goal for the initial 1.0.0 release, before integrating new features. I hope I can finalize everything in the upcoming week.
Thank you!
Hey @ewenlau,
with PR #32 I introduced a new option to rebalance by the node's free resources in percent instead of bytes. The operation mode for this can be changed by the newly introduced option `mode_option`, which defaults to `bytes`. A user can set it to either `bytes` or `percent`.
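As a quick illustration, the new key would look roughly like this in the config (excerpt only; surrounding keys omitted):

```
# /etc/proxlb/proxlb.conf (excerpt)
mode_option: percent
```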
That PR also adds a function that validates whether there are any objects of type VM or CT to rebalance, to avoid raising a stack trace when no objects are present in a cluster (e.g. a freshly installed cluster).
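In spirit, such a guard can be as small as the following sketch; the function name, log message, and exit behaviour are made up here and not taken from the PR:

```python
import sys

def validate_guests_present(guests):
    """Exit gracefully instead of raising a stack trace when the cluster
    contains no VMs or CTs to rebalance (e.g. a freshly installed
    cluster). Illustrative only."""
    if not guests:
        print("<6> ProxLB: Info: No VMs or CTs found to rebalance.")
        sys.exit(0)

# Usage sketch: call it right after collecting the guest objects.
# validate_guests_present(guest_objects)
```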
If you could give this another try before I merge it, that would be great (it also fixes the log output now). Your other request was fixed with PR #33.
Thanks, gyptazy
So I've got two nodes, pve01 and pve02. One has 100 GB of RAM, the other 24 GB. At the time of writing, both have about 10 GB of RAM in use, but pve01 has only about 10% of its RAM used while pve02 is closer to 50%. ProxLB does indeed conclude that the memory usage is not equal between the two nodes, but it reports wildly different RAM usage figures, which makes me think it uses free memory instead of available memory. In my opinion it shouldn't do that, but that's beyond the scope of this issue:
```
<6> ProxLB: Info: [balanciness-validation]: Rebalancing for memory is needed. Highest usage: 88% | Lowest usage: 46%.
```

However, it then decides that no rebalancing is needed for some reason:

```
<6> ProxLB: Info: [rebalancing-executor]: No rebalancing needed.
```

I don't understand why it decides that nothing should be moved. I have a very clear majority of LXCs in my setup (and a few rare VMs), I used the latest proxlb file from the main branch, and I set it to all. There are also these two lines in the logs, which might indicate that it's running in dry-run mode (which it isn't):

```
<6> ProxLB: Info: [dry-run-output-generator]: Starting dry-run to rebalance vms to their new nodes.
<6> ProxLB: Info: [dry-run-output-generator]: No rebalancing needed.
```
Why is it doing that? I looked in the code but I didn't find any particular issues. I did not check the rebalancing algorithm, so the issue might be in there.
I've linked the log and config files below: proxlb.log, proxlb.conf.txt