gyptazy / ProxLB

ProxLB - (Re)Balance VM Workloads Across Nodes in Proxmox Clusters. A Load Balancer for Proxmox - and more!
https://proxlb.de
GNU General Public License v3.0

`Bug`: Not currently balancing #112

Open JamesOBenson opened 4 days ago

JamesOBenson commented 4 days ago

General

ProxLB is not actually balancing my nodes; it moves maybe one VM, even after restarting the service.

Node 1: Memory usage 11%, CPU usage 1%
Node 2: Memory usage 73%, CPU usage 1%
VMs: 47 running, 24 stopped, 2 templates
LXC: 10 running, 0 stopped, 1 template

Config

[proxmox]
api_host: **
api_user: root@pam
api_pass: **
verify_ssl: 0
[vm_balancing]
enable: 1
method: memory
mode: assigned
mode_option: percent
balanciness: 10
type: all
parallel_migrations: 1
[storage_balancing]
enable: 0
[update_service]
enable: 0
[api]
enable: 0
[service]
daemon: 1
schedule: 24
log_verbosity: CRITICAL
config_version: 3
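The config above is INI-style with `key: value` pairs, so Python's standard `configparser` can read it for a quick sanity check of the balancing settings. This is a minimal sketch, assuming the section and key names exactly as posted; it is not how ProxLB itself loads its config, and `summarize` is a hypothetical helper.

```python
import configparser

# Balancing-related sections copied from the posted config
# (the [proxmox] credentials section is left out on purpose).
CONFIG_TEXT = """
[vm_balancing]
enable: 1
method: memory
mode: assigned
mode_option: percent
balanciness: 10
type: all
parallel_migrations: 1

[service]
daemon: 1
schedule: 24
log_verbosity: CRITICAL
config_version: 3
"""


def summarize(config_text):
    """Return the balancing-related settings as a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    balancing = parser["vm_balancing"]
    service = parser["service"]
    return {
        "method": balancing.get("method"),
        "mode": balancing.get("mode"),
        "mode_option": balancing.get("mode_option"),
        "balanciness": balancing.getint("balanciness"),
        "log_verbosity": service.get("log_verbosity"),
    }


summary = summarize(CONFIG_TEXT)
print(summary)
```

With `log_verbosity` at `CRITICAL`, the `Info:` lines that show balancing decisions would presumably not be logged, so raising it to `INFO` is the first step before reading the logs.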

Meta

Please provide some more information about your setup. This includes where you obtained ProxLB (e.g., as a .deb file, from the repository or container image) and also which version you're running in which mode. You can obtain the used version from your image version, your local repository information or by running proxlb -v.

Version: ProxLB version 1.0.4
Running in a VM inside of the cluster.

gyptazy commented 4 days ago

Hey @JamesOBenson,

interesting, can you please set log_verbosity to INFO, restart the service, and share the log file? You can grab the logs from the systemd unit afterwards. You can also simply start it in dry-run mode on the CLI, where it prints the output to stdout.

You switched the mode from used to assigned and the mode_option from bytes to percent. This requires me to have some more information like how much memory all nodes have (all the same size?) and how much memory the VMs really have assigned.

When pasting the log, please strip all information you do not want to share here.

Thanks, gyptazy

JamesOBenson commented 4 days ago

I can revert those changes. But yes, all nodes are configured the same. I thought the percent option would balance the percentages I mentioned earlier. The result was the same, though: only 1 VM ever migrated, and since then, nothing.

Also, we aren't using shared storage, so I had to modify your code slightly to account for that, adding , **{'with-local-disks': 1} to the call on line 1183:

job_id = api_object.nodes(value['node_parent']).qemu(value['vmid']).migrate().post(target=value['node_rebalance'],online=1, targetstorage='1', **{'with-local-disks': 1})

I'll attach the logs in a moment.
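The reason for the `**{...}` unpacking in that change is that `with-local-disks` contains hyphens and so cannot be written as a normal Python keyword argument. A minimal sketch of building the migrate parameters that way, assuming a proxmoxer-style API object as in the snippet above; `build_migrate_params` and the node name `node02` are hypothetical, and `targetstorage` is left out here.

```python
def build_migrate_params(target_node, online=True, with_local_disks=False):
    """Build keyword arguments for a QEMU migrate request (hypothetical helper)."""
    params = {"target": target_node, "online": 1 if online else 0}
    if with_local_disks:
        # Hyphenated parameter names are not valid Python identifiers,
        # so they can only be passed through dict unpacking.
        params["with-local-disks"] = 1
    return params


# "node02" is a placeholder node name for illustration only.
params = build_migrate_params("node02", with_local_disks=True)
print(params)

# With a proxmoxer API object this could then be sent as (not executed here):
# api_object.nodes(source_node).qemu(vmid).migrate().post(**params)
```

Keeping the flag behind a switch like `with_local_disks` would let clusters with shared storage keep the original behaviour.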

JamesOBenson commented 4 days ago

CONFIG FOR TEST CLUSTER

[proxmox]
...
[vm_balancing]
enable: 1
method: memory
mode: used
mode_option: bytes
balanciness: 10
type: all
parallel_migrations: 1
[storage_balancing]
enable: 0
[update_service]
enable: 0
[api]
enable: 0
[service]
daemon: 1
schedule: 24
log_verbosity: INFO
config_version: 3

proxlb_logs.txt

JamesOBenson commented 4 days ago

One thing that sticks out to me in the logs is that it picks up the 2 nodes here

Oct 17 18:10:09 proxLB proxlb[5026]: ProxLB: Info: [node-statistics]: Added node 32.
Oct 17 18:10:09 proxLB proxlb[5026]: ProxLB: Info: [node-statistics]: Added node 33.

But later it seems to state only one of them:

Oct 17 18:10:10 proxLB proxlb[5026]: ProxLB: Warning: [node-update-statistics]: Node 33 is overprovisioned for disk by 102%.
Oct 17 18:10:10 proxLB proxlb[5026]: ProxLB: Warning: [node-update-statistics]: Node 33 is overprovisioned for disk by 136%.
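To check this kind of mismatch across a longer log, the node names can be pulled out per log stage and compared. A minimal sketch, assuming the message formats shown in the excerpts above; the regular expressions and the `nodes_by_stage` helper are illustrative and not part of ProxLB.

```python
import re

# Log lines copied from the excerpts above.
LOG_LINES = [
    "Oct 17 18:10:09 proxLB proxlb[5026]: ProxLB: Info: [node-statistics]: Added node 32.",
    "Oct 17 18:10:09 proxLB proxlb[5026]: ProxLB: Info: [node-statistics]: Added node 33.",
    "Oct 17 18:10:10 proxLB proxlb[5026]: ProxLB: Warning: [node-update-statistics]: Node 33 is overprovisioned for disk by 102%.",
    "Oct 17 18:10:10 proxLB proxlb[5026]: ProxLB: Warning: [node-update-statistics]: Node 33 is overprovisioned for disk by 136%.",
]

ADDED = re.compile(r"\[node-statistics\]: Added node (\S+)\.")
OVERPROVISIONED = re.compile(r"\[node-update-statistics\]: Node (\S+) is overprovisioned")


def nodes_by_stage(lines):
    """Collect node names seen in 'Added node' and 'overprovisioned' messages."""
    added, warned = set(), set()
    for line in lines:
        match = ADDED.search(line)
        if match:
            added.add(match.group(1))
            continue
        match = OVERPROVISIONED.search(line)
        if match:
            warned.add(match.group(1))
    return added, warned


added, warned = nodes_by_stage(LOG_LINES)
print("added:", sorted(added))
print("warned:", sorted(warned))
print("added but never warned:", sorted(added - warned))
```

On the excerpt above this reports node 32 as added but never mentioned in a warning, while node 33 is named in both warnings.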

JamesOBenson commented 4 days ago

We used the same thing on a second cluster and, again, nothing migrated. Here is an excerpt of the logs:

Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [node-update-statistics]: Updated node resource assignments by all VMs.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-method-validation]: Valid balancing method: memory
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-mode-validation]: Valid balancing method: used
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balanciness-validation]: Rebalancing for memory is needed. Highest usage: 88% | Lowest usage: 22%.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [get-most-used-resources-vm]: ('project', {'group_include': None, 'group_exclude': None, 'cpu_total': 8, 'cpu_used': 0.000665549082148473, 'memory_t>
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [get-most-free-resources-nodes]: ('**37', {'maintenance': False, 'ignore': False, 'cpu_total': 40, 'cpu_assigned': 20.00450408832633, 'cpu_assi>
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-resource-statistics-update]: Updated VM and node statistics.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-method-validation]: Valid balancing method: memory
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-mode-validation]: Valid balancing method: used
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balanciness-validation]: Rebalancing for memory is needed. Highest usage: 88% | Lowest usage: 22%.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [get-most-used-resources-vm]: ('UbuntuDaniel', {'group_include': None, 'group_exclude': None, 'cpu_total': 2, 'cpu_used': 0.00163265451052968, 'memo>
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [get-most-free-resources-nodes]: ('**37', {'maintenance': False, 'ignore': False, 'cpu_total': 40, 'cpu_assigned': 20.00450408832633, 'cpu_assi>
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-resource-statistics-update]: Updated VM and node statistics.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-method-validation]: Valid balancing method: memory
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [balancing-mode-validation]: Valid balancing method: used
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-vm-calculator]: Balancing calculations done.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-vm-calculator]: Balancing calculations done.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-vm-calculator]: Balancing calculations done.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [rebalancing-maintenance-vm-calculator]: No nodes for maintenance mode defined.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [vm-rebalancing-executor]: No rebalancing needed.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [cli-output-generator]: Start rebalancing vms to their new nodes.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [cli-output-generator]: No rebalancing needed.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [post-validations]: All post-validations succeeded.
Oct 17 18:27:24 proxLB proxlb[5099]: ProxLB: Info: [daemon]: Running in daemon mode. Next run in 24 hours.
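The `[balanciness-validation]` lines can be read as a spread check between the most and least loaded node. A minimal sketch, assuming rebalancing is triggered when the gap between the highest and lowest usage percentage exceeds `balanciness`; the function name, the node labels and the exact comparison are assumptions, not ProxLB's actual code.

```python
def rebalancing_needed(usage_percent_by_node, balanciness):
    """Return (needed, highest, lowest) for a simple spread check.

    Assumption: rebalancing is wanted when highest - lowest > balanciness.
    """
    highest = max(usage_percent_by_node.values())
    lowest = min(usage_percent_by_node.values())
    return (highest - lowest) > balanciness, highest, lowest


# Usage figures from the log excerpt; the node labels are placeholders.
needed, highest, lowest = rebalancing_needed({"node-a": 88, "node-b": 22}, 10)
print(f"Highest usage: {highest}% | Lowest usage: {lowest}% | needed: {needed}")
```

Under that reading, an 88% versus 22% spread with `balanciness: 10` is far over the threshold, which matches the "Rebalancing for memory is needed" messages and makes the later "No rebalancing needed" lines the part worth investigating.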