Open xSaKeNx opened 1 week ago
Hey @xSaKeNx,
thanks for reporting. It looks like the VM objects couldn't be fetched in time: ProxLB expects to get all VM information within 5 seconds and times out otherwise.
How many VMs are you running and how is your cluster performing? ProxLB needs to fetch the required information for all VMs before it can process the next steps and calculations. Sure, the timeout could be increased, but this would probably result in issues when migrating the VMs.
Can you please share more information regarding the cluster's utilization and the VM count?
Thanks, gyptazy
Hey, thank you for your quick response. The VM count is about 103 - 5 offline, 8 templates and 8 running LXCs. Obviously I've filtered it by ignoring VMs, but I guess that's not changing the fetch time. 5 nodes, all similar:
CPU: 6% of 640 CPU(s)
Memory: 47% (2.30 TiB of 4.90 TiB)
Storage: 57% (77.18 TiB of 135.04 TiB)
I'm having no performance issues whatsoever, and no problems with response time, loading, etc. Maybe changing the permissions to only view the VMs of an assigned resource pool could help?
Hey @xSaKeNx,
that's really strange but in your attached logs you can see that upstream libraries are raising this issue and the timeouts.
I just added a new feature with PR #92, which is already merged into main and makes the timeout configurable. The default is now 10 seconds and can be set in the config.
Would be great if you could give it a try.
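For illustration, a configurable timeout with a 10-second default could be read roughly like this. This is only a sketch: the section name `proxmox` and the option name `api_timeout` are assumptions made for the example, not confirmed ProxLB option names, so check the project's configuration docs for the real ones.

```python
import configparser

# Default mentioned in this thread; used when the option is absent.
DEFAULT_API_TIMEOUT = 10  # seconds


def read_api_timeout(config_text: str) -> int:
    """Return the API timeout in seconds, falling back to the default.

    NOTE: the "proxmox" section and "api_timeout" option are hypothetical
    names used only for this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # getint() returns the fallback if the section or option is missing.
    return parser.getint("proxmox", "api_timeout", fallback=DEFAULT_API_TIMEOUT)


if __name__ == "__main__":
    print(read_api_timeout("[proxmox]\napi_timeout = 30\n"))
    print(read_api_timeout("[proxmox]\napi_host = example\n"))
```

The resulting integer would then be handed to the API client as its `timeout` argument. A value that is not a valid integer raises `ValueError` instead of silently falling back, which makes a typo in the config easier to spot.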
Edit:
> Having no performance issues whatsoever no problems with response time or loading etc. Maybe changing the permissions to only view assigned resource pool vms could help?
Just saw this - what kind of permissions are currently granted? ProxLB requires the permissions according to https://github.com/gyptazy/ProxLB/blob/main/docs/02_Configuration.md#required-roles. But if the permissions didn't fit, it would throw a permission error rather than run into a timeout.
Thanks, gyptazy
Hey, yes, I had already seen it and set my permissions accordingly, even full permissions everywhere. I also tried giving permission to \Nodes\ and to a defined pool \pool\testpool\ to get less data, but that didn't help either. Unfortunately, I tried it out after the commit and I'm still getting the timeout after 5 seconds, so it's somehow not working or not setting the timeout on the correct path. The error stays the same as above (urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='ipv6dns', port=8006): Read timed out. (read timeout=5)).
Are you sure you're using the new version?
Can you please share the outputs of:
grep __version__ /bin/proxlb
nl /bin/proxlb | grep 281
where the last one should return:
281 api_object = proxmoxer.ProxmoxAPI(proxmox_api_host, user=proxmox_api_user, password=proxmox_api_pass, verify_ssl=proxmox_api_ssl_v, timeout=int(proxmox_api_timeout))
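The same two checks can be done without relying on the exact line number, which shifts between versions. A minimal sketch, assuming the installed script defines `__version__` at the start of a line and builds the client with a single-line `proxmoxer.ProxmoxAPI(...)` call as shown above; `inspect_script` is a hypothetical helper, not part of ProxLB:

```python
import re
from typing import Optional, Tuple


def inspect_script(source: str) -> Tuple[Optional[str], bool]:
    """Return (version string or None, whether the API call passes timeout=)."""
    match = re.search(
        r'^\s*__version__\s*=\s*["\']([^"\']+)["\']', source, re.MULTILINE
    )
    version = match.group(1) if match else None
    # Look for the client construction line and check for a timeout argument.
    passes_timeout = any(
        "proxmoxer.ProxmoxAPI(" in line and "timeout=" in line
        for line in source.splitlines()
    )
    return version, passes_timeout


if __name__ == "__main__":
    # Path taken from the commands above; adjust if installed elsewhere.
    with open("/bin/proxlb", encoding="utf-8") as handle:
        print(inspect_script(handle.read()))
```

If the second value is `False`, the installed script predates the configurable timeout, so changing the config would have no effect.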
Hmm, well, I checked the code against the provided commit, but it seems I'm not on it. Even when trying to get to the release 1.0.4 branch, I only get this output:
__version__ = "1.0.3b"
281 sys.exit(2)
Oh, then you're even running on an older beta of 1.0.3. Where did you obtain this or how did you install it?
From git, just a few hours ago, via git clone. I've also tried downloading the ZIP, but it seems to be the same version.
Hm, I have no idea what you're doing there. It should look like this when checking this out freshly:
% git clone https://github.com/gyptazy/ProxLB.git && grep __version__ ProxLB/proxlb
Cloning into 'ProxLB'...
remote: Enumerating objects: 423, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 423 (delta 4), reused 1 (delta 1), pack-reused 376 (from 1)
Receiving objects: 100% (423/423), 167.23 KiB | 2.29 MiB/s, done.
Resolving deltas: 100% (228/228), done.
__version__ = "1.0.4"
I'm wondering how you even ended up with a beta version. The 1.0.3b you are currently using is old and buggy. I guess the current 1.0.4 solves your issues.
I've got no idea, but now at least I have the right one. It shows exactly your output, but I'm still getting the read timeout after 5 seconds, even when setting it to 30.
I cannot get it to work. It's getting the data from the VMs and nodes but then fails (same error in dry run and normal run):
This is my proxlb.conf - nothing special, I only added comments for easier changes: