Closed xavidpr4 closed 6 years ago
Looks like there is a limitation around that code: https://github.com/GNS3/gns3-server/blob/e74e66b20363766c60cd46e044005a06cb340393/gns3server/handlers/api/compute/project_handler.py#L169
We will try to fix that soon.
The refresh time was set to 5 seconds. I changed it to 1 second, which should help, but there is likely a design issue somewhere.
Ref: https://github.com/GNS3/gns3-server/commit/9d9dc037d8accbe0e1aedf973fba3f7ed3153e6d
I'm using only one remote server, as a GNS3 VM, and its CPU display is flaky as well, even after the change to one second. It mainly shows 0%, 50% and 100% as the load.
I see that three notification streams are active on the remote server: one opened locally on the remote server and two from my local controller. I assume that one of those two is for the remote server and one for the GNS3 VM, even though they are the same machine in my setup.
The first notification stream asks for CPU and memory information every second. The second and third streams then ask for the same information very shortly after the first request, mostly within a few milliseconds. cpu_percent() (https://psutil.readthedocs.io/en/latest/#psutil.cpu_percent) is very inaccurate when the interval between two calls is very short.
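To illustrate why such short intervals could produce only a handful of fixed values, here is a minimal sketch. It assumes CPU time is accounted in coarse ticks (on Linux these are often 10 ms units), so a window of a few milliseconds contains only a few ticks in total. The function name `possible_percentages` is made up for this example and is not part of psutil or GNS3.

```python
from fractions import Fraction


def possible_percentages(max_ticks):
    """Return every busy/total percentage that can occur when the sampling
    window contains between 1 and max_ticks accounting ticks in total.

    Assumption: CPU time is only visible in whole ticks, so the result
    of busy/total is limited to a few ratios when total is small.
    """
    values = set()
    for total in range(1, max_ticks + 1):
        for busy in range(total + 1):
            values.add(round(float(Fraction(busy, total) * 100), 1))
    return sorted(values)


# With at most 2 ticks in the window, only three readings are possible.
print(possible_percentages(2))  # [0.0, 50.0, 100.0]
# With up to 4 ticks, the values are still heavily quantized.
print(possible_percentages(4))
```

Under that assumption, a window with at most two ticks can only report 0%, 50% or 100%, and a window with up to four ticks adds values such as 25%, 33% and 75%.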
I've created a PR that ensures a minimum interval between two psutil.cpu_percent() calls. If CPU and memory are requested faster than this interval, the latest result is returned. I first tested it with an interval of one second. That works, but the resulting CPU display was too volatile for me. I then changed the minimum interval to two seconds, which is much smoother.
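The idea can be sketched roughly as follows. This is not the code from the PR, only a minimal illustration of the caching approach. The class `ThrottledSampler` and its parameters are hypothetical names, and the sampling function is passed in so that the sketch does not depend on psutil.

```python
import time


class ThrottledSampler:
    """Call sample() at most once per min_interval seconds.

    Requests that arrive sooner get the most recent cached value.
    Hypothetical helper for illustration, not part of gns3-server.
    """

    def __init__(self, sample, min_interval=2.0, clock=time.monotonic):
        self._sample = sample              # function that takes a fresh reading
        self._min_interval = min_interval  # minimum seconds between readings
        self._clock = clock                # injectable clock, eases testing
        self._last_time = None
        self._last_value = None

    def read(self):
        now = self._clock()
        # Take a fresh reading on the first call or once the interval has passed.
        if self._last_time is None or now - self._last_time >= self._min_interval:
            self._last_value = self._sample()
            self._last_time = now
        return self._last_value
```

With psutil installed, it could be used as `ThrottledSampler(lambda: psutil.cpu_percent(interval=None))`, so that several notification streams calling `read()` within a few milliseconds of each other all receive the same reading.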
Created PR https://github.com/GNS3/gns3-server/pull/1295.
Let's run some more tests with more servers and then we can close that issue if everything is fine.
Just tested with 5 servers. I created 4 clones of the GNS3 VM, gave them unique IP addresses and added them as remote servers. To see the summary updates better, I changed the interval back to 5 seconds (only for this test). I don't see the effect @xavidpr4 noticed. The CPU values for all servers were updated within that refresh interval.
Thanks for saving us time! Let's close this.
I have a GNS3 cluster environment with one GNS3 VM that acts as an orchestrator (the main server) and holds all the projects, and five GNS3 servers (all of them Intel NUCs) that handle the computing for a very large topology of IOSvL2 devices (about 170 nodes).
Everything works well and communication between nodes is fine, sometimes with delays of 3-4 ms between an IOSvL2 on one node and an IOSvL2 on another node, which in my opinion is good latency.
However, in the servers summary, the CPU usage does not seem very accurate, and it continuously flaps between the fixed values 0%, 25%, 33%, 50%, 75% and 100%.
I'm not sure how these values are calculated in the background, but at the beginning, when I had only one node (the GNS3 VM and a single Intel NUC as compute node), the CPU summary was much more accurate.
Now it looks like the CPU of only one server is checked every 5 seconds, so if I have 5 nodes (plus the GNS3 VM), it takes 30 seconds to check the first node again.
With 2 nodes:
0 seconds -> 1st check -> gns3 vm (main server)
5 seconds -> 2nd check -> node 1
10 seconds -> 3rd check -> gns3 vm (main server)
15 seconds -> 4th check -> node 1
20 seconds -> 5th check -> gns3 vm (main server)
25 seconds -> 6th check -> node 1
...
With 5 nodes:
0 seconds -> 1st check -> gns3 vm (main server)
5 seconds -> 2nd check -> node 1
10 seconds -> 3rd check -> node 2
15 seconds -> 4th check -> node 3
20 seconds -> 5th check -> node 4
25 seconds -> 6th check -> node 5
30 seconds -> 7th check -> gns3 vm (main server)
35 seconds -> 8th check -> node 1
...
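The schedule above can be written as a small sketch. It models only the round-robin behaviour assumed in this post, not how GNS3 actually polls its servers, and the function names are made up for illustration.

```python
def polling_schedule(servers, interval, checks):
    """Return (time_in_seconds, server) pairs for a round-robin poll
    in which exactly one server is checked every `interval` seconds."""
    return [(i * interval, servers[i % len(servers)]) for i in range(checks)]


def revisit_time(server_count, interval):
    """Seconds until the same server is checked again under that model."""
    return server_count * interval


servers = ["gns3 vm (main server)"] + [f"node {n}" for n in range(1, 6)]
for seconds, server in polling_schedule(servers, 5, 8):
    print(f"{seconds} seconds -> {server}")
print(revisit_time(len(servers), 5))  # 30
```

Under this model, six servers polled one at a time every 5 seconds give each server a fresh reading only once every 30 seconds.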
As I said before, I don't know whether my theory about the calculation is the reason why the servers summary tab does not show the real CPU usage, but I'm sure that it is not working correctly. Has anyone experienced something similar?
Thanks in advance.