GNS3 / gns3-gui

GNS3 Graphical Network Simulator
http://www.gns3.com
GNU General Public License v3.0

servers summary #2262

Closed xavidpr4 closed 6 years ago

xavidpr4 commented 7 years ago

I have a GNS3 cluster environment with one GNS3 VM acting as the orchestrator (the main server), which holds all the projects, and five GNS3 servers (all Intel NUCs) that handle the computing for a very large IOSvL2 topology (about 170 nodes).

Everything works well and communication between nodes is fine, with occasional delays of 3-4 ms between an IOSvL2 on one node and an IOSvL2 on another node, which in my opinion is good latency.

However, in the servers summary, the CPU usage does not seem very accurate: it continuously flaps between the fixed values 0%, 25%, 33%, 50%, 75% and 100%.

I'm not sure how these values are calculated in the background, but at the beginning, when I had only one compute node (the GNS3 VM plus a single Intel NUC), the CPU summary was much more accurate.

Now it looks like the CPU is only checked once every 5 seconds, one server at a time, so with 5 nodes (plus the GNS3 VM) it takes 30 seconds to check the first node again.

With 2 nodes:
0 seconds -> 1st check -> gns3 vm (main server)
5 seconds -> 2nd check -> node 1
10 seconds -> 3rd check -> gns3 vm (main server)
15 seconds -> 4th check -> node 1
20 seconds -> 5th check -> gns3 vm (main server)
25 seconds -> 6th check -> node 1
...

With 5 nodes:
0 seconds -> 1st check -> gns3 vm (main server)
5 seconds -> 2nd check -> node 1
10 seconds -> 3rd check -> node 2
15 seconds -> 4th check -> node 3
20 seconds -> 5th check -> node 4
25 seconds -> 6th check -> node 5
30 seconds -> 7th check -> gns3 vm (main server)
35 seconds -> 8th check -> node 1
...
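
The theory above can be put into numbers with a small sketch. It assumes, as speculated in this report, that a single poller visits one server every 5 seconds in round-robin order. The function names and server labels are hypothetical and are not taken from the GNS3 code.

```python
def round_robin_schedule(servers, period_s, checks):
    """Return (time_s, server) pairs for a poller that visits one server per period."""
    return [(i * period_s, servers[i % len(servers)]) for i in range(checks)]


def revisit_interval(servers, period_s):
    """Seconds between two consecutive checks of the same server."""
    return len(servers) * period_s


# Assumed setup: the main server plus five compute nodes, polled every 5 seconds.
servers = ["gns3 vm (main server)"] + [f"node {n}" for n in range(1, 6)]
schedule = round_robin_schedule(servers, period_s=5, checks=8)

for t, name in schedule:
    print(f"{t:>3} s -> {name}")
print("revisit interval:", revisit_interval(servers, 5), "s")
```

Under this assumption the revisit interval grows linearly with the number of servers, which is why the display would get worse as compute nodes are added.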


As I said, I don't know whether my theory about the calculation is the reason the servers summary tab does not show the real CPU usage, but I'm sure it is not working correctly. Has anyone experienced something similar?

Thanks in advance.

grossmj commented 7 years ago

Looks like there is a limitation around that code: https://github.com/GNS3/gns3-server/blob/e74e66b20363766c60cd46e044005a06cb340393/gns3server/handlers/api/compute/project_handler.py#L169

We will try to fix that soon.

grossmj commented 6 years ago

The refresh time was set at 5 seconds. I changed that to 1 second; this should help, but there is likely a design issue somewhere.

Ref: https://github.com/GNS3/gns3-server/commit/9d9dc037d8accbe0e1aedf973fba3f7ed3153e6d

ghost commented 6 years ago

I'm using only one remote server as the GNS3 VM, and its CPU display is flaky as well, even after the change to one second. It mainly shows 0%, 50% and 100% as the load.

I see that three notification streams are active on the remote server: one locally on the remote server and two from my local controller. I assume that one is for the remote server and one for the GNS3 VM, even though they are the same in my setup.

The first notification stream asks for CPU and memory information every second. The second and third streams then ask for this information very shortly after the first request, mostly within a few milliseconds. cpu_percent() (https://psutil.readthedocs.io/en/latest/#psutil.cpu_percent) is very inaccurate when the interval between two calls is very short.
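
A rough model of why very short intervals give such coarse values: assuming the percentage is busy time divided by the total time elapsed between two readings, and assuming CPU time is counted in whole ticks, only a handful of ticks pass between closely spaced calls, so very few ratios are possible. The function below is a hypothetical, simplified illustration and not psutil code.

```python
def possible_percentages(total_ticks):
    """All busy/total ratios, rounded to whole percent, when only `total_ticks` ticks elapsed."""
    return sorted({round(100 * busy / total_ticks) for busy in range(total_ticks + 1)})


# Compare a few-tick window with a longer window of 100 ticks.
for ticks in (1, 2, 3, 4, 100):
    values = possible_percentages(ticks)
    print(f"{ticks:>3} ticks elapsed -> {len(values)} possible values")
```

In this model, windows of one to four ticks can only produce 0, 25, 33, 50, 67, 75 or 100%, which is close to the fixed values reported at the top of this issue.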

I've created a PR that ensures a minimum interval between two psutil.cpu_percent() calls. If CPU and memory are requested faster than this interval, the latest result is returned. I first tested it with an interval of one second. That works, but the resulting CPU display was too volatile for me. I then changed the minimum interval to two seconds, which is much smoother.
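
The idea can be sketched roughly as follows. This is not the code from the PR: the class name and the injectable `sampler` and `clock` parameters are hypothetical, and they exist so that the example runs without psutil installed. In practice the sampler would presumably be a call such as `psutil.cpu_percent(interval=None)`.

```python
import time


class ThrottledSampler:
    """Return the cached reading when called again before `min_interval` seconds have passed."""

    def __init__(self, sampler, min_interval=2.0, clock=time.monotonic):
        self._sampler = sampler            # callable that produces a fresh reading
        self._min_interval = min_interval  # minimum seconds between two real samples
        self._clock = clock                # injectable so the behaviour can be tested
        self._last_time = None             # time of the last real sample
        self._last_value = None            # value of the last real sample

    def read(self):
        now = self._clock()
        if self._last_time is None or now - self._last_time >= self._min_interval:
            # Enough time has passed (or this is the first call): take a fresh sample.
            self._last_value = self._sampler()
            self._last_time = now
        return self._last_value


# Usage with a fake clock and fake readings, so the timing is deterministic.
fake_now = [0.0]
readings = iter([10.0, 40.0, 70.0])
cpu = ThrottledSampler(lambda: next(readings), min_interval=2.0, clock=lambda: fake_now[0])

first = cpu.read()    # first call: fresh sample
fake_now[0] = 0.005
second = cpu.read()   # 5 ms later: cached value is returned
fake_now[0] = 2.5
third = cpu.read()    # 2.5 s after the last real sample: fresh sample
print(first, second, third)
```

With this shape, several notification streams asking within milliseconds of each other all receive the same reading instead of each triggering a new, very short measurement window.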

ghost commented 6 years ago

Created PR https://github.com/GNS3/gns3-server/pull/1295.

grossmj commented 6 years ago

Let's run some more tests with more servers and then we can close that issue if everything is fine.

ghost commented 6 years ago

Just tested with 5 servers. I created 4 clones of the GNS3 VM, gave them unique IP addresses and added them as remote servers. To see the summary updates better, I changed the interval back to 5 seconds (only for this test). I don't see the effect @xavidpr4 noticed. The CPU values for all servers updated within that refresh interval.

grossmj commented 6 years ago

Thanks for saving us time! Let's close this.