airshipit / treasuremap

Reference Airship manifests, CICD, and reference architecture.
http://openstack.org
Apache License 2.0

Ping Latency Checkpoint #90

Open nagajagan opened 3 years ago

nagajagan commented 3 years ago

Based on my investigation, one of the main issues I see with Airship OpenStack is the lack of CPU tuning for the OVS, OpenStack, and Calico processes. I have observed many of these processes running on CPUs reserved for virtual machines. Most of these processes are networking-related. This is a wider problem that should be raised with the Airship community.

My investigation shows that a single process, "neutron-sriov-nic-agent", is responsible for roughly 95% of the ping delays. This process runs on CPU 4 or 5 most of the time, and the virtual machine associated with that CPU is consistently hit very hard. Simply changing the CPU affinity of this process reduces the ping delays by 95%. No restart is needed; this is a runtime adjustment.
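To make the runtime adjustment concrete, here is a minimal Python sketch of one way it could be done on Linux. It is not the exact procedure from the attached printouts. It assumes the script runs as root in the host PID namespace, and that CPUs 4 and 5 are the VM-reserved CPUs to avoid (taken from the observation above; the real reserved set depends on the host configuration). The helper names `housekeeping_cpus`, `find_pids`, and `repin` are made up for this illustration.

```python
import os

AGENT_NAME = "neutron-sriov-nic-agent"  # process named in this issue
VM_RESERVED_CPUS = {4, 5}               # assumption: adjust to the host's real reserved set


def housekeeping_cpus(all_cpus, reserved):
    """Return the CPUs a host process may use: all CPUs minus the reserved set."""
    allowed = set(all_cpus) - set(reserved)
    if not allowed:
        raise ValueError("no CPUs left after excluding the reserved set")
    return allowed


def find_pids(name, proc_root="/proc"):
    """Return PIDs whose command line contains `name` (empty list if /proc is absent)."""
    if not os.path.isdir(proc_root):
        return []
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        if name in cmdline:
            pids.append(int(entry))
    return pids


def repin(pid, cpus, proc_root="/proc"):
    """Set the CPU affinity of every thread of `pid` to `cpus` (Linux only)."""
    for tid in os.listdir(os.path.join(proc_root, str(pid), "task")):
        os.sched_setaffinity(int(tid), cpus)


if __name__ == "__main__":
    allowed = housekeeping_cpus(range(os.cpu_count() or 1), VM_RESERVED_CPUS)
    for pid in find_pids(AGENT_NAME):
        if pid != os.getpid():
            repin(pid, allowed)
            print(f"pid {pid} -> CPUs {sorted(allowed)}")
```

The affinity is applied per thread because `os.sched_setaffinity` acts on a single thread ID. The change takes effect immediately but is lost if the agent is restarted.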

See the example summary (full printouts in attachment):

As a short-term solution, the next step would be to implement a startup script that fixes this issue automatically whenever the neutron-sriov-nic-agent appears in the system. A long-term solution would be a correction in Airship to ensure that no operating system processes or Airship processes run on CPUs reserved for virtual machines.
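The short-term startup script could look roughly like the Python sketch below. It polls for the agent and re-pins each new PID once, so a restarted agent is caught again. Everything here is an assumption for illustration: the polling approach, the 5-second interval, the CPU list `0-3`, and the helper names. It uses the standard `pgrep -f` and `taskset -a -p -c` commands, which must be installed on the host.

```python
import subprocess
import time

AGENT_NAME = "neutron-sriov-nic-agent"  # process named in this issue
ALLOWED_CPU_LIST = "0-3"                # assumption: CPUs not reserved for VMs


def pgrep_pids(name=AGENT_NAME):
    """Return PIDs whose command line matches `name`, using `pgrep -f`."""
    result = subprocess.run(["pgrep", "-f", name], capture_output=True, text=True)
    return [int(line) for line in result.stdout.split()]


def taskset_repin(pid, cpu_list=ALLOWED_CPU_LIST):
    """Move all threads of `pid` onto `cpu_list` using `taskset -a -p -c`."""
    subprocess.run(["taskset", "-a", "-p", "-c", cpu_list, str(pid)], check=False)


def watch_and_repin(find_pids, repin, interval_s=5.0, max_polls=None, sleep=time.sleep):
    """Poll for agent PIDs and re-pin each newly seen PID once.

    Runs forever when max_polls is None; returns the number of polls otherwise.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = set(find_pids())
        for pid in sorted(current - seen):
            repin(pid)
        seen = current  # drop exited PIDs so a restarted agent is handled again
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)
    return polls


def main():
    """Entry point for a boot-time service; loops until the service is stopped."""
    watch_and_repin(pgrep_pids, taskset_repin)
```

Calling `main()` from a boot-time service (for example a systemd unit running as root on the host) would keep the workaround in place until Airship ships a long-term correction.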

If you have any questions regarding this document, let me know. courtesy: Claude

nagajagan commented 3 years ago

att-5gc-irq-investigations.txt att-5gc-irq-investigations-example.txt

nagajagan commented 3 years ago

Not sure how this issue was closed by me. It is still an open issue, and there has been no progress on it.