networkservicemesh / site

Network Service Mesh website
https://www.networkservicemesh.io/
Apache License 2.0

Add release quality targets on the site #281

Open denis-tingaikin opened 3 months ago

denis-tingaikin commented 3 months ago

To make release delivery clearer, more transparent, safer, and more stable, we should define the main release quality targets and a definition of done.

For now, the targets could be:

  1. NSM should be stable for 24 hours in a high-load scenario. (This means no memory leaks.)
  2. What are the latency criteria?
  3. TODO: Add other criteria.

szvincze commented 3 months ago

It is not easy to define the latency criteria because we only have application-level diagrams: NSM-latency-spikes

The first spikes came after ~3.5 hours. Then we could see that, as time progressed, the spikes appeared more and more frequently. So something similar to the stability criteria would be good here as well: latency should stay under a defined limit for the whole test period.
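
A minimal sketch of how such a latency criterion could be checked against recorded samples. Everything here is an assumption for illustration: the sample format (seconds since test start, latency in ms), the function names, and the limit value are hypothetical and not part of NSM or its test tooling.

```python
from typing import Dict, List, Tuple

# Hypothetical sample format: (seconds since test start, latency in ms).
Sample = Tuple[float, float]


def latency_violations(samples: List[Sample], limit_ms: float) -> List[Sample]:
    """Return every sample whose latency exceeds limit_ms."""
    return [(t, ms) for t, ms in samples if ms > limit_ms]


def spikes_per_hour(samples: List[Sample], limit_ms: float) -> Dict[int, int]:
    """Count samples above limit_ms in each hour of the test, keyed by hour index."""
    counts: Dict[int, int] = {}
    for t, ms in samples:
        if ms > limit_ms:
            hour = int(t // 3600)
            counts[hour] = counts.get(hour, 0) + 1
    return counts


def meets_latency_target(samples: List[Sample], limit_ms: float) -> bool:
    """The criterion holds only if no sample in the whole period exceeds the limit."""
    return not latency_violations(samples, limit_ms)
```

Counting spikes per hour is one way to make "the spikes appeared more and more frequently" measurable: a rising count across hour indexes would indicate degradation even if each individual spike is tolerable.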

denis-tingaikin commented 3 months ago

At this time, we can use this picture as an acceptable latency level.

image

I still don't like the spikes here, but they can be handled and improved in the next releases.

@edwarnicke I think that ideally our latency should be 0–50 ms. Based on your experience, what would you say is the ideal latency level for NSM?

szvincze commented 3 months ago

As the diagram shows, the latency is around or under 50 ms most of the time. We have now reached a point where the system can survive the infrequent latency spikes without disconnections or significant traffic loss, which is good. If we can stabilize the system in this state and stop the memory increase, then I think it is an acceptable status for releasing, and we can work on improvements in the next releases.
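
One possible way to turn "stop the memory increase" into a pass/fail check is to fit a straight line to memory samples taken during the run and compare its slope to an allowed growth rate. This is only a sketch under stated assumptions: the sample format (hours since test start, memory in MiB), the function names, and the threshold are hypothetical and not taken from NSM.

```python
from typing import List, Tuple

# Hypothetical sample format: (hours since test start, memory usage in MiB).
MemSample = Tuple[float, float]


def memory_growth_per_hour(samples: List[MemSample]) -> float:
    """Least-squares slope of memory usage over time, in MiB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


def is_memory_stable(samples: List[MemSample], max_growth_mib_per_hour: float) -> bool:
    """True if the fitted growth rate does not exceed the allowed rate (an assumed threshold)."""
    return memory_growth_per_hour(samples) <= max_growth_mib_per_hour
```

A fitted slope is less sensitive to short-lived peaks than comparing only the first and last readings, but a single end-to-end difference may be enough for a simple release gate.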

denis-tingaikin commented 3 months ago

v1.13.1-rc.3 datapath latency picture

image

denis-tingaikin commented 3 months ago

v1.13.1-rc.3 memory usage

image

denis-tingaikin commented 1 month ago

Forwarder memory consumption under high load, <= v1.13.2:

image

Forwarder memory consumption under high load, tinden/cmd-forwarder-vpp:v1.13.2-fix.3:

image

denis-tingaikin commented 1 week ago

Acceptable memory diff for the nsmgr after 27 hours of running:

image