Open denis-tingaikin opened 3 months ago
It is not easy to define the latency criteria because we only have application-level diagrams:
The first spikes came after ~3.5 hours. After that, the spikes appeared more and more frequently as time progressed. So, something similar to the stability criteria would be good here as well: latency should stay under a defined limit for the whole test period.
At this time, we can use this picture as an acceptable latency level.
I still don't like spikes here, but they can be handled and improved in the next releases.
@edwarnicke Ideally, I think our latency should be 0–50 ms. Based on your experience, what would you say is the ideal latency level for NSM?
As the diagram shows, the latency is around or under 50 ms most of the time. We have now reached a point where the system can survive the infrequent latency spikes without disconnections or significant traffic loss, which is good. If we can stabilize the system in this state and stop the memory increase, then I think it is an acceptable status for releasing, and we can work on improvements in the next releases.
v1.13.1-rc.3
datapath latency picture
v1.13.1-rc.3 memory usage
forwarder memory consumption in high load <= v1.13.2:
forwarder memory consumption in high load tinden/cmd-forwarder-vpp:v1.13.2-fix.3
acceptable memory diff for the nsmgr after 27h of running.
To make the process of release delivery more clear, transparent, safe, and stable, we should define the main release quality targets and the definition of done.
At this moment, it could be