cloud-bulldozer / e2e-benchmarking

Performance Tests for end Platforms
Apache License 2.0

new workload router-perf-v3 #406

Open mukrishn opened 2 years ago

mukrishn commented 2 years ago

We run the router-perf-v2 workload for data-plane performance testing, and it needs some enhancements to account for different behavior on managed-service OCP. The workload creates pods and generates traffic from within the cluster (from a hostnet pod). On AWS OpenShift, HTTP traffic from the client is routed to an external load balancer VIP and then back into the cluster. Other platforms take a completely different approach: on GCP and Azure, client traffic is routed to the corresponding service clusterIP by an iptables DNAT rule, so it never leaves the cluster before reaching the server.

It is probably worth spending some effort on a new workload, router-perf-v3 (or an add-on to v2), that runs the client from an external source to replicate a real-world scenario. The consistency of results would be affected by known external variables (LB, client resources, client location, cloud variability), but the workload would behave the same way on every platform, which makes results easier to compare between them.

mukrishn commented 1 year ago

Recently, we have been noticing inconsistent results when running router-perf-v2 on ROSA/AWS. It is high time to redesign this workload or add a new HTTP benchmark tool (see Slack convo).

mukrishn commented 1 year ago

The inconsistency in latency is due to an issue in the mb tool's reporting logic: it records a wrong value when the response is non-200 with socket_read(): connection error. Latency is calculated as the delta between the request and response timestamps, but in this case the calculation uses a wrong timestamp (0), which skews the P99, P90 and average latency.
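
To illustrate the effect, here is a minimal Python sketch of how a zero timestamp can distort the aggregates and how filtering such samples restores them. It is not mb's actual code or output format: the (request_ts, response_ts) tuples in microseconds, the sample values, the choice of which timestamp is zero, and the helper names compute_latencies and percentile are all assumptions made for illustration.

```python
from statistics import mean

def compute_latencies(samples, drop_invalid):
    """Return response_ts - request_ts for each sample (microseconds).

    Assumption: a failed request shows up with one timestamp set to 0.
    With drop_invalid=True those samples are skipped instead of measured.
    """
    result = []
    for request_ts, response_ts in samples:
        if drop_invalid and (request_ts == 0 or response_ts == 0):
            continue
        result.append(response_ts - request_ts)
    return result

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list, using integer math."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceil(pct * n / 100)
    return ordered[rank - 1]

# Hypothetical data: 98 healthy requests that each take 2000 us,
# plus 2 failed requests recorded with a zero request timestamp.
base = 1_700_000_000_000_000
good = [(base + i * 10_000, base + i * 10_000 + 2_000) for i in range(98)]
bad = [(0, base + 5), (0, base + 6)]
samples = good + bad

raw = compute_latencies(samples, drop_invalid=False)
clean = compute_latencies(samples, drop_invalid=True)

for name, data in (("raw", raw), ("clean", clean)):
    print(name, "avg:", mean(data),
          "P90:", percentile(data, 90),
          "P99:", percentile(data, 99))
```

In this made-up data set only 2% of the samples are invalid, yet they dominate both the average and the P99 of the unfiltered series. A larger share of failed requests would pull P90 up as well.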