To demonstrate the improvement, I ran read-only traffic on an already prefilled datastore ("debug populate 10000000 key 1000").
The traffic has a 100% miss rate in order to zoom in on the flow handled by this PR: looking up a key in the dashtable.
For the same reason, I used pipelining, both to reduce the impact of networking CPU on the server side and to make the workload more memory-intensive.
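A minimal sketch of how such a run might be reproduced, assuming memtier_benchmark as the load generator (the PR text does not name the tool, and all flag values here are illustrative, not the exact ones used):

```shell
# Prefill the datastore so lookups traverse a large dashtable,
# as described above (10M keys, 1000-byte values).
redis-cli debug populate 10000000 key 1000

# Read-only traffic (--ratio=0:1 means 0 SETs per 1 GET) against a key
# prefix absent from the prefilled keyspace, giving a 100% miss rate.
# --pipeline batches requests to cut per-request networking cost on the
# server side and keep the workload memory-bound.
memtier_benchmark \
  --ratio=0:1 \
  --key-prefix="miss:" --key-maximum=10000000 \
  --pipeline=30 \
  --threads=8 --clients=10 \
  --test-time=60
```

This requires a running server on the default port; adjust `--threads`/`--clients` to the client machine.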
This improvement:
Reduces the running time by 12% (or increases the average QPS by 13%).
Credit for the idea: https://valkey.io/blog/unlock-one-million-rps/
Detailed runs, in more detail:
Before this change:
With this change: