Open rukai opened 1 year ago
Yes we should!!
On Thu, Jan 5, 2023 at 11:37 PM Lucas Kent @.***> wrote:
I was reading a blog post during my holiday and realized that we set tcp nodelay for client<->shotover connections but we dont set it for shotover <-> DB connections. Should we?
— Reply to this email directly, view it on GitHub https://github.com/shotover/shotover-proxy/issues/985, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACG2BNZR5RPMTZJUJNC6MP3WQ2P5PANCNFSM6AAAAAATRZOARI . You are receiving this because you are subscribed to this thread.Message ID: @.***>
I wrote a benchmark that puts shotover as the bottleneck between the client and database. I observed that:
So seems we should just leave it as is. I am quite puzzled by these results though.
What does it do to latency though?
On Wed, Jan 25, 2023 at 8:06 PM Lucas Kent @.***> wrote:
I wrote a benchmark that puts shotover as the bottleneck between the client and database. I observed that: Removing the client<->shotover nodelay halves our throughput. Adding a shotover<->DB nodelay drops our throughput by about ~10%
So seems we should just leave it as is. I am quite puzzled by these results though.
— Reply to this email directly, view it on GitHub https://github.com/shotover/shotover-proxy/issues/985#issuecomment-1403181251, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACG2BN5JP5RHUO5VYYRESADWUDGH3ANCNFSM6AAAAAATRZOARI . You are receiving this because you commented.Message ID: @.***>
Tried enabling outgoing tcpnodelay again today, it seems to improve cassandra latency and regress redis latency. I set a cap to operations per second to avoid the noise I have not yet eliminated from windsock.
redis (left nodelay=false, right nodelay=true)
cassandra (left nodelay=false, right nodelay=true)
We should retry this since the recent performance improvements
I got nagle'd on a hobby project just now and it gave me the insight I needed to know how to proceed here.
We wont see performance improvement on our current integration benchmarks because they are all stress tests. We are stuffing as much data through 1 or 100 connections as possible. This wont hit issues with tcp_nodelay enabled since constantly pushing more data in will force the packet out into the network and no delay will occur.
However if we set up a benchmark where we send 1 request every second on a single connection, the latency results of that benchmark should demonstrate tcp_nodelay causing us issues!
This morning I spent some time looking into this. I set up a benchmark as I described above with 1 small request per second. However this did not reproduce the issues of nagle's algorithm at all. Whether i had tcp_nodelay enabled did not affect this benchmark.
And on closer analysis:
4 * 0.15 = 0.6
so we can reasonably conclude that all of our time is being taken up by the network hops.Nagle's algorithm states that small packets should be delayed until we get an ACK back from the server. So if nagle's was causing issues here we should be seeing 2 extra network hops of latency, giving us a latency of 0.9 ms
So all in all I'm really quite confused why we arent seeing any issues from nagle's algorithm being enabled. :confused:
I was reading a blog post during my holiday and realized that we set tcp nodelay for client<->shotover connections but we dont set it for shotover <-> DB connections. Should we?