pires / go-proxyproto

A Go library implementation of the PROXY protocol, versions 1 and 2.
Apache License 2.0

Implement io.ReaderFrom/WriterTo for Conn #68

Closed · databus23 closed this 3 years ago

databus23 commented 3 years ago

This change increases performance when proxying wrapped connections with io.Copy. Since Go 1.11, copying between TCP connections uses the splice system call on Linux, yielding considerable performance improvements. See: https://golang.org/doc/go1.11#net
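The mechanism, in rough strokes (a minimal sketch under assumptions, not this PR's actual code; `wrappedConn` is a hypothetical stand-in for `proxyproto.Conn`): once the PROXY header has been handled, ReadFrom and WriteTo can hand io.Copy the underlying connection, so the standard library sees a raw *net.TCPConn on at least one side and can take the splice path.

```go
package proxywrap

import (
	"io"
	"net"
)

// wrappedConn is a hypothetical stand-in for a PROXY-protocol connection
// wrapper: it embeds the underlying connection (typically a *net.TCPConn).
type wrappedConn struct {
	net.Conn
}

// ReadFrom forwards to the underlying connection's ReadFrom when available,
// so io.Copy(wrapped, src) reaches (*net.TCPConn).ReadFrom and can use
// splice/sendfile on Linux instead of copying through a userspace buffer.
func (c *wrappedConn) ReadFrom(r io.Reader) (int64, error) {
	if rf, ok := c.Conn.(io.ReaderFrom); ok {
		return rf.ReadFrom(r)
	}
	return io.Copy(c.Conn, r) // generic fallback
}

// WriteTo exposes the raw underlying connection as the copy source, so
// io.Copy(dst, wrapped) ends up as dst.ReadFrom(rawTCPConn) and can splice.
// A real implementation must first drain any bytes that were buffered while
// parsing the PROXY header before handing over the raw connection.
func (c *wrappedConn) WriteTo(w io.Writer) (int64, error) {
	return io.Copy(w, c.Conn)
}
```

With that in place, a plain proxy loop such as `io.Copy(backendConn, clientConn)` stays on the kernel fast path when both endpoints are TCP connections on Linux.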

Signed-off-by: Fabian Ruff fabian.ruff@sap.com

coveralls commented 3 years ago


Coverage decreased (-0.05%) to 94.177% when pulling ce594191722666b34cbe56f50e960fc0ed8e7996 on databus23:readerfrom-writerto into fff0abf66729bfbb17e1e0f9131c43644d576225 on pires:main.

pires commented 3 years ago

Thank you, Fabian. Can you please add example tests, so others have code to learn from, while keeping up the current code coverage?

pires commented 3 years ago

Maybe even add benchmark tests to measure the actual performance gains?

databus23 commented 3 years ago

@pires I added some tests to retain code coverage. As requested, I also added a simple benchmark for the TCP proxy use case I'm seeking to optimise (a rough sketch of the benchmark's shape is at the end of this comment).

> go test -run=XXX -bench=Bench -count 5 -benchmem > old.txt
> go test -run=XXX -bench=Bench -count 5 -benchmem > new.txt
> benchstat old.txt new.txt

name              old time/op    new time/op    delta
TCPProxy16KB-8       458µs ± 4%     454µs ±11%     ~     (p=0.690 n=5+5)
TCPProxy32KB-8       465µs ± 3%     468µs ±15%     ~     (p=0.690 n=5+5)
TCPProxy64KB-8       505µs ± 2%     464µs ± 8%   -8.16%  (p=0.016 n=5+5)
TCPProxy128KB-8     1.11ms ±52%    0.54ms ± 6%  -50.91%  (p=0.008 n=5+5)
TCPProxy256KB-8      823µs ± 5%     628µs ±10%  -23.74%  (p=0.016 n=4+5)
TCPProxy512KB-8     1.17ms ± 9%    0.79ms ± 9%  -32.99%  (p=0.008 n=5+5)
TCPProxy1024KB-8    1.89ms ±19%    1.11ms ± 5%  -41.15%  (p=0.008 n=5+5)
TCPProxy2048KB-8    2.86ms ± 7%    1.67ms ± 6%  -41.46%  (p=0.008 n=5+5)

name              old alloc/op   new alloc/op   delta
TCPProxy16KB-8      72.2kB ± 0%     6.5kB ± 0%  -90.98%  (p=0.008 n=5+5)
TCPProxy32KB-8      72.2kB ± 0%     6.5kB ± 0%  -90.98%  (p=0.008 n=5+5)
TCPProxy64KB-8      72.2kB ± 0%     6.5kB ± 0%  -90.98%  (p=0.008 n=5+5)
TCPProxy128KB-8     72.2kB ± 0%     6.5kB ± 0%  -90.98%  (p=0.008 n=5+5)
TCPProxy256KB-8     72.2kB ± 0%     6.5kB ± 0%  -90.97%  (p=0.008 n=5+5)
TCPProxy512KB-8     72.2kB ± 0%     6.5kB ± 0%  -90.97%  (p=0.008 n=5+5)
TCPProxy1024KB-8    72.2kB ± 0%     6.5kB ± 0%  -90.96%  (p=0.008 n=5+5)
TCPProxy2048KB-8    72.2kB ± 0%     6.5kB ± 0%  -90.97%  (p=0.008 n=5+5)

name              old allocs/op  new allocs/op  delta
TCPProxy16KB-8        65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy32KB-8        65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy64KB-8        65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy128KB-8       65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy256KB-8       65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy512KB-8       65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy1024KB-8      65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)
TCPProxy2048KB-8      65.0 ± 0%      61.0 ± 0%   -6.15%  (p=0.008 n=5+5)

I ran those benchmarks in Docker on macOS, so there is a VM involved, which might introduce some noise. The results I'm getting are pretty consistent, though, and are in line with the gains originally reported when the splice optimisation was introduced: https://github.com/golang/go/issues/10948#issuecomment-105753908
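For reference, a minimal, self-contained sketch of such a TCP proxy benchmark could look roughly like the following (the name BenchmarkTCPProxyCopy, the 64 KiB payload, and the loopback setup are illustrative assumptions, not the code added in this PR; the PROXY-protocol wrapping itself is omitted):

```go
package proxywrap

import (
	"bytes"
	"io"
	"net"
	"testing"
)

// BenchmarkTCPProxyCopy is a rough sketch of a TCP proxy benchmark: a proxy
// goroutine copies between two TCP connections with io.Copy, which is the
// path that can use splice once ReadFrom/WriteTo forward to the underlying
// *net.TCPConn.
func BenchmarkTCPProxyCopy(b *testing.B) {
	payload := make([]byte, 64<<10) // 64 KiB written per iteration

	// Backend: drains everything it receives.
	backend, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		b.Fatal(err)
	}
	defer backend.Close()
	go func() {
		for {
			c, err := backend.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) { defer c.Close(); io.Copy(io.Discard, c) }(c)
		}
	}()

	// Proxy: accepts a client connection and forwards it to the backend.
	proxy, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		b.Fatal(err)
	}
	defer proxy.Close()
	go func() {
		for {
			in, err := proxy.Accept()
			if err != nil {
				return
			}
			out, err := net.Dial("tcp", backend.Addr().String())
			if err != nil {
				return
			}
			// In the real benchmark, `in` would be a PROXY-protocol wrapped
			// connection; this io.Copy between two TCP conns is where
			// splice kicks in.
			go func() { defer in.Close(); defer out.Close(); io.Copy(out, in) }()
		}
	}()

	client, err := net.Dial("tcp", proxy.Addr().String())
	if err != nil {
		b.Fatal(err)
	}
	defer client.Close()

	b.SetBytes(int64(len(payload)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := io.Copy(client, bytes.NewReader(payload)); err != nil {
			b.Fatal(err)
		}
	}
}
```

Saved in a `_test.go` file, it can be run on `main` and on this branch with the `go test -bench` / `benchstat` commands above to get a before/after comparison.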

pires commented 3 years ago

Thanks a lot!!!