pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.52k stars 380 forks source link

A simpler bytes buffer pool #417

Closed puellanivis closed 3 years ago

puellanivis commented 3 years ago

Building on https://github.com/pkg/sftp/pull/415 I did a bit of poking. I found:

puellanivis commented 3 years ago

The allocations go from ~2390 to ~2290. Looks like sync.Pool has a number of internal allocations that happen… might make sense for a longer-lived set of buffers, but for us, dumb is better.

drakkan commented 3 years ago

Hi,

I am very busy for the next few weeks and don't have time to test, however this buffer pool could also replace the server side allocator

greatroar commented 3 years ago

Impressive:

name                       old time/op    new time/op    delta
WriteTo1k-8                  12.7ms ± 2%    12.0ms ± 3%   -5.21%  (p=0.000 n=10+9)
WriteTo16k-8                 12.6ms ± 3%    12.0ms ± 3%   -4.41%  (p=0.000 n=10+10)
WriteTo32k-8                 12.6ms ± 3%    12.1ms ± 4%   -4.19%  (p=0.000 n=10+10)
WriteTo128k-8                12.5ms ± 3%    12.1ms ± 2%   -2.96%  (p=0.003 n=10+10)
WriteTo512k-8                12.6ms ± 2%    12.1ms ± 1%   -4.35%  (p=0.000 n=10+8)
WriteTo1MiB-8                12.6ms ± 3%    12.1ms ± 3%   -3.85%  (p=0.000 n=10+10)
WriteTo4MiB-8                12.6ms ± 4%    12.1ms ± 3%   -4.25%  (p=0.000 n=10+9)
WriteTo4MiBDelay10Msec-8      109ms ± 1%     109ms ± 1%     ~     (p=0.739 n=10+10)
WriteTo4MiBDelay50Msec-8      508ms ± 0%     509ms ± 0%     ~     (p=0.133 n=9+10)
WriteTo4MiBDelay150Msec-8     1.51s ± 0%     1.51s ± 0%     ~     (p=0.113 n=9+10)

name                       old speed      new speed      delta
WriteTo1k-8                 828MB/s ± 2%   870MB/s ± 2%   +5.10%  (p=0.000 n=10+8)
WriteTo16k-8                833MB/s ± 3%   872MB/s ± 3%   +4.61%  (p=0.000 n=10+10)
WriteTo32k-8                828MB/s ± 2%   868MB/s ± 4%   +4.80%  (p=0.000 n=9+10)
WriteTo128k-8               839MB/s ± 3%   865MB/s ± 2%   +3.05%  (p=0.003 n=10+10)
WriteTo512k-8               831MB/s ± 2%   869MB/s ± 1%   +4.53%  (p=0.000 n=10+8)
WriteTo1MiB-8               832MB/s ± 3%   865MB/s ± 3%   +4.00%  (p=0.000 n=10+10)
WriteTo4MiB-8               830MB/s ± 4%   867MB/s ± 3%   +4.42%  (p=0.000 n=10+9)
WriteTo4MiBDelay10Msec-8   96.5MB/s ± 1%  96.5MB/s ± 1%     ~     (p=0.984 n=10+9)
WriteTo4MiBDelay50Msec-8   20.6MB/s ± 0%  20.6MB/s ± 0%     ~     (p=0.118 n=9+10)
WriteTo4MiBDelay150Msec-8  6.94MB/s ± 0%  6.94MB/s ± 0%     ~     (all equal)

name                       old alloc/op   new alloc/op   delta
WriteTo1k-8                  15.8MB ± 0%    15.7MB ± 0%   -1.16%  (p=0.000 n=10+10)
WriteTo16k-8                 15.9MB ± 0%    15.7MB ± 0%   -1.16%  (p=0.000 n=10+6)
WriteTo32k-8                 15.8MB ± 0%    15.7MB ± 0%   -1.18%  (p=0.000 n=10+10)
WriteTo128k-8                15.8MB ± 0%    15.7MB ± 0%   -1.13%  (p=0.000 n=10+10)
WriteTo512k-8                15.8MB ± 0%    15.7MB ± 0%   -1.16%  (p=0.000 n=8+10)
WriteTo1MiB-8                15.8MB ± 0%    15.7MB ± 0%   -1.17%  (p=0.000 n=10+8)
WriteTo4MiB-8                15.9MB ± 0%    15.7MB ± 0%   -1.17%  (p=0.000 n=9+10)
WriteTo4MiBDelay10Msec-8     18.8MB ± 0%    18.7MB ± 0%   -0.97%  (p=0.000 n=10+10)
WriteTo4MiBDelay50Msec-8     32.1MB ± 0%    31.9MB ± 0%   -0.56%  (p=0.000 n=10+10)
WriteTo4MiBDelay150Msec-8    48.8MB ± 0%    48.5MB ± 0%   -0.50%  (p=0.000 n=7+10)

name                       old allocs/op  new allocs/op  delta
WriteTo1k-8                   2.71k ± 0%     2.29k ± 0%  -15.49%  (p=0.000 n=10+10)
WriteTo16k-8                  2.71k ± 0%     2.29k ± 0%  -15.48%  (p=0.000 n=10+10)
WriteTo32k-8                  2.71k ± 0%     2.29k ± 0%  -15.44%  (p=0.000 n=10+10)
WriteTo128k-8                 2.71k ± 0%     2.29k ± 0%  -15.52%  (p=0.000 n=10+10)
WriteTo512k-8                 2.71k ± 0%     2.29k ± 0%  -15.49%  (p=0.000 n=10+9)
WriteTo1MiB-8                 2.71k ± 0%     2.29k ± 0%  -15.50%  (p=0.000 n=10+10)
WriteTo4MiB-8                 2.71k ± 0%     2.29k ± 0%  -15.45%  (p=0.000 n=9+10)
WriteTo4MiBDelay10Msec-8      3.11k ± 0%     2.68k ± 0%  -13.67%  (p=0.000 n=9+10)
WriteTo4MiBDelay50Msec-8      3.13k ± 1%     2.71k ± 1%  -13.56%  (p=0.000 n=9+10)
WriteTo4MiBDelay150Msec-8     3.14k ± 0%     2.71k ± 1%  -13.64%  (p=0.000 n=8+10)

That's compared to master.

puellanivis commented 3 years ago

Hi,

I am very busy for the next few weeks and don't have time to test, however this buffer pool could also replace the server side allocator

Yeah, I took a look for more sync.Pool but I didn’t find any right off the bat. But we can look into bringing it into the allocator as a separate PR.