pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.52k stars 380 forks source link

Avoid deadlock between Close when offset-modifying operation is still pending #604

Closed puellanivis closed 1 week ago

puellanivis commented 1 week ago

There are a few fixes mixed in with solely addressing the deadlock. I still need to test this, but I needed a branch to test.

The change has to plumb a channel through to clientConn where it it performs the actual writing of the request to the outbound writer. As a result, I’m unsure how feasible it would be to get the same sort of logic rolled out on v1. I know where I would start, and that’s the new timebox for after when I test this one. If it becomes a giant mess of tangles, I’ll back out, and we’ll accept that the previous behavior “just happened to work, but was never guaranteed to work.”

fixes: #603

puellanivis commented 1 week ago

Compared to current dev-v2:

goos: linux
goarch: amd64
pkg: github.com/pkg/sftp/v2/localfs
cpu: 13th Gen Intel(R) Core(TM) i9-13900KF
                 │   old.log    │               new.log               │
                 │    sec/op    │    sec/op     vs base               │
WriteTo1KiB-32     123.3µ ±  5%   118.0µ ±  2%  -4.25% (p=0.000 n=10)
WriteTo1MiB-32     333.6µ ±  1%   327.1µ ±  2%  -1.96% (p=0.011 n=10)
WriteTo4MiB-32     941.8µ ±  7%   881.2µ ±  1%  -6.44% (p=0.043 n=10)
WriteTo10MiB-32    2.592m ±  5%   2.466m ±  4%  -4.86% (p=0.002 n=10)
WriteTo64MiB-32    17.66m ±  2%   17.23m ±  4%  -2.46% (p=0.023 n=10)
ReadFrom1KiB-32    205.3µ ± 26%   185.9µ ± 17%       ~ (p=0.631 n=10)
ReadFrom1MiB-32    968.1µ ±  3%   949.0µ ±  2%       ~ (p=0.063 n=10)
ReadFrom4MiB-32    3.302m ±  2%   3.280m ±  1%       ~ (p=0.105 n=10)
ReadFrom10MiB-32   8.066m ±  5%   7.989m ±  3%       ~ (p=0.123 n=10)
ReadFrom64MiB-32   42.75m ±  2%   41.73m ±  3%       ~ (p=0.063 n=10)
geomean            1.821m         1.756m        -3.57%

                 │    old.log    │               new.log                │
                 │      B/s      │      B/s       vs base               │
WriteTo1KiB-32     7.920Mi ±  5%   8.273Mi ±  2%  +4.46% (p=0.000 n=10)
WriteTo1MiB-32     2.927Gi ±  1%   2.986Gi ±  2%  +2.00% (p=0.011 n=10)
WriteTo4MiB-32     4.148Gi ±  8%   4.433Gi ±  1%  +6.87% (p=0.043 n=10)
WriteTo10MiB-32    3.769Gi ±  5%   3.960Gi ±  4%  +5.07% (p=0.002 n=10)
WriteTo64MiB-32    3.539Gi ±  2%   3.628Gi ±  4%  +2.52% (p=0.023 n=10)
ReadFrom1KiB-32    4.759Mi ± 35%   5.260Mi ± 17%       ~ (p=0.617 n=10)
ReadFrom1MiB-32    1.009Gi ±  3%   1.029Gi ±  2%       ~ (p=0.063 n=10)
ReadFrom4MiB-32    1.183Gi ±  2%   1.191Gi ±  2%       ~ (p=0.105 n=10)
ReadFrom10MiB-32   1.211Gi ±  5%   1.222Gi ±  3%       ~ (p=0.123 n=10)
ReadFrom64MiB-32   1.462Gi ±  2%   1.498Gi ±  3%       ~ (p=0.063 n=10)
geomean            659.6Mi         684.1Mi        +3.72%

                 │    old.log    │               new.log                │
                 │     B/op      │     B/op       vs base               │
WriteTo1KiB-32     8.157Ki ±  1%   8.283Ki ±  1%  +1.54% (p=0.005 n=10)
WriteTo1MiB-32     42.36Ki ±  0%   42.68Ki ±  1%  +0.75% (p=0.009 n=10)
WriteTo4MiB-32     43.73Ki ±  2%   43.66Ki ±  2%       ~ (p=0.631 n=10)
WriteTo10MiB-32    46.59Ki ±  3%   46.18Ki ±  5%       ~ (p=0.971 n=10)
WriteTo64MiB-32    55.25Ki ± 31%   62.49Ki ± 17%       ~ (p=0.218 n=10)
ReadFrom1KiB-32    3.578Ki ±  0%   3.734Ki ±  0%  +4.37% (p=0.000 n=10)
ReadFrom1MiB-32    3.821Ki ±  0%   3.977Ki ±  0%  +4.06% (p=0.000 n=10)
ReadFrom4MiB-32    30.95Ki ±  4%   30.34Ki ±  5%       ~ (p=0.393 n=10)
ReadFrom10MiB-32   46.45Ki ±  4%   47.24Ki ±  2%       ~ (p=0.256 n=10)
ReadFrom64MiB-32   65.09Ki ±  2%   65.27Ki ±  2%       ~ (p=0.123 n=10)
geomean            23.43Ki         23.94Ki        +2.20%

                 │   old.log   │              new.log               │
                 │  allocs/op  │  allocs/op   vs base               │
WriteTo1KiB-32      122.0 ± 1%    125.0 ± 0%  +2.46% (p=0.000 n=10)
WriteTo1MiB-32      134.0 ± 1%    135.0 ± 1%  +0.75% (p=0.000 n=10)
WriteTo4MiB-32      230.0 ± 0%    231.5 ± 0%  +0.65% (p=0.000 n=10)
WriteTo10MiB-32     428.5 ± 0%    430.0 ± 0%  +0.35% (p=0.000 n=10)
WriteTo64MiB-32    2.162k ± 0%   2.164k ± 0%  +0.09% (p=0.000 n=10)
ReadFrom1KiB-32     39.00 ± 0%    41.00 ± 0%  +5.13% (p=0.000 n=10)
ReadFrom1MiB-32     70.00 ± 0%    72.00 ± 0%  +2.86% (p=0.000 n=10)
ReadFrom4MiB-32     215.5 ± 0%    217.0 ± 0%  +0.70% (p=0.001 n=10)
ReadFrom10MiB-32    423.5 ± 1%    426.0 ± 0%  +0.59% (p=0.004 n=10)
ReadFrom64MiB-32   2.168k ± 0%   2.171k ± 0%  +0.14% (p=0.000 n=10)
geomean             267.6         271.2       +1.36%

Windows shows similar improvements:

goos: windows
goarch: amd64
pkg: github.com/pkg/sftp/v2/localfs
cpu: 13th Gen Intel(R) Core(TM) i9-13900KF
                 │   old.log   │               new.log                │
                 │   sec/op    │    sec/op     vs base                │
WriteTo1KiB-32     284.2µ ± 3%   277.3µ ±  1%   -2.42% (p=0.015 n=10)
WriteTo1MiB-32     964.0µ ± 1%   936.2µ ±  1%   -2.88% (p=0.000 n=10)
WriteTo4MiB-32     2.954m ± 6%   2.876m ±  2%   -2.62% (p=0.000 n=10)
WriteTo10MiB-32    7.163m ± 2%   6.954m ±  2%   -2.92% (p=0.000 n=10)
WriteTo64MiB-32    45.01m ± 1%   43.22m ±  1%   -3.97% (p=0.000 n=10)
ReadFrom1KiB-32    262.9µ ± 3%   254.6µ ±  3%   -3.15% (p=0.015 n=10)
ReadFrom1MiB-32    4.815m ± 7%   3.946m ±  6%  -18.04% (p=0.000 n=10)
ReadFrom4MiB-32    3.755m ± 7%   3.634m ± 10%        ~ (p=0.105 n=10)
ReadFrom10MiB-32   7.562m ± 5%   7.401m ±  9%        ~ (p=0.089 n=10)
ReadFrom64MiB-32   47.42m ± 8%   48.01m ±  8%        ~ (p=0.739 n=10)
geomean            3.671m        3.519m         -4.14%

                 │   old.log    │                new.log                │
                 │     B/s      │      B/s       vs base                │
WriteTo1KiB-32     3.438Mi ± 3%   3.519Mi ±  1%   +2.36% (p=0.014 n=10)
WriteTo1MiB-32     1.013Gi ± 1%   1.043Gi ±  1%   +2.97% (p=0.000 n=10)
WriteTo4MiB-32     1.323Gi ± 5%   1.358Gi ±  2%   +2.69% (p=0.000 n=10)
WriteTo10MiB-32    1.363Gi ± 2%   1.404Gi ±  2%   +3.00% (p=0.000 n=10)
WriteTo64MiB-32    1.389Gi ± 1%   1.446Gi ±  1%   +4.14% (p=0.000 n=10)
ReadFrom1KiB-32    3.715Mi ± 3%   3.834Mi ±  2%   +3.21% (p=0.011 n=10)
ReadFrom1MiB-32    207.7Mi ± 7%   253.4Mi ±  6%  +22.01% (p=0.000 n=10)
ReadFrom4MiB-32    1.041Gi ± 7%   1.075Gi ± 11%        ~ (p=0.105 n=10)
ReadFrom10MiB-32   1.291Gi ± 5%   1.319Gi ± 10%        ~ (p=0.089 n=10)
ReadFrom64MiB-32   1.318Gi ± 8%   1.302Gi ±  7%        ~ (p=0.739 n=10)
geomean            327.2Mi        341.3Mi         +4.30%

                 │    old.log    │                new.log                │
                 │     B/op      │     B/op       vs base                │
WriteTo1KiB-32     15.75Ki ±  2%   16.02Ki ±  2%        ~ (p=0.065 n=10)
WriteTo1MiB-32     41.66Ki ±  0%   42.15Ki ±  1%   +1.17% (p=0.015 n=10)
WriteTo4MiB-32     42.76Ki ±  3%   43.31Ki ±  3%        ~ (p=0.353 n=10)
WriteTo10MiB-32    44.68Ki ±  4%   44.45Ki ±  5%        ~ (p=0.971 n=10)
WriteTo64MiB-32    55.40Ki ± 24%   53.43Ki ± 17%        ~ (p=0.853 n=10)
ReadFrom1KiB-32    4.375Ki ±  0%   4.531Ki ±  0%   +3.57% (p=0.000 n=10)
ReadFrom1MiB-32    4.694Ki ±  0%   4.826Ki ±  0%   +2.80% (p=0.000 n=10)
ReadFrom4MiB-32    7.826Ki ± 12%   9.029Ki ± 12%  +15.37% (p=0.001 n=10)
ReadFrom10MiB-32   15.30Ki ± 19%   19.10Ki ± 12%  +24.87% (p=0.000 n=10)
ReadFrom64MiB-32   52.15Ki ± 10%   61.67Ki ±  8%  +18.25% (p=0.000 n=10)
geomean            19.72Ki         20.93Ki         +6.14%

                 │   old.log   │              new.log               │
                 │  allocs/op  │  allocs/op   vs base               │
WriteTo1KiB-32      94.00 ± 1%    95.00 ± 0%  +1.06% (p=0.001 n=10)
WriteTo1MiB-32      142.0 ± 0%    144.0 ± 0%  +1.41% (p=0.000 n=10)
WriteTo4MiB-32      238.0 ± 0%    241.0 ± 0%  +1.26% (p=0.000 n=10)
WriteTo10MiB-32     431.0 ± 0%    433.5 ± 0%  +0.58% (p=0.000 n=10)
WriteTo64MiB-32    2.167k ± 0%   2.170k ± 0%  +0.12% (p=0.000 n=10)
ReadFrom1KiB-32     41.00 ± 0%    43.00 ± 0%  +4.88% (p=0.000 n=10)
ReadFrom1MiB-32     73.00 ± 1%    74.00 ± 0%  +1.37% (p=0.000 n=10)
ReadFrom4MiB-32     177.0 ± 1%    181.0 ± 1%  +2.26% (p=0.000 n=10)
ReadFrom10MiB-32    383.5 ± 1%    391.0 ± 1%  +1.96% (p=0.000 n=10)
ReadFrom64MiB-32   2.157k ± 0%   2.165k ± 0%  +0.37% (p=0.000 n=10)
geomean             257.9         261.8       +1.52%

(This is the same machine WSL vs native windows running from WSL… and the tempdir for windows is actually on a windows drive, so it’s not like it’s having to translate from a Windows //WSL$/... path to access the test file… and, queue the intro to flame war about how bad Windows performance is.)

puellanivis commented 1 week ago

Performance on macOS ARM is pretty much equivalent:

goos: darwin
goarch: arm64
pkg: github.com/pkg/sftp/v2/localfs
cpu: Apple M1 Pro
                 │   old.log   │              new.log               │
                 │   sec/op    │   sec/op     vs base               │
WriteTo1KiB-10     390.4µ ± 2%   394.3µ ± 3%       ~ (p=0.280 n=10)
WriteTo1MiB-10     773.7µ ± 3%   780.7µ ± 2%       ~ (p=0.315 n=10)
WriteTo4MiB-10     1.760m ± 2%   1.790m ± 3%       ~ (p=0.190 n=10)
WriteTo10MiB-10    3.770m ± 1%   3.831m ± 3%  +1.60% (p=0.001 n=10)
WriteTo64MiB-10    21.19m ± 3%   21.00m ± 2%       ~ (p=0.218 n=10)
ReadFrom1KiB-10    141.0µ ± 2%   142.9µ ± 2%       ~ (p=0.353 n=10)
ReadFrom1MiB-10    715.1µ ± 1%   709.6µ ± 2%       ~ (p=0.315 n=10)
ReadFrom4MiB-10    2.474m ± 1%   2.495m ± 1%       ~ (p=0.165 n=10)
ReadFrom10MiB-10   5.110m ± 1%   5.131m ± 2%       ~ (p=0.739 n=10)
ReadFrom64MiB-10   33.35m ± 5%   32.78m ± 4%       ~ (p=0.315 n=10)
geomean            2.117m        2.126m       +0.43%

                 │   old.log    │               new.log               │
                 │     B/s      │     B/s       vs base               │
WriteTo1KiB-10     2.503Mi ± 2%   2.480Mi ± 3%       ~ (p=0.224 n=10)
WriteTo1MiB-10     1.262Gi ± 3%   1.251Gi ± 2%       ~ (p=0.315 n=10)
WriteTo4MiB-10     2.219Gi ± 2%   2.183Gi ± 3%       ~ (p=0.190 n=10)
WriteTo10MiB-10    2.590Gi ± 1%   2.549Gi ± 3%  -1.58% (p=0.001 n=10)
WriteTo64MiB-10    2.949Gi ± 3%   2.976Gi ± 2%       ~ (p=0.218 n=10)
ReadFrom1KiB-10    6.928Mi ± 2%   6.838Mi ± 2%       ~ (p=0.323 n=10)
ReadFrom1MiB-10    1.366Gi ± 1%   1.376Gi ± 2%       ~ (p=0.315 n=10)
ReadFrom4MiB-10    1.579Gi ± 1%   1.566Gi ± 1%       ~ (p=0.165 n=10)
ReadFrom10MiB-10   1.911Gi ± 1%   1.903Gi ± 3%       ~ (p=0.739 n=10)
ReadFrom64MiB-10   1.874Gi ± 5%   1.907Gi ± 4%       ~ (p=0.315 n=10)
geomean            567.5Mi        565.1Mi       -0.42%

                 │    old.log    │               new.log                │
                 │     B/op      │     B/op       vs base               │
WriteTo1KiB-10     12.00Ki ±  4%   12.52Ki ±  2%  +4.35% (p=0.004 n=10)
WriteTo1MiB-10     41.48Ki ±  1%   42.02Ki ±  2%       ~ (p=0.063 n=10)
WriteTo4MiB-10     42.31Ki ±  3%   42.38Ki ±  2%       ~ (p=0.971 n=10)
WriteTo10MiB-10    43.62Ki ±  7%   43.88Ki ±  6%       ~ (p=0.529 n=10)
WriteTo64MiB-10    56.01Ki ± 13%   56.65Ki ± 24%       ~ (p=0.481 n=10)
ReadFrom1KiB-10    3.672Ki ±  0%   3.828Ki ±  0%  +4.26% (p=0.000 n=10)
ReadFrom1MiB-10    3.926Ki ±  0%   4.070Ki ±  0%  +3.67% (p=0.000 n=10)
ReadFrom4MiB-10    5.093Ki ±  6%   5.019Ki ±  5%       ~ (p=0.724 n=10)
ReadFrom10MiB-10   7.941Ki ±  7%   8.606Ki ±  7%  +8.37% (p=0.015 n=10)
ReadFrom64MiB-10   27.58Ki ± 15%   28.09Ki ±  8%       ~ (p=0.436 n=10)
geomean            15.55Ki         15.92Ki        +2.39%

                 │   old.log   │              new.log               │
                 │  allocs/op  │  allocs/op   vs base               │
WriteTo1KiB-10      108.0 ± 1%    109.5 ± 0%  +1.39% (p=0.000 n=10)
WriteTo1MiB-10      139.0 ± 0%    141.0 ± 0%  +1.44% (p=0.000 n=10)
WriteTo4MiB-10      235.0 ± 0%    237.5 ± 0%  +1.06% (p=0.000 n=10)
WriteTo10MiB-10     429.0 ± 0%    431.0 ± 0%  +0.47% (p=0.000 n=10)
WriteTo64MiB-10    2.161k ± 0%   2.163k ± 0%  +0.09% (p=0.000 n=10)
ReadFrom1KiB-10     39.00 ± 0%    41.00 ± 0%  +5.13% (p=0.000 n=10)
ReadFrom1MiB-10     70.00 ± 0%    72.00 ± 0%  +2.86% (p=0.000 n=10)
ReadFrom4MiB-10     171.0 ± 2%    169.5 ± 3%       ~ (p=0.992 n=10)
ReadFrom10MiB-10    372.0 ± 0%    375.0 ± 1%  +0.81% (p=0.000 n=10)
ReadFrom64MiB-10   2.123k ± 0%   2.123k ± 0%       ~ (p=0.641 n=10)
geomean             255.9         259.1       +1.22%