pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.5k stars 380 forks source link

Slow downloads compared to FileZilla #15

Closed vanbroup closed 8 years ago

vanbroup commented 10 years ago

I noticed some serious difference when comparing the download speed via this package with the same download via FileZilla.

When downloading with Filezilla I can download at 30Mpbs+ with 10 concurrent files (over a VPN), but when downloading with the GO program directly on a high speed internet connection I can't get above the 20Mpbs with 16 concurrent downloads (or about 10 with 10 concurrent).

The readme already lists that currently all traffic with the server is serialized and I can imagine that this is influencing the download performance. But maybe also other factors as packet size and or compression can play a role here.

The code that is facilitating the download:

log.Printf("[%s] Starting to download", f)

// check if file already exists
if _, err := os.Stat(file); !os.IsNotExist(err) {
    log.Printf("[%s] Already downloaded", f)
    return
}

// open local file to save download
out, err := os.Create(file)
if err != nil {
    log.Printf("[%s] Error (%s)", f, err.Error())
    return
}
defer out.Close()

// open file to download
dl, err := s.Open(f)
if err != nil {
    log.Fatal(err)
}
defer dl.Close()

// copy the bytes
nb, err := io.Copy(out, dl)
if err != nil {
    log.Println("error downloading file", err)

    // delete
    err = os.Remove(file)
    if err != nil {
        log.Println("error deleting file after download error", err)
    }
    return
}

log.Printf("[%s] Download completed (%d bytes)", f, nb)
davecheney commented 9 years ago

https://code.google.com/p/go/source/detail?repo=crypto&r=0c9c1f312aa93419c7516eeecff7d7f36f7fd445 landed in the go.crypto repo recently. I don't expect it to cure all ills, but I'd be interested in hearing if it has any impact on your benchmarks.

On Fri, Sep 12, 2014 at 5:11 PM, Paul van Brouwershaven < notifications@github.com> wrote:

I noticed some serious difference when comparing the download speed via this package with the same download via FileZilla.

When downloading with Filezilla I can download at 30Mpbs+ with 10 concurrent files (over a VPN), but when downloading with the GO program directly on a high speed internet connection I can't get above the 20Mpbs with 16 concurrent downloads (or about 10 with 10 concurrent).

The readme already lists that currently all traffic with the server is serialized and I can imagine that this is influencing the download performance. But maybe also other factors as packet size and or compression can play a role here.

The code that is facilitating the download:

log.Printf("[%s] Starting to download", f)

// check if file already exists if _, err := os.Stat(file); !os.IsNotExist(err) { log.Printf("[%s] Already downloaded", f) return }

// open local file to save download out, err := os.Create(file) if err != nil { log.Printf("[%s] Error (%s)", f, err.Error()) return } defer out.Close()

// open file to download dl, err := s.Open(f) if err != nil { log.Fatal(err) } defer dl.Close()

// copy the bytes nb, err := io.Copy(out, dl) if err != nil { log.Println("error downloading file", err)

// delete
err = os.Remove(file)
if err != nil {
    log.Println("error deleting file after download error", err)
}
return

}

log.Printf("[%s] Download completed (%d bytes)", f, nb)

— Reply to this email directly or view it on GitHub https://github.com/pkg/sftp/issues/15.

vanbroup commented 9 years ago

Updated the crypto ssh library, unfortunately this doesn't make a huge difference. I can see about 4 Mbps improvement for longer running jobs.

I have attached the pprof graphs of the program running before and after the crypto/ssh upgrade.

Before crypto/ssh upgrade:

sftp-before-update

Total: 1358 samples
     245  18.0%  18.0%      300  22.1% type..eq.syscall.SockaddrLinklayer
     133   9.8%  27.8%      133   9.8% crypto/sha1.block
     108   8.0%  35.8%      108   8.0% crypto/aes.NewCipher
      80   5.9%  41.7%       80   5.9% time.now
      77   5.7%  47.3%       77   5.7% runtime.memmove
      55   4.1%  51.4%      203  14.9% crypto/aes.decryptBlockAsm
      46   3.4%  54.8%       46   3.4% runtime.mmap
      42   3.1%  57.9%       42   3.1% crypto/sha1.(*digest).checkSum
      39   2.9%  60.8%      147  10.8% crypto/aes.expandKeyGo
      32   2.4%  63.1%       32   2.4% runtime.memclr
      25   1.8%  64.9%       28   2.1% runtime.MSpan_Sweep
      25   1.8%  66.8%       25   1.8% runtime.duffcopy
      25   1.8%  68.6%       25   1.8% scanblock
      24   1.8%  70.4%       24   1.8% crypto/cipher.(*gcm).counterCrypt
      14   1.0%  71.4%       14   1.0% type..eq.crypto/sha1.digest
      13   1.0%  72.4%       13   1.0% crypto/sha1.(*digest).Sum
      13   1.0%  73.3%       13   1.0% runtime.symtabinit
      12   0.9%  74.2%       12   0.9% ExternalCode
      12   0.9%  75.1%       12   0.9% math/big.(*nat).expNNWindowed
      12   0.9%  76.0%       12   0.9% scanbitvector

After crypto/ssh upgrade:

sftp-after-update

Total: 1281 samples
     244  19.0%  19.0%      297  23.2% syscall.Syscall
     181  14.1%  33.2%      181  14.1% crypto/sha1.block
      92   7.2%  40.4%       92   7.2% crypto/aes.encryptBlockAsm
      77   6.0%  46.4%       77   6.0% runtime.futex
      71   5.5%  51.9%      201  15.7% crypto/cipher.(*ctr).refill
      67   5.2%  57.1%       67   5.2% runtime.usleep
      43   3.4%  60.5%       43   3.4% runtime.epollwait
      35   2.7%  63.2%       35   2.7% runtime.memmove
      25   2.0%  65.2%       29   2.3% runtime.MSpan_Sweep
      22   1.7%  66.9%      129  10.1% crypto/aes.(*aesCipher).Encrypt
      21   1.6%  68.5%       21   1.6% scanblock
      17   1.3%  69.9%       18   1.4% crypto/elliptic.p256ReduceDegree
      16   1.2%  71.1%       16   1.2% runtime.memclr
      15   1.2%  72.3%      107   8.4% crypto/aes.encryptBlock
      15   1.2%  73.5%       15   1.2% crypto/cipher.fastXORBytes
      15   1.2%  74.6%       15   1.2% math/big.divWW
      15   1.2%  75.8%       15   1.2% runtime.xchg
      11   0.9%  76.7%       11   0.9% ExternalCode
       9   0.7%  77.4%        9   0.7% flushptrbuf
       9   0.7%  78.1%        9   0.7% math/big.addMulVVW
davecheney commented 9 years ago

Hi Paul, I wouldn't trust that before profile, it looks like it wasn't matched with the binary that produced it.

vanbroup commented 9 years ago

Sorry, I may have created the previous graph with profile before and the binary after the update. Maybe this graph from a small sample from before the crypto/ssh upgrade is useful then. I'm not sure if it's possible to simply rollback the upgrade so I can create the before graph again?

before-update

davecheney commented 9 years ago

Paul, #18 has been merged, please tell me if this has improved the situation at all.

fd0 commented 9 years ago

I think due to the serialisation of packets the maximum throughput is bound by the latency to the server. Removing this limitation I was able to speed up the throughput for writing a single file by a factor of 3.5 from about 9MiB/s to 35MiB/s for a 500MiB test file. I'll probably cleanup my implementation and create a pull request the next days.

vanbroup commented 9 years ago

Dave, #18 doesn't make a lot (if any) difference.

Alexander, sounds good and that would definitely make sense.

fd0 commented 9 years ago

Hey Paul, have a look at my branch increase-throughput and #24 and please report if this increases the throughput for you. :)

Dave, this is a rough first version, I also added benchmarks for Read() and Write(). These seem to look worse than before, but the real world scenario I tested (with more than zero latency) speaks for itself. Please have a look.

davecheney commented 8 years ago

I believe this was addressed in #41