pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.52k stars 380 forks source link

Stop ReadFromWithConcurrency sending more data than it needs to #537

Closed ncw closed 1 year ago

ncw commented 1 year ago

It was discovered that the ReadFrom method for uploading files in pkg/sftp was sending more data than it needed to.

This was tracked down to the ReadFromWithConcurrency method forgetting to truncate the packets it was sending to the size Read.

This was giving the remote server more work to do as it was writing and re-writing parts of a file.

See: https://github.com/rclone/rclone/issues/6763

drakkan commented 1 year ago

Thanks!

puellanivis commented 1 year ago

Huh… good catch!

ncw commented 1 year ago

Thanks for merging :-)

On a philosophical point - the current implementation directly couples the block size read from the io.Reader to the block size sent over the SFTP packet.

io.Readers are quite entitled not to fill the buffer if they don't have data at that moment (common with network based io.Readers).

From https://pkg.go.dev/io#Reader :

If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

Since SFTP isn't latency sensitive, I'd suggest reading from the io.Reader until the outgoing packet is full. This will decrease the number of packets sent and increase the throughput slightly.

What do you think? Happy to send a PR if you like the idea.

puellanivis commented 1 year ago

Yeah, I was thinking of this as an alternative to the code here, basically, use io.ReadFull instead. I figure the change wouldn’t be so big at all, so, if you want to spin up a PR, and at least get some benchmark information about if it’s better or not…