pkg/sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License

File upload hangs (2) #519

Open andrewbaxter opened 2 years ago

andrewbaxter commented 2 years ago

I'm not sure if this is the same as the other "file upload hangs" issue.

On 1.13.5, large uploads randomly just stop partway through. This is with concurrent writes off and a 23 MB file. I tried uploading the same file multiple times and it stopped at various points (12 MB, 192 KB); 5 minutes later no further bytes had been appended to the file on the target host. Of two attempts under delve, one succeeded and one got stuck at 12615680 bytes.

I tried with concurrent writes on and there was no issue (tried once).

This is the first time I've seen the issue. We do smaller file uploads fairly regularly (config files/small text files, probably <1kb). It's possible we've never done large uploads.

The code is almost identical to https://github.com/pkg/sftp/issues/502, but we have the whole source file in memory (using bytes.NewReader(data)).
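
For context, a minimal sketch of roughly what our upload path looks like (the function and variable names here are simplified placeholders, not the exact code):

```go
// Illustrative sketch only; not the actual application code.
package example

import (
	"bytes"
	"io"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

func upload(sshConn *ssh.Client, data []byte, remotePath string) error {
	// Concurrent writes were disabled when the hang was first observed.
	client, err := sftp.NewClient(sshConn, sftp.UseConcurrentWrites(false))
	if err != nil {
		return err
	}
	defer client.Close()

	dst, err := client.Create(remotePath)
	if err != nil {
		return err
	}
	defer dst.Close()

	// The whole source file is already in memory.
	_, err = io.Copy(dst, bytes.NewReader(data))
	return err
}
```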

I found the goroutine stuck at https://github.com/pkg/sftp/blob/v1.13.5/conn.go#L143 (admittedly it doesn't help much due to the async nature of the sftp code).

The destination server is also using this library (possibly a slightly older version).

andrewbaxter commented 2 years ago

Sorry, just encountered it with concurrent writes too (21921792 bytes in)

puellanivis commented 2 years ago

Weird. The line of code you’re linking to means that it sent out a request but never received a response to that request. 🤔 It’s possible that the other end is misconfigured, and/or not issuing responses? Hard to say; short of setting up a man-in-the-middle capture, it’s going to be hard to get enough data together to figure out what’s causing the issue.

It could also be a dropped network connection without any redial/correction.

andrewbaxter commented 2 years ago

I think you're probably right. I'll try doing low-level logging on both sides (roughly the sketch below); this may be a general ssh issue that happens to show up in sftp due to the volume of data/duration of file handle usage.

I suppose it's not possible to time out and retry either, because of the multiplexing over a single TCP connection, if this is indeed a missing response packet?
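
Roughly the kind of connection-level logging I have in mind: wrapping the raw net.Conn before handing it to golang.org/x/crypto/ssh, so we can see whether bytes are still flowing in either direction. This only logs the encrypted SSH stream, and loggingConn plus the helper are made up for illustration:

```go
// Hypothetical sketch: log every read/write on the underlying TCP connection.
package example

import (
	"log"
	"net"
	"time"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

// loggingConn is an illustrative net.Conn wrapper; not part of pkg/sftp.
type loggingConn struct {
	net.Conn
	name string
}

func (c *loggingConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	log.Printf("%s: read %d bytes, err=%v", c.name, n, err)
	return n, err
}

func (c *loggingConn) Write(p []byte) (int, error) {
	n, err := c.Conn.Write(p)
	log.Printf("%s: wrote %d bytes, err=%v", c.name, n, err)
	return n, err
}

func newLoggedSFTPClient(addr string, config *ssh.ClientConfig) (*sftp.Client, error) {
	raw, err := net.DialTimeout("tcp", addr, 10*time.Second)
	if err != nil {
		return nil, err
	}
	conn := &loggingConn{Conn: raw, name: addr}

	// Hand the wrapped connection to the ssh package instead of dialing directly.
	sshConn, chans, reqs, err := ssh.NewClientConn(conn, addr, config)
	if err != nil {
		return nil, err
	}
	return sftp.NewClient(ssh.NewClient(sshConn, chans, reqs))
}
```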

puellanivis commented 2 years ago

Retrying would be quite difficult, since by then we’ve already lost the connection without being aware of it. You should be able to check the length of the remote file yourself, then Seek() to that offset and continue from there.
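
Something along these lines, as a rough illustration (resumeUpload and the other names are made up, and this assumes the local data is still in memory and the server behaves sanely for Stat and Seek):

```go
// Hypothetical resume helper: stat the remote file, skip what was already
// written, and copy the rest. Illustrative only.
package example

import (
	"bytes"
	"io"
	"os"

	"github.com/pkg/sftp"
)

func resumeUpload(client *sftp.Client, data []byte, remotePath string) error {
	info, err := client.Stat(remotePath)
	if err != nil && !os.IsNotExist(err) {
		return err
	}
	var written int64
	if err == nil {
		written = info.Size()
	}

	// Skip the bytes that already made it to the remote side.
	src := bytes.NewReader(data)
	if _, err := src.Seek(written, io.SeekStart); err != nil {
		return err
	}

	// Open for writing and position at the current end of the remote file.
	dst, err := client.OpenFile(remotePath, os.O_WRONLY|os.O_CREATE)
	if err != nil {
		return err
	}
	defer dst.Close()
	if _, err := dst.Seek(written, io.SeekStart); err != nil {
		return err
	}

	_, err = io.Copy(dst, src)
	return err
}
```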

andrewbaxter commented 1 year ago

Sorry, I haven't had time to look into this further and I haven't been doing work in that area lately. Feel free to close this; I do intend to investigate at some point if it comes back onto my radar.