ParallelSSH / ssh2-python

Python bindings for libssh2 C library.
https://parallel-ssh.org
GNU Lesser General Public License v2.1

ssh2.exceptions.SocketRecvError when calling SFTP.write with extremely large data sizes (>1GB) #99

Closed PhenomPBG closed 3 years ago

PhenomPBG commented 4 years ago

Steps to reproduce:

  1. Example code that produces error.

import socket
import ssh2
import ssh2.exceptions
from ssh2.session import Session
from ssh2.utils import wait_socket
from ssh2.error_codes import LIBSSH2_ERROR_EAGAIN
from ssh2.sftp import LIBSSH2_FXF_READ, LIBSSH2_SFTP_S_IFDIR
from ssh2.sftp import LIBSSH2_FXF_CREAT, LIBSSH2_FXF_WRITE, LIBSSH2_SFTP_S_IRUSR, LIBSSH2_SFTP_S_IRGRP, \
    LIBSSH2_SFTP_S_IWUSR, LIBSSH2_SFTP_S_IROTH, LIBSSH2_SFTP_S_IRWXU, LIBSSH2_FXF_TRUNC

host = "localhost"
port = 22
username = "test"
password = "test123"
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((host, port))
session = Session()
session.handshake(sock)
session.userauth_password(username, password)
sftp = session.sftp_init()

mode = LIBSSH2_SFTP_S_IRUSR | LIBSSH2_SFTP_S_IWUSR | LIBSSH2_SFTP_S_IRGRP | LIBSSH2_SFTP_S_IROTH
f_flags = LIBSSH2_FXF_CREAT | LIBSSH2_FXF_WRITE | LIBSSH2_FXF_TRUNC

file_data = bytes(4522932365) # ~4.3GB of data

with sftp.open("test", f_flags, mode) as remote_fh:
    remote_fh.write(file_data)


  2. Stack trace or error messages.

File "/opt/python3/lib/python3.7/site-packages/protocol_kit/sftp.py", line 313, in write
    with self.__sftp.open(path, f_flags, mode) as remote_fh:
File "ssh2/sftp.pyx", line 230, in ssh2.sftp.SFTP.open
File "ssh2/utils.pyx", line 157, in ssh2.utils.handle_error_codes
ssh2.exceptions.SFTPProtocolError

During handling of the above exception, another exception occurred:

File "/opt/python3/lib/python3.7/site-packages/protocol_kit/sftp.py", line 314, in write
    remote_fh.write(file_data)
File "ssh2/sftp_handle.pyx", line 291, in ssh2.sftp_handle.SFTPHandle.write
File "ssh2/utils.pyx", line 179, in ssh2.utils.handle_error_codes
ssh2.exceptions.SocketRecvError



__Expected behaviour:__ Write file to SFTP and return the tuple with error_code and number of bytes written.

__Actual behaviour:__ The entire file is written to the remote SFTP with all data, but remote_fh.write() never returns. One CPU core hits 100% usage, and after several minutes the SocketRecvError is raised.
This does not happen with data sets < 4GB.

__Additional info:__
CentOS Linux release 7.6.1810 (Core), 3.10.0-957.10.1.el7.x86_64
Python 3.7.1
ssh2-python 0.17.0

pkittenis commented 4 years ago

Can you upgrade to 0.18.0 and check? It is probably worth retesting once 0.19.0 is released as well. This is not likely to be an issue with this library: the bindings just call the underlying libssh2 C function, and 0.17.0 is an old version.

pkittenis commented 4 years ago

Confirmed as bug.

pkittenis commented 4 years ago

The library code seems to hang while getting bytes_written back from libssh2 in small chunks (1024k or so) after the file has been fully written.

This results in a hot loop while bytes_written catches up to total_size, even though the file is already fully written. As long as the return code from libssh2 is > 0 and bytes_written has not yet reached total_size, the loop does not exit. Will aim to reproduce in C code as this may be a bug in libssh2.
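To make the loop shape described above concrete, here is a minimal sketch. It is not the ssh2-python or libssh2 code: `FakeHandle` and `write_all` are hypothetical names, and the fake handle only mimics a call that reports progress in small increments. It shows how the number of iterations grows with `total_size` when each call reports only a small byte count.

```python
class FakeHandle:
    """Hypothetical stand-in that reports progress in small increments."""

    def __init__(self, increment):
        self.increment = increment
        self.calls = 0

    def write(self, remaining):
        # Report at most `increment` bytes per call, like a chunked writer.
        self.calls += 1
        return min(self.increment, remaining)


def write_all(handle, total_size):
    """Loop until the reported byte count catches up to total_size."""
    bytes_written = 0
    while bytes_written < total_size:
        rc = handle.write(total_size - bytes_written)
        if rc < 0:
            raise OSError(f"write failed with code {rc}")
        bytes_written += rc
    return bytes_written


handle = FakeHandle(increment=1024)
written = write_all(handle, total_size=10 * 1024 * 1024)
print(written, handle.calls)  # 10485760 10240
```

With a 10 MiB total and 1 KiB reported per call, the loop already runs 10240 times; the iteration count scales linearly with the size of the single write.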

PhenomPBG commented 4 years ago

Thanks pkittenis, this is the version of libssh2 that I have installed:

Installed Packages
Name : libssh2
Arch : x86_64
Version : 1.4.3
Release : 12.el7_6.2

Will upgrade libssh2 and test again.

PhenomPBG commented 4 years ago

Precisely the same behaviour with libssh2 1.8.0 and ssh2-python 0.18.0.

Will test with libssh2 1.9.0 as well.

hakaishi commented 3 years ago

Is this issue still valid? I just uploaded an archive file > 5 GB without any issues.

OS: Ubuntu 20.04.1
libssh2: 1.8.0-2.1build1
Python: 3.8.5
ssh2-python: 0.23.0

method:

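from ssh2.sftp import LIBSSH2_FXF_CREAT, LIBSSH2_FXF_WRITE, \
    LIBSSH2_SFTP_S_IRUSR, LIBSSH2_SFTP_S_IWUSR, \
    LIBSSH2_SFTP_S_IRGRP, LIBSSH2_SFTP_S_IROTH

# Remote file permissions: owner read/write, group and others read.
mode = LIBSSH2_SFTP_S_IRUSR | \
    LIBSSH2_SFTP_S_IWUSR | \
    LIBSSH2_SFTP_S_IRGRP | \
    LIBSSH2_SFTP_S_IROTH
f_flags = LIBSSH2_FXF_CREAT | LIBSSH2_FXF_WRITE

# `connection` is created elsewhere (presumably via session.sftp_init()).
with open("local_file", "rb") as lofi:
    with connection.open("remote_file", f_flags, mode) as remfi:
        while True:
            # Read and write 1 MiB at a time instead of the whole file.
            data = lofi.read(1024 * 1024)
            if not data:
                break
            _, sz = remfi.write(data)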

[...]

pkittenis commented 3 years ago

Not sure, have seen something similar with libssh2 1.9 but not tested extensively.

PhenomPBG commented 3 years ago

In my testing this happens with really large single writes (> 4GB) to the remote file; smaller writes do not seem to behave this way.

hakaishi commented 3 years ago

Ah! Writing one chunk of data with the size of 4GB and more in one go? My test case was a single 5GB file read and written in smaller portions of data.

Might be some limitation of SSH2.

But hey, who would keep a file that big in memory? That's a design problem from my point of view...

hakaishi commented 3 years ago

Anyway, I'll also try and take a look at the code.

PhenomPBG commented 3 years ago

For file transfers it would be rather silly, but not all cases are that simple :)

Currently working around it by wrapping the internal buffer and treating it as a file. The behaviour is just peculiar.
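A rough sketch of what such a workaround might look like, not the actual code in use here. `write_buffered` and `RecordingHandle` are hypothetical names, and the sketch assumes the handle's `write(chunk)` returns a `(return_code, bytes_written)` tuple as described earlier in this thread.

```python
import io


def write_buffered(remote_fh, data, chunk_size=1024 * 1024):
    """Write `data` through `remote_fh.write` in pieces of `chunk_size` bytes.

    Assumes `remote_fh.write(chunk)` returns (return_code, bytes_written).
    """
    total = 0
    stream = io.BytesIO(data)  # treat the in-memory buffer as a file
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        _, written = remote_fh.write(chunk)
        total += written
    return total


class RecordingHandle:
    """Hypothetical stand-in for an SFTP file handle, for local testing."""

    def __init__(self):
        self.sizes = []

    def write(self, chunk):
        # Record the chunk size and report it as fully written.
        self.sizes.append(len(chunk))
        return 0, len(chunk)


fake = RecordingHandle()
total = write_buffered(fake, bytes(2_500_000), chunk_size=1_000_000)
print(total, fake.sizes)  # 2500000 [1000000, 1000000, 500000]
```

With a real connection, `remote_fh` would be the handle returned by `sftp.open(...)`, and each call to `write` then only has to account for one chunk rather than the whole buffer.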

pkittenis commented 3 years ago

From what I saw, libssh2 buffers large writes internally in chunks and returns many small bytes-written counts for an already completed large write.

For a 4GB write, the loop collecting those small counts until they add up to 4GB runs for a very long time. The library should ideally handle the bytes written returned by libssh2 in C code, so that it does not result in a hot loop in Python code.

pkittenis commented 3 years ago

Library is already handling all bytes written by libssh2 code in C - without the GIL no less.

libssh2 seems to write in a max of 30MB chunks (MAX_SFTP_OUTGOING_SIZE in libssh2).

Given a 4GB buffer, it takes some time to call libssh2_sftp_write 1500 times and for write to finally return. The file is not guaranteed to be written until all writes have been acknowledged by the server, even if the file size is correct.

See libssh2 sftp write documentation.

Not much more the library can do until this chunking behaviour can be disabled in libssh2.

Buffer your writes.