ParallelSSH / ssh2-python

Python bindings for libssh2 C library.
https://parallel-ssh.org
GNU Lesser General Public License v2.1

Make scp_recv example #100

Closed hakaishi closed 3 years ago

hakaishi commented 4 years ago

About my environment

I'm using Ubuntu 20.04 with python3.8.

The problem

When I try to receive data with scp_recv or scp_recv2 via sftp.session, extra bytes are appended to every file.

My code

I am not sure whether this is the correct way to retrieve the data. What I am trying to do is the following:

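from os.path import join
from socket import socket, AF_INET, SOCK_STREAM

from ssh2.session import Session

# "host", port, "user" and "password" are placeholders
sock = socket(AF_INET, SOCK_STREAM)
sock.connect(("host", port))
sock.settimeout(3)
cli = Session()
cli.handshake(sock)
cli.userauth_password("user", "password")
sftp = cli.sftp_init()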

# some other code here ...

res = sftp.session.scp_recv2(some_file)
with open(join(folder, some_file), "wb+") as f:
    size = 0
    while True:
        s, buf = res[0].read()

        print(s, buf)

        if s < 0:
            res[0].close()
            break
        size += s
        f.write(buf)

        print(size, res[1].st_size)

        if size >= res[1].st_size:
            res[0].close()
            break

The result

The first print shows that the last read contains the following additional byte b'\x00'. Like this:

133 b'[gd_resource type="SpatialMaterial" format=2]\n\n[resource]\n\nalbedo_color = Color( 0.847059, 0.113725, 0.113725, 1 )\nmetallic = 0.7\n\n\x00\x00'

The second print shows, for example, 133 132. The received file is exactly one byte bigger than the original, which makes all downloaded (non-text) files invalid for me.

How would you do it correctly? Or is this perhaps a bug?

pkittenis commented 4 years ago

Client code needs to check st_size from the file info returned by scp_recv and read no more than that many bytes. This is not a bug in this library.

hakaishi commented 4 years ago

I don't really understand why there is additional data, but that would be another issue, maybe. So, in other words, I would need to change it like this:

    size = 0
    while True:
        s, buf = res[0].read()

        if s < 0:
            print("error code:", s)
            res[0].close()
            break
        size += s

        if size > res[1].st_size:
            f.write(buf[ : (res[1].st_size - size)])
        else:
            f.write(buf)

        if size >= res[1].st_size:
            res[0].close()
            break

I don't know why, but this looks quite bulky for a simple copy operation... For comparison:

with open("fileA", "rb") as fi:
    with open("fileB", "wb+") as fo:
        fo.write(fi.read())

By the way: Without closing the channel, this loop will hang forever...
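The slice in the snippet above relies on a negative index: when the running total overshoots st_size by one byte, `res[1].st_size - size` is -1 and `buf[:-1]` drops that last byte. A minimal sketch of that arithmetic as a standalone helper follows; the name `trim_overshoot` and its parameters are hypothetical, chosen only for illustration.

```python
def trim_overshoot(buf, size_after_read, expected_size):
    """Return buf without any bytes read past expected_size.

    size_after_read is the running total *including* buf.
    (Hypothetical helper, not part of ssh2-python.)
    """
    if size_after_read > expected_size:
        # The difference is negative here, so the slice cuts from the end.
        return buf[:expected_size - size_after_read]
    # No overshoot: slicing with a difference of 0 would wrongly empty buf,
    # which is why the comparison above is strictly greater-than.
    return buf


# Numbers from the thread: 133 bytes read for a 132-byte file.
last_read = b"a" * 132 + b"\x00"
trimmed = trim_overshoot(last_read, 133, 132)
print(len(trimmed))
```

With the names used in the thread, the write would become `f.write(trim_overshoot(buf, size, res[1].st_size))`.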

pkittenis commented 4 years ago

From the libssh2 C example code:

    while(got < fileinfo.st_size) {
        char mem[1024];
        int amount = sizeof(mem);

        if((fileinfo.st_size -got) < amount) {
            amount = (int)(fileinfo.st_size -got);
        }

        rc = libssh2_channel_read(channel, mem, amount);

        if(rc > 0) {
            write(1, mem, rc);
        }
        else if(rc < 0) {
            fprintf(stderr, "libssh2_channel_read() failed: %d\n", rc);

            break;
        }
        got += rc;
    }

This library provides bindings for the libssh2 C API. It does not change the API in any way. There are higher-level clients, like the one in parallel-ssh, that provide an easy-to-use API based on this library.

client.scp_recv(<source file>, <dest file>)

This ticket is only for providing an example for SCP. Can't help with fixing code snippets.
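For readers following along in Python, the C loop quoted above translates roughly as follows. This is a sketch, not the library's own example: `scp_read_exact` and `FakeChannel` are hypothetical names, and the only assumption about the channel is the one visible in the thread, namely that `read(size)` returns a `(return_code, bytes)` tuple. Treating a zero-length read as a premature end of data is also an assumption of this sketch.

```python
import io


def scp_read_exact(channel, total_size, out, chunk_size=1024):
    """Copy exactly total_size bytes from channel to out; return the count."""
    got = 0
    while got < total_size:
        # Never request more than the bytes still owed, as in the C example.
        amount = min(chunk_size, total_size - got)
        rc, buf = channel.read(amount)
        if rc < 0:
            raise IOError(f"channel read failed with code {rc}")
        if rc == 0:
            # Assumption: no data means the stream ended early.
            raise EOFError(f"got {got} of {total_size} bytes")
        out.write(buf)
        got += rc
    return got


class FakeChannel:
    """In-memory stand-in for an SCP channel, used only for this demo."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def read(self, size=1024):
        buf = self._stream.read(size)
        return len(buf), buf


# Simulate a 132-byte file followed by one trailing NUL byte on the channel.
payload = b"x" * 132
sink = io.BytesIO()
copied = scp_read_exact(FakeChannel(payload + b"\x00"), len(payload), sink)
print(copied, len(sink.getvalue()))
```

With the names from the original snippet, the call would look like `scp_read_exact(res[0], res[1].st_size, f)`, followed by closing the channel.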

hakaishi commented 4 years ago

Well, I do understand the circumstances. In the end it's just the way the library works.

I don't want to use anything that depends on paramiko, since they will have to make serious changes in order to accept encodings other than UTF-8. I made a pull request with most of the necessary changes, but it still depends on what the authors intend to do. I don't see it being merged in the near future, so I will have to use something else. Like ssh2-python. :)

pkittenis commented 4 years ago

Completely understand. My motivation for writing bindings for libssh2, and now libssh, comes from poor experience with existing Python SSH libraries. parallel-ssh is scheduled to drop paramiko support entirely in 2.0.0; it is no longer the default client and is only pulled in for backwards compatibility.

From experience, paramiko specifically has far too many bugs and bad design choices to be used in production code.

I would point to the SSHClient in parallel-ssh for an easy-to-use client based on this library. If you'd rather not have paramiko pulled in at all, which is understandable, hang on until the 2.0.0 release.

hakaishi commented 4 years ago

Wow! That's great news! I wouldn't mind paramiko being a dependency only. However, I'm very curious about version 2.0.0. I won't ask for a date, but I'm curious whether you think you'll be able to make that release this year. :D

pkittenis commented 4 years ago

It is a dependency only at the moment. Yes, it should be.

pkittenis commented 3 years ago

There are plenty of examples for reading and writing with this library.

For an easy-to-use API, see scp_recv in parallel-ssh.