ronf / asyncssh

AsyncSSH is a Python package which provides an asynchronous client and server implementation of the SSHv2 protocol on top of the Python asyncio framework.
Eclipse Public License 2.0

More beginner tips for SFTP server docs #709

Open Andrew-Chen-Wang opened 2 weeks ago

Andrew-Chen-Wang commented 2 weeks ago

I was pretty confused by the parameter names of the SSH server, so I thought I'd leave some pointers for anyone else having trouble.

To set up an AsyncSSH SFTP server, you can follow these steps:

Generate SSH Keys:

Generate a host key for the server:

ssh-keygen -t rsa -b 2048 -f ssh_host_key

Generate a user key:

ssh-keygen -t rsa -b 2048 -f id_rsa

Write the Server Code:

Use the following example to set up a simple SFTP server:

import asyncio
import asyncssh
import sys

async def start_server() -> None:
    await asyncssh.listen('', 8022, server_host_keys=['path/to/ssh_host_key'],
                          authorized_client_keys='path/to/id_rsa.pub',
                          sftp_factory=True)

loop = asyncio.new_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))

loop.run_forever()

I needed to run this in pytest so:

import os
import tempfile
from pathlib import Path

import asyncssh
import pytest

class TestBasedSFTPServer(asyncssh.SFTPServer):
    # https://asyncssh.readthedocs.io/en/latest/#sftp-server
    def __init__(self, chan: asyncssh.SSHServerChannel):
        # Give each user their own root directory under the system temp dir
        root = os.path.join(tempfile.gettempdir(), "sftp", chan.get_extra_info('username'))
        os.makedirs(root, exist_ok=True)
        super().__init__(chan, chroot=root.encode())

@pytest.fixture(autouse=True)
async def server():
    async with asyncssh.listen(
        host="localhost",
        port=8022,
        reuse_port=True,
        password_auth=True,
        public_key_auth=True,
        server_host_keys=[Path(__file__).parent / "ssh_host_key"],
        authorized_client_keys=[Path(__file__).parent / "id_rsa.pub"],
        sftp_factory=TestBasedSFTPServer
    ) as server:
        yield server

You may come across the error "Host key is not trusted for host localhost".

To fix it, add the server's host key to your known hosts file by running ssh-keyscan -p 8022 localhost >> ~/.ssh/known_hosts.

A workaround is to pass known_hosts=None to your SFTP client, but that disables host key validation and is a security risk.
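
For tests, one way to avoid touching ~/.ssh/known_hosts is to write a throwaway known hosts file and hand its path to the client. This is a minimal sketch, assuming the server's public key is available as OpenSSH-format text; the helper names known_hosts_line and write_known_hosts are hypothetical and not part of AsyncSSH:

```python
from pathlib import Path


def known_hosts_line(host: str, port: int, public_key: str) -> str:
    """Format one known_hosts entry the way ssh-keyscan does."""
    # OpenSSH writes non-default ports as "[host]:port"; port 22 uses the bare host.
    pattern = host if port == 22 else f"[{host}]:{port}"
    # Keep only "<key-type> <base64-data>" and drop any trailing comment.
    key_type, key_data = public_key.split()[:2]
    return f"{pattern} {key_type} {key_data}\n"


def write_known_hosts(path: Path, host: str, port: int, public_key: str) -> Path:
    """Write a single-entry known_hosts file and return its path."""
    path.write_text(known_hosts_line(host, port, public_key))
    return path
```

The returned path can then be passed to the client as known_hosts=str(path), since known_hosts accepts the name of a file.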


I'm actually struggling to find a workaround in my unit testing since I don't want to explicitly set ~/.ssh/known_hosts. I have a client that looks like:

    async with asyncssh.connect(
        host=auth.host,
        port=auth.port,
        # https://stackoverflow.com/questions/67222941/can-not-connect-via-asyncssh-error-host-key-is-not-trusted
        # FAQ: in pytest/devs' environments, we can't add the SSH key everywhere. For production,
        # it is expected that the generated key is trusted in ~/.ssh/known_hosts file.
        # TODO Add known_hosts public key
        #  with open('ssh_host_key.pub', 'rb') as f:
        #      host_key = asyncssh.import_public_key(f.read())
        known_hosts=None if "PYTEST_CURRENT_TEST" in os.environ else (),
        options=SSHClientConnectionOptions(
            username=auth.username,
            password=auth.password,
            client_keys=client_keys,
        ),

I tried adding

    with open('ssh_host_key.pub', 'rb') as f:
        host_key = asyncssh.import_public_key(f.read())

connect(
...
known_hosts=host_key
...

parameter to the client, but it failed with TypeError: 'RSAKey' object is not subscriptable. When specifying known_hosts=[host_key] instead, I get TypeError: 'RSAKey' object is not iterable.

Any hints as to what I should do?

Menyadar commented 2 weeks ago

known_hosts needs to be a list; see its description in the docs: https://asyncssh.readthedocs.io/en/stable/api.html#asyncssh.SSHClientConnectionOptions

In the Specifying known hosts section you can read that:

known_hosts [...] can be the name of a file or list of files containing known hosts, a byte string containing data in known hosts format, or an SSHKnownHosts object which was previously imported from a string by calling import_known_hosts() or read from files by calling read_known_hosts().

Personally, I'd go for an SSHKnownHosts object, so this should work:

known_hosts = asyncssh.SSHKnownHosts(host_key)

...

connect(
    ...
    known_hosts=known_hosts,
    ....
)

Andrew-Chen-Wang commented 2 weeks ago

Thanks for the response. I assume host_key here is the content of the public key (.pub) file, where the matching private key is the one the server loads via server_host_keys. Even with known_hosts=SSHKnownHosts(public_key), I still get Host key is not trusted for host localhost (raised from host_key = client_conn.validate_server_host_key(host_key_data)).

ronf commented 2 weeks ago

You need to keep in mind that the known_hosts format is more than just the contents of a public key. It needs to be preceded by a host pattern, and that can optionally be preceded by a directive like @cert-authority or @revoked (but that wouldn't be needed here).

At a minimum, if you wanted to try a specific public key for all hosts (which could be ok in something like a unit test), you'd need to prefix the public key data with a '* ' (a star and a space). You can call asyncssh.import_known_hosts(data) to import data once it has this '* ' prefix, and pass that to known_hosts. If you know the hostname you are connecting to, you could also pass that hostname in place of the star.

If you encode the known hosts data in binary mode (as a bytes value), you can also pass that directly into the known_hosts argument without needing to call import_known_hosts on it. You will still need the leading host pattern, though.
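
A minimal sketch of the format described above, assuming the public key is OpenSSH-format bytes (for example, the result of export_public_key()). The helper names trusted_host_entry and as_known_hosts are hypothetical; only asyncssh.import_known_hosts comes from the library:

```python
def trusted_host_entry(public_key: bytes, host_pattern: str = "*") -> bytes:
    """Prefix OpenSSH public key data with a known_hosts host pattern."""
    # A known_hosts line is "<host pattern> <key type> <base64 key data>".
    return host_pattern.encode() + b" " + public_key.strip() + b"\n"


def as_known_hosts(public_key: bytes, host_pattern: str = "*"):
    """Optionally parse the entry into an SSHKnownHosts object."""
    # Third-party import kept local so the helper above has no dependencies.
    import asyncssh

    # import_known_hosts() takes the known hosts data as a str.
    entry = trusted_host_entry(public_key, host_pattern)
    return asyncssh.import_known_hosts(entry.decode())
```

Either the bytes from trusted_host_entry(key) or the object from as_known_hosts(key) can be passed as known_hosts= to asyncssh.connect. The default '*' pattern trusts the key for every host, so it is best kept to unit tests; pass a hostname such as "localhost" to narrow it.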

ronf commented 2 weeks ago

On another note, you don't need to use ssh-keygen to generate keys. If you like, you can use asyncssh.generate_private_key to do that. There are methods on the generated key to write out either the private or public keys, in a variety of formats.
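
A minimal sketch of that approach, assuming RSA keys and the default OpenSSH output format of write_private_key and write_public_key. The helper names and the "<name>" / "<name>.pub" file layout are hypothetical choices that mirror what ssh-keygen produces:

```python
from pathlib import Path


def key_file_paths(directory, name: str = "ssh_host_key"):
    """Return (private_path, public_path), named like ssh-keygen's output."""
    base = Path(directory)
    return base / name, base / f"{name}.pub"


def write_key_pair(directory, name: str = "ssh_host_key", alg: str = "ssh-rsa"):
    """Generate a key with AsyncSSH and write both halves to disk."""
    # Third-party import kept local so key_file_paths() works without it.
    import asyncssh

    private_path, public_path = key_file_paths(directory, name)
    key = asyncssh.generate_private_key(alg)
    key.write_private_key(str(private_path))  # OpenSSH private key format by default
    key.write_public_key(str(public_path))    # single-line OpenSSH public key
    return private_path, public_path
```

Calling write_key_pair(".", "ssh_host_key") and write_key_pair(".", "id_rsa") would stand in for the two ssh-keygen commands at the top of this issue.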

Andrew-Chen-Wang commented 2 weeks ago

At a minimum, if you wanted to try a specific public key for all hosts (which could be ok in something like a unit test), you'd need to prefix the public key data with a '* ' (a star and a space). You can call asyncssh.import_known_hosts(data) to import data once it has this '* ' prefix, and pass that to known_hosts. If you know the hostname you are connecting to, you could also pass that hostname in place of the star.

Prepending '* ' to the public SSH key worked, thank you!

On another note, you don't need to use ssh-keygen to generate keys. If you like, you can use asyncssh.generate_private_key to do that. There are methods on the generated key to write out either the private or public keys, in a variety of formats.

This works great! Writing it up here for future readers:

import asyncssh

alg = "ssh-rsa"
user_private_key = asyncssh.generate_private_key(alg)
user_public_key = user_private_key.export_public_key().decode()
server_authorized_keys = asyncssh.SSHAuthorizedKeys()
server_authorized_keys.load(user_public_key)

server_private_key = asyncssh.generate_private_key(alg)
server_public_key = server_private_key.export_public_key().decode()

async with asyncssh.listen(
    ...
    public_key_auth=True,
    server_host_keys=[server_private_key],
    authorized_client_keys=server_authorized_keys,
    ...
)

private_key: str | None
passphrase: str | None
client_keys = []
if private_key:
    client_keys.append(
        asyncssh.import_private_key(private_key, passphrase=passphrase)
    )
async with asyncssh.connect(
    ...
    client_keys=client_keys,
    known_hosts=asyncssh.SSHKnownHosts(f"localhost {server_public_key}"),
    ...
)

Thanks so much for the support; this is such a thorough package 😍

ronf commented 2 weeks ago

Happy to help!

You should be able to do the following regarding known_hosts:

alg = 'ssh-rsa'
server_private_key = asyncssh.generate_private_key(alg)
server_public_key = server_private_key.export_public_key()

async with asyncssh.connect(
    ...
    known_hosts=b'localhost ' + server_public_key,
    ...
)

Your example here actually had decode() called twice, which I think would cause a problem. By not calling it at all and leaving the value as bytes, though, you can pass the resulting data directly as the known_hosts argument once you add the leading host pattern. This avoids needing to call into any AsyncSSH internals.

The other option to avoid including internals would be to call asyncssh.import_known_hosts(). In that case, the data can be a str, but the above is simpler.