MagicStack / asyncpg

A fast PostgreSQL Database Client Library for Python/asyncio.
Apache License 2.0

feat: allow connection with pre-configured socket #1054

Open jackwotherspoon opened 1 year ago

jackwotherspoon commented 1 year ago

I'd like to add support for creating a Postgres connection over an existing socket-like object. With this approach, startTLS may or may not be used, depending on the caller. This lines up with what other drivers offer: Java (socketFactory connection param, driver code), Go (pgx DialFunc), and other Python libraries (pymysql and pg8000). The Java and Go versions differ slightly in that they accept a creator/generator func that creates the socket, whereas here we could just pass in the socket directly, but the extent of the feat is the same.

The equivalent pymysql PR that introduced this change explains the feature really well https://github.com/PyMySQL/PyMySQL/pull/355

The benefit of this feature is that it allows the user to specify their own secure tunnel to connect over (such as ssh).

Is this sort of feat possible for asyncpg?

elprans commented 1 year ago

Sure. A socket factory callback to connect() would likely be the cleanest and most straightforward approach, though there are likely asyncio-imposed requirements on what the returned socket can be (at the very least it should support non-blocking I/O and be compatible with epoll).
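To make the asyncio-imposed requirements above concrete, here is a minimal sketch using only the standard library. It assumes CPython's current asyncio behavior: loop.create_connection() switches a caller-supplied socket to non-blocking mode and rejects anything that is not a stream socket. The function name check_socket_requirements is made up for illustration and is not part of asyncpg.

```python
import asyncio
import socket


async def check_socket_requirements():
    """Probe what loop.create_connection() does with a caller-supplied socket."""
    loop = asyncio.get_running_loop()

    # A connected stream socket pair stands in for a user-supplied socket.
    ours, theirs = socket.socketpair()
    was_blocking = ours.getblocking()
    transport, _ = await loop.create_connection(asyncio.Protocol, sock=ours)
    now_blocking = ours.getblocking()
    transport.close()
    theirs.close()
    await asyncio.sleep(0)  # give the loop one pass to finish closing the transport

    # A datagram socket is not a stream socket, so asyncio refuses it.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        await loop.create_connection(asyncio.Protocol, sock=udp)
        rejected = False
    except ValueError:
        rejected = True
    finally:
        udp.close()

    return was_blocking, now_blocking, rejected
```

A socket factory would therefore need to return a connected stream socket that tolerates being put into non-blocking mode; anything else is likely to fail before the Postgres protocol even starts.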

jackwotherspoon commented 1 year ago

@elprans thanks for the quick response!

Any ideas on where I should look to get started on this? Any tips or further suggestions would be greatly appreciated 😄

elprans commented 1 year ago

Sure, you need to pass the callback all the way to __connect_addr and then pass the result of calling it to loop.create_connection() as the sock argument.
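The hand-off described above can be tried without a database. The sketch below is an illustration under stated assumptions, not asyncpg code: socket_callback is a hypothetical parameter name, CollectingProtocol stands in for asyncpg's protocol factory, and a tiny local echo server stands in for Postgres. The only point it demonstrates is that a socket produced by a callback can be passed to loop.create_connection() through the sock argument.

```python
import asyncio
import socket


class CollectingProtocol(asyncio.Protocol):
    """Stand-in for asyncpg's protocol: records whatever the peer sends back."""

    def __init__(self) -> None:
        self.received = bytearray()
        self.got_data = asyncio.Event()

    def data_received(self, data: bytes) -> None:
        self.received.extend(data)
        self.got_data.set()


async def _echo(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Tiny local echo server so the sketch runs without a database.
    writer.write(await reader.read(1024))
    await writer.drain()
    writer.close()


async def connect_over_socket(socket_callback):
    """Await the (hypothetical) callback, then hand its socket to asyncio."""
    loop = asyncio.get_running_loop()
    sock = await socket_callback()
    # When sock= is given, host and port must be omitted.
    return await loop.create_connection(CollectingProtocol, sock=sock)


async def demo() -> bytes:
    server = await asyncio.start_server(_echo, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    async def socket_callback() -> socket.socket:
        # The blocking connect runs in a worker thread to keep the loop free.
        return await asyncio.to_thread(socket.create_connection, (host, port))

    try:
        transport, protocol = await connect_over_socket(socket_callback)
        transport.write(b"ping")
        await asyncio.wait_for(protocol.got_data.wait(), timeout=5)
        transport.close()
    finally:
        server.close()
    return bytes(protocol.received)
```

Running asyncio.run(demo()) sends one message through the pre-made socket and returns what the echo server sent back.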

jackwotherspoon commented 9 months ago

Hi @elprans just wondering if I could pick your brain as I've started attempting to implement this feature.

So in __connect_addr if I understand correctly it would look like this:

elif params.socket_callback:
    # if socket factory callback is given, create socket and use
    # for connection
    sock = await params.socket_callback()
    connector = loop.create_connection(proto_factory, sock=sock)

I'm trying to see if that would work for the following callback...

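import asyncio
import socket

import asyncpg

# SERVER_PROXY_PORT, user, db, passwd, and kwargs are assumed to be defined elsewhere.

def sock_func(host: str) -> socket.socket:
    # Blocking connect to the proxy that sits in front of the database.
    return socket.create_connection((host, SERVER_PROXY_PORT))

async def main():
    host = "X.X.X.X"
    async def async_sock_func():
        # Run the blocking connect in a worker thread so the event loop stays free.
        return await asyncio.to_thread(sock_func, host)

    return await asyncpg.connect(
        user=user,
        database=db,
        password=passwd,
        socket_callback=async_sock_func,  # the parameter proposed in this issue
        **kwargs,
    )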

Let me know your thoughts, looking forward to hearing them 😄

jackwotherspoon commented 9 months ago

@elprans I linked a WIP PR for our use-case that seems to be working with my PR branch build 😄

enocom commented 9 months ago

How about exposing a connector factory, such that callers would have more control over how the socket was created?

For example, in __connect_addr, we'd add:

elif params.connector_factory:
    connector = params.connector_factory(proto_factory, *addr, loop=loop, ssl=params.ssl)

This would allow full customization of the socket and how it is created (e.g., creating an SSH tunnel, doing reads and writes on the socket before the Postgres protocol starts when a proxy sits in front of the database, etc.).
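As a rough illustration of the connector factory idea, the sketch below follows the call shape from the snippet above (proto_factory, host, port, loop, ssl) and performs a small exchange with a proxy before handing the socket to asyncio. Everything here is an assumption made for the example: the name preamble_connector, the HELLO/OK handshake, and the local fake proxy are invented, and Collector stands in for asyncpg's protocol factory.

```python
import asyncio
import socket


class Collector(asyncio.Protocol):
    """Stand-in for asyncpg's protocol factory; stores received bytes."""

    def __init__(self) -> None:
        self.received = bytearray()
        self.got_data = asyncio.Event()

    def data_received(self, data: bytes) -> None:
        self.received.extend(data)
        self.got_data.set()


async def preamble_connector(proto_factory, host, port, *, loop, ssl=None):
    """Hypothetical connector factory: talk to a proxy first, then hand off."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        await loop.sock_connect(sock, (host, port))
        # Made-up handshake; a real proxy defines its own protocol.
        await loop.sock_sendall(sock, b"HELLO\n")
        reply = await loop.sock_recv(sock, 3)
        if reply != b"OK\n":
            raise ConnectionError(f"proxy refused: {reply!r}")
    except BaseException:
        sock.close()
        raise
    # server_hostname is only valid together with ssl, and required with sock=.
    return await loop.create_connection(
        proto_factory, sock=sock, ssl=ssl,
        server_hostname=host if ssl else None,
    )


async def _fake_proxy(reader, writer):
    # Accept the made-up handshake, then echo a single message.
    if await reader.readline() == b"HELLO\n":
        writer.write(b"OK\n")
        writer.write(await reader.read(1024))
        await writer.drain()
    writer.close()


async def demo() -> bytes:
    loop = asyncio.get_running_loop()
    server = await asyncio.start_server(_fake_proxy, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    try:
        transport, protocol = await preamble_connector(
            Collector, host, port, loop=loop
        )
        transport.write(b"ping")
        await asyncio.wait_for(protocol.got_data.wait(), timeout=5)
        transport.close()
    finally:
        server.close()
    return bytes(protocol.received)
```

Compared with a plain socket callback, this shape lets the caller decide how the transport itself is created, at the cost of exposing more of the connection internals to user code.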