psycopg / psycopg-website

Source for the psycopg.org website
https://www.psycopg.org/

articles/2021/01/17/pool-design/ #4

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Designing a connection pool for psycopg3 — Psycopg

Python adapter for PostgreSQL

https://www.psycopg.org/articles/2021/01/17/pool-design/

dvarrazzo commented 3 years ago

Some of the feedback received considers killing the program by default too aggressive a behaviour, and not the pool's role, a point which I appreciate.

In that case, instead of a terminate() hook, there could probably be a connection_failed() hook with a default empty implementation, which people who think that dying is a good way to solve the problem can override with exit(1) or whatever seems fit.
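A minimal sketch of the hook idea being discussed (the class and method names are hypothetical, not the actual psycopg3 API): the pool calls `connection_failed()` when it gives up reconnecting, the default implementation does nothing, and users who prefer fail-fast behaviour override it.

```python
class ConnectionPool:
    """Sketch of a pool exposing a connection-failure hook."""

    def connection_failed(self):
        """Called when the pool gives up reconnecting. No-op by default."""
        pass


class FailFastPool(ConnectionPool):
    """Subclass for users who consider dying the right response."""

    def connection_failed(self):
        raise SystemExit(1)  # die loudly instead of limping along
```

This keeps the aggressive behaviour opt-in: the default pool stays passive, and the policy decision lives in user code.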

geophile commented 3 years ago

I have used connection pools in other languages, and psycopg2 for Python, but I've never had any need for connection pools in Python. That said, I'm happy to see this work, and here are my thoughts.

Connection configuration: Subclassing feels wrong here. In my own applications, there is always a layer that I write between the application and the driver (or pool), because I often find myself wanting to add behavior in different ways: registering connections, pre/post-processing around calls, etc. So I would do MyAdapter.register(conn) (for example) in my own code. Starting down the road of supporting subclassing, or hooks via decorators, seems like a large extension to the interface that needs to be documented, and it's always going to be useless for some people and the wrong abstraction for others.

Exclusive connections: I agree with your conclusion, not needed.

Connection failure: The connection pool cannot insulate an application from connection problems. Even aside from catastrophic database failures, a connection can throw an exception for a variety of reasons, and the application has to be written to deal with it. A temporary failure will typically cause the current transaction to fail, and the application writer should know how to deal with this (replay the logic since the transaction began, ask user to try again, etc.)

The application must also deal with the case that a connection cannot be obtained, e.g. the database is down. Again, I don't see what the pool can do that would be valuable in masking a failure for any amount of time.

API:

rustprooflabs commented 3 years ago

Another possible class of failure under connection failures is when some connections are established but others cannot be, e.g. due to exceeding max_connections in Postgres. Assuming there are at least minconn connections but fewer than maxconn, I would expect the frontend's timeout_sec behavior to apply.

On the backend, it might be nice to have a way to tune how aggressively it tries to get to maxconn after Pg reports being at its max connections. I'm imagining two apps with connection pools fighting each other; in these situations one app could be set as high priority to fill its pool, with the other having a lower priority.

dvarrazzo commented 3 years ago

@geophile

> Connection configuration: Subclassing feels wrong here. In my own applications, there is always a layer that I write between the application and the driver (or pool), because I often find myself wanting to add behavior in different ways. Registering connections, pre/post-processing around calls, etc. So I would do MyAdapter.register(conn) (for example) in my own code. Starting down the road of supporting subclassing, or hooks via decorators, seems like a large extension to the interface that needs to be documented, and it's always going to be useless for some people, and the wrong abstraction for others.

I am aware: I try to avoid API-by-subclassing if possible. However, connection configuration is something you would do at connection creation, not on getconn(), because you would do it only once per connection lifetime. So in order to provide a configuration hook I don't see other ways except subclassing a method or registering a callback.
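The callback alternative mentioned here can be sketched in a few lines (the `Pool` class and its parameters are illustrative, not the real psycopg3 interface): the pool invokes a `configure` callable exactly once, right after each connection is created, so the setup cost is paid per connection lifetime rather than per getconn().

```python
class Pool:
    """Toy pool demonstrating a one-time-per-connection configure callback."""

    def __init__(self, connect, configure=None):
        self._connect = connect      # factory returning a new connection
        self._configure = configure  # optional one-time per-connection setup
        self._idle = []

    def _new_conn(self):
        conn = self._connect()
        if self._configure is not None:
            self._configure(conn)    # runs once per connection lifetime
        return conn

    def getconn(self):
        # Reuse an idle connection if available; configuration already done.
        return self._idle.pop() if self._idle else self._new_conn()

    def putconn(self, conn):
        self._idle.append(conn)
```

With this shape, code like geophile's `MyAdapter.register(conn)` would simply be passed as `configure=MyAdapter.register`, with no subclassing involved.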

DzigaV commented 3 years ago

@geophile

Well, it can't insulate it completely. But in the modern world of RDS/Aurora-style systems that throw connection curveballs when failing over or scaling, this is by far the most important aspect of a pool, imho.

That and allowing me to mash the database with multiple async queries when I feel the need.

asmith-cmd commented 3 years ago

The pool manager could exploit usage patterns if the application provided a label when requesting a connection. On returning a connection, the pool manager could indicate whether the connection already had the requested label or was a fresh connection. Labelling could also be used as a basis for managing connection configurations.
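A hypothetical rendering of this labelling idea (none of these names exist in psycopg3): getconn() prefers a pooled connection already carrying the requested label, and tells the caller whether it got a match, so the caller knows if reconfiguration is needed.

```python
class LabelledPool:
    """Toy pool whose getconn() prefers connections with a matching label."""

    def __init__(self, connect):
        self._connect = connect
        self._idle = []          # list of (label, conn) pairs

    def getconn(self, label):
        # Prefer a connection already configured for this label.
        for i, (lbl, conn) in enumerate(self._idle):
            if lbl == label:
                del self._idle[i]
                return conn, True        # already configured for label
        if self._idle:
            _, conn = self._idle.pop()
            return conn, False           # reused, needs reconfiguration
        return self._connect(), False    # fresh connection

    def putconn(self, conn, label):
        self._idle.append((label, conn))
```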

bikeshedder commented 3 years ago

I'm the author of deadpool, a dead simple async pool for Rust. How about translating that dead simple approach to the psycopg3 pool?

Connection losses are rare, so why even bother detecting them early? It is quite possible that a connection loss occurs while a connection is sitting in the pool, so you gain basically nothing by having a more complex system in place. I would recommend you just put the connection back in the pool regardless of its current state. Upon the next retrieval you can check the health of the acquired object; if it is unusable, simply throw it away and fetch the next object from the pool, or create a new connection if the pool is currently empty. The actual logic in deadpool is barely 100 LoC (Pool::get and Object::drop) and therefore very easy to understand and reason about.
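The deadpool strategy described above translates to Python roughly like this (a sketch with assumed interfaces, not deadpool's actual code): put connections back unconditionally, and only filter out dead ones lazily when handing a connection out again.

```python
import collections


class SimplePool:
    """Check-on-retrieval pool in the deadpool style."""

    def __init__(self, connect, is_healthy):
        self._connect = connect        # factory for new connections
        self._is_healthy = is_healthy  # health probe run on retrieval
        self._idle = collections.deque()

    def get(self):
        # Discard dead connections lazily, on retrieval.
        while self._idle:
            conn = self._idle.popleft()
            if self._is_healthy(conn):
                return conn
        return self._connect()         # pool exhausted: make a new one

    def put(self, conn):
        # No checks here: pooling a broken connection is harmless,
        # it will be filtered out by the next get().
        self._idle.append(conn)
```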

When recycling a connection you have multiple ways to check the health and make sure the connection is usable. See https://docs.rs/deadpool-postgres/0.7.0/deadpool_postgres/config/enum.RecyclingMethod.html

It's a simple, elegant and performant solution to a problem that's often solved in convoluted and over engineered ways.

dvarrazzo commented 3 years ago

Hi @bikeshedder thank you for your input: I will definitely check out your implementation for ideas.

Checking that a connection is broken on return is cheap, because that information is already available on the client side. Checking for usability on request is expensive, because it requires at minimum a ping to the server, which slows down the client asking for a connection. Both are simple to implement, but the way this pool is designed, it prefers to spend more time when a connection is returned than when one is given out, because returning a connection can do its work in a separate thread.

From your description of what deadpool does, I understand that it differs from what is implemented in the psycopg3 pool. In this pool, the need to create a new connection never blocks the client requesting it: if connection creation is slow, it can easily happen that a client returns an existing connection before the new one is created (see the HikariCP analysis). So the strategy implemented here is to prepare a new connection in a parallel thread and add it to the pool when ready, regardless of whether the requesting client is still waiting for it or has already been served.

A more complex strategy to implement would be to have a background worker periodically check the state of the connections in the pool. Thinking about it, I don't think it's really worth having, especially now that pool usage is round-robin, so connections are used uniformly. Unless the check is ridiculously aggressive, the chances that it detects a broken connection before a client does are slim, even in a moderately used pool. The client has to cope anyway with the possibility that a connection breaks at any moment during its life: a connection might break right after it's been served, so doing work to check its validity before giving it out would slow down the requester for not much gain in robustness. What I'm thinking to do instead is trigger an integrity check of all the connections inside the pool whenever a connection is returned broken, because that's something that can be done in parallel.
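The return-side policy described here can be sketched as follows (all names hypothetical; a real connection's broken state would come from the client-side libpq status, with no server round trip): a broken return both drops the dead connection and kicks off a sweep of the whole pool in a background thread, off the client's critical path.

```python
import threading


class Pool:
    """Sketch: cheap check on return, pool-wide sweep when a return is broken."""

    def __init__(self):
        self._idle = []
        self._lock = threading.Lock()

    def putconn(self, conn):
        if conn.broken:               # cheap: client-side state, no ping
            self._sweep_async()       # one breakage often means many
            return                    # drop the dead connection
        with self._lock:
            self._idle.append(conn)

    def _sweep_async(self):
        # Re-check every pooled connection in parallel, not in the
        # client's code path.
        threading.Thread(target=self._sweep, daemon=True).start()

    def _sweep(self):
        with self._lock:
            self._idle = [c for c in self._idle if not c.broken]
```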

Gerardwx commented 3 years ago

I agree with previous commenters: lose the subclass. Having two ways to do the same thing in an API slows me down. Why are there two ways? What's the tradeoff between the two methods? Are they somehow different?

Would it be possible to have values like minconn et al. be properties, so that they can be adjusted as an application goes through different stages of its lifecycle? I wouldn't do that if it would make the code too complicated or less robust.

dvarrazzo commented 3 years ago

@Gerardwx we will probably only document the callback way of doing things: it is good enough to define clearly what the extension points of the system are. Probably they will be:

As for scaling the pool size, it is already implemented: there is a pool.resize(minconn, maxconn) which can be called to change the pool size. They are not properties, to make it easier to maintain the maxconn >= minconn invariant :)
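The design point about the invariant can be shown in a couple of lines (a toy sketch, not the actual psycopg3 pool code): a single resize() call can validate both bounds atomically, which two independent property setters could not.

```python
class Pool:
    """Toy pool showing why resize() beats independent minconn/maxconn setters."""

    def __init__(self, minconn, maxconn):
        self.resize(minconn, maxconn)

    def resize(self, minconn, maxconn):
        # Validate both bounds together, so the invariant can never
        # be broken by an intermediate state.
        if maxconn < minconn:
            raise ValueError("maxconn must be >= minconn")
        self.minconn, self.maxconn = minconn, maxconn
```

With separate properties, going from (2, 5) to (6, 10) would require setting them in the right order, or temporarily violating the invariant; a single call sidesteps that.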

dave-shawley commented 3 years ago

Should there be a way to create an exclusive connection?

Yes please! We have a large number of small databases that are user-specific and a small number of large databases that we use in a pooled fashion. Requiring that all connections are pooled means really ugly workarounds in user code. My applications generally have two or three pools that are connected to shared databases but make one-off connections to customer-specific databases occasionally. The "one-off" connections are not pooled.

If psycopg3 includes a sane API for creating a one-off connection in a context manager, then this is not a problem.

FWIW, I ended up implementing a context manager that creates a pool with a min/max size of 1 using the user-specific DSN, fetches the only connection, and yields it. When the context manager exited, I disposed of the pool. Luckily the pool had very little logic behind it, so creating and destroying a pool instance inside a context manager wasn't a performance headache. In this case, though, the new pool instance would spin up management tasks for a single connection before immediately closing them all. I would really rather avoid that.
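The workaround described above looks roughly like this (the pool factory and its getconn/putconn/close interface are assumed for illustration): a context manager that builds a one-connection pool just to borrow its single connection, then tears everything down again.

```python
from contextlib import contextmanager


@contextmanager
def one_off_connection(make_pool, dsn):
    """Yield a single connection from a throwaway 1-connection pool."""
    pool = make_pool(dsn, minconn=1, maxconn=1)
    try:
        conn = pool.getconn()
        try:
            yield conn
        finally:
            pool.putconn(conn)
    finally:
        pool.close()   # spins down the pool's management machinery
```

The overhead is exactly the complaint: the pool's background workers start and stop for a single connection, which a plain connection-as-context-manager would avoid entirely.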

dvarrazzo commented 3 years ago

@dave-shawley what is a one-off one-connection pool useful for?

dave-shawley commented 3 years ago

@dvarrazzo I think that being able to create a connection using async with is what I am after. I had missed the following in the documentation:

> Connections behave as context managers: on block exit, the current transaction will be committed (or rolled back, in case of exception) and the connection will be closed.

I was initially concerned that the pool abstraction was going to be the only way to get a connection. Sorry for the noise.