nikita-volkov / hasql-pool

A pool of connections for Hasql
http://hackage.haskell.org/package/hasql-pool
MIT License
17 stars 15 forks source link

add max connection lifetime option #28

Closed robx closed 1 year ago

robx commented 1 year ago

We're seeing trouble with PostgREST where long-lived connections grow to use large amounts of memory postgresql-server-side: https://github.com/PostgREST/postgrest/issues/2638. (As far as I understand, postgresql caches a variety of things per connection-handler and never releases them.)

It seems to me that putting some arbitrary maximal lifetime on connections is the best way to address this, and is something that's likely to be relevant to other users of hasql-pool.

Somewhat outdated initial description follows, see below for the current state:

This PR is a quick stab at implementing that, but I'm more than happy to rewrite it or to go with some other approach. The current draft definitely has some issues, particularly:

nikita-volkov commented 1 year ago

Haven't looked at the code yet. I already love the idea in itself. You have my full support in having this feature implemented and merged.

Would it be better to add both settings while we're at it?

I do also like the idea of adding the maxIdleTime configuration, but I believe it's best to do things iteratively and in isolation. Let's add the maxLifetime first, then, if energy is left, focus on the next one. I've isolated it into its own issue (#29).

force both to be specified?

Great! I still like that idea. Seems like a responsible design move.

introduce some settings datatype with reasonable default values for both?

I support that as well.

nikita-volkov commented 1 year ago

I took a look at the code. I have a few comments.

  1. If I understand correctly the connection whose lifetime is past due gets reestablished. I think it would be better to first try to fetch one of the available connections and only if there's none establish the new one.
  2. The invalidation is implemented via a check before an attempt to use. A problem with that is that the resources will be held up until an attempt to use the connection is made, which is unpredictable and may happen too late or even never. Looks like this needs addressing by running a separate thread that will be responsible for the cleaning.

What do you think?

robx commented 1 year ago

I took a look at the code. I have a few comments.

1. If I understand correctly the connection whose lifetime is past due gets reestablished. I think it would be better to first try to fetch one of the available connections and only if there's none establish the new one.

2. The invalidation is implemented via a check before an attempt to use. A problem with that is that the resources will be held up until an attempt to use the connection is made, which is unpredictable and may happen too late or even never. Looks like this needs addressing by running a separate thread that will be responsible for the cleaning.

What do you think?

Yes, this was pretty much the minimal attempt, trying to avoid adding more complexity to the implementation. Below is me trying to argue that maybe we could attempt to roll with this, but typing it up I kind of got to the point where I think it might be easier to just give an active management thread a shot.

some rambling on the quoted issues Adressing only 1. in the straightforward way (loop until you find a "live" connection or the pool is empty) introduces some pathological cases, e.g. imagine that the pool is full of idle connections, and they all time out. Then the next request is going to loop through and close all of the connections before opening a new one. That potential huge overhead seems worse to me than occasional connection-establishment overhead. (The current design of the library implies this occasional connection-establishment overhead anyway, since we're not pre-establishing connections.) Regarding 2., I agree that some active component makes sense for a robust solution to timing out. It also seems to imply pretty invasive changes to the library however, which I was a bit hesitant to suggest/implement. I was hoping the simple approach might be good enough (for a well-loaded pool it should address the base issue since connections will be reused regularly), but it's true that it doesn't adress all scenarios. E.g. the pool won't ever shrink after bursts. Could we get away with a partial solution that keeps the library passive by documenting the behaviour?
robx commented 1 year ago

I've reworked this now to include active management. There's a couple of ad hoc decisions here regarding the interface that are not at all intended to be final. Summary of changes at this point:

I think the approach should allow moving forward with this, and would open the door to adding other settings such as an idle timeout or pre-creating connections (#10).

robx commented 1 year ago

I've polished this up a bit now, leaning into some of those ad hoc decisions I made. Clearly they're all up for debate, specifically the configuration interface and how we approach backward compatibility with repect to acquire vs withPool.

Also all the naming, choice of defaults, allowing disabling the timeouts etc etc.

nikita-volkov commented 1 year ago

Thanks. Taking a look

nikita-volkov commented 1 year ago

I don't think the transfer to the withPool construct is necessary.

I see the following strategies which can let us achieve the same features without the ovehaul of UX:

  1. Have a management thread per each connection. This will allow us to stop these threads when associated connections get released.
  2. Have a single thread, stop it when the pool is empty and restart when there's at least one connection.
  3. Have a single management thread and have it stopped when all references to the pool are garbage collected. I think a similar approach is applied in the "resource-pool" library.

Possibly there are other options as well.

I hate to see so much work done on your part. I should have initiated the design discussion beforehand. Sorry for that. To avoid repeating the same mistake let's discuss the suggested design before working on the code. What do you think about the suggestions? Do you have other options in mind?

robx commented 1 year ago

I think I'd tend most towards leaving it up to garbage collection, even if that's a bit implicit for my taste. 1 and 2 (2 to a lesser extent) seem to come with more bookkeeping, and seem a bit at odds with #10.

I don't think it's a big change, I'll try it out to see how it goes.

(What do you think about the Config approach to collecting the increasing number of settings?)

robx commented 1 year ago

@nikita-volkov could you have a look at this one again when you have a chance?

nikita-volkov commented 1 year ago

Hey Rob! Sorry for having to be reminded. Been under a bit of load from life-stuff. I'll get back to you in the coming days.

robx commented 1 year ago

Hey Rob!

I've finally gotten to carefully review the PR. Sorry for the delay.

Looks good generally. I think we can move forward from here. Great test suite! I do have a few things to discuss, but I promise to be responsive until we merge.

* Let's avoid exposing management interval configuration. It seems like a negligible detail not worth the user's attention and is actually only required by the current implementation. It is possible to reimplement with the management thread wake-ups automatically derived from the timeouts most optimally each time the thread goes to sleep. Avoiding exposing that will let us implement the optimization in the future (if needed) without changing the API.

* The Config type seems like an attempt to provide a neater approach to configuration of this package, however there are downsides. It is a choice of a particular approach with its downsides and there are others. To avoid bloat and reduce maintenance I would rather keep the package minimal and stay neutral on the matter. It is a higher-level problem and can be implemented as an extension lib, if needed.

If you want I can take over from here and layer the according changes on top of yours and merge.

Hi Nikita, thanks for the review! I'd be happy with you taking it over, no particular strong feelings on any of the remarks. The config change was mostly because I felt quite awkward about increasing the number of positional plain integer parameters further, no objection if you're happy with that though (or with any other approach :)).

nikita-volkov commented 1 year ago

I've applied the changes and merged. Thanks for your hard work! I'll let the dust settle for a couple of days before releasing. Please do raise issues if you disagree with any of the changes.

nikita-volkov commented 1 year ago

Released!