paurkedal / ocaml-caqti

Cooperative-threaded access to relational data
https://paurkedal.github.io/ocaml-caqti/index.html
GNU Lesser General Public License v3.0
299 stars 36 forks

Postgres DB Pool should drain connections to prevent DB memory consumption #94

Closed pm-mck closed 2 years ago

pm-mck commented 2 years ago

Using Caqti, and specifically Caqti_lwt.Pool, on an API processing about 100k transactions an hour, the database free-memory metric (Postgres RDS) drops critically low over time, forcing us to restart our API. After some investigation, it appears that Postgres keeps a metadata cache for each connection for as long as that connection is alive. Because the pool hangs onto connections indefinitely, Postgres will continually grow this cache until it runs out of memory or the connections are terminated. We added a timer with a Pool.drain and this alleviated the memory pressure.
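The timer-based workaround can be sketched roughly as follows (a hypothetical sketch, assuming a `pool` value obtained from `Caqti_lwt.connect_pool`; the exact drain cadence is an assumption, not part of our production setup):

```ocaml
(* Periodically close all idle pooled connections so Postgres releases
   each backend's per-connection metadata cache. *)
let rec drain_periodically ~interval pool =
  let open Lwt.Infix in
  Lwt_unix.sleep interval >>= fun () ->
  Caqti_lwt.Pool.drain pool >>= fun () ->
  drain_periodically ~interval pool

(* Run in the background, draining once an hour: *)
let () = Lwt.async (fun () -> drain_periodically ~interval:3600.0 pool)
```

The downside, as noted below, is a sawtooth pattern: all connections are torn down at once and must then be re-established under load.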

I'm posting this issue here because I think this scenario is worth documenting, and/or because a more robust connection-pool behavior that collects idle connections would help. Or, if there is a better way to do this, I'm ready to adopt it!

Happy to answer any queries or run any tests that I can.

paurkedal commented 2 years ago

This is an important observation. I can see the Caqti driver already calls reset, so it looks like we need to free connections regularly. I think the most direct solution will be to add a pool parameter limiting how many times a connection is re-used. This shouldn't make a big difference in performance, and compared to occasional draining it will even out the extra load. I'll return with a proposal.

(I have considered some timing-based pruning of connections, as well, to avoid redundant TCP connections when activity is low.)
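The proposed mechanism can be illustrated in simplified form (this is not Caqti's actual implementation; the record fields and helper names are invented for illustration):

```ocaml
(* Each pool entry tracks how many times it has been checked out. *)
type 'a entry = {resource : 'a; mutable used_count : int}

let max_use_count = 100  (* free the connection after this many uses *)

(* On release, either return the entry to the idle queue or dispose of
   the resource, so no connection outlives the use limit. *)
let release ~free ~recycle entry =
  entry.used_count <- entry.used_count + 1;
  if entry.used_count >= max_use_count
  then free entry.resource  (* close it; a fresh connection is made later *)
  else recycle entry        (* keep it for reuse *)
```

Compared to a periodic drain, this retires connections one at a time, spreading the reconnection cost evenly over the request stream.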

paurkedal commented 2 years ago

I just pushed the change to master. I set the default limit somewhat arbitrarily to 100 requests per connection before reallocation, so let me know if we should adjust it. I have tested the pool implementation itself, but I would like to hear whether this solves your real-world use case.

pm-mck commented 2 years ago

Thank you for the fast turnaround. I will be testing this early next week.

pm-mck commented 2 years ago

This update looks good: memory utilization is smooth, whereas with the timer it was a bit sawtoothed.

Interestingly, this update has revealed a peculiar behavior in some of our tests which depend on the database, I'll need to investigate more but I think this can be called good.

I do think allowing the caller to set the number of requests per connection would be a welcome addition at some point in the future. Thanks again for the quick response to this.

paurkedal commented 2 years ago

You can already specify the limit by passing max_use_count to connect_pool. The documentation is not updated yet, but the signature can be found in caqti/lib/caqti_connect_sig.mli.
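A usage sketch (hedged: this assumes the parameter is an `int option`, matching the `Some 100` default visible in `caqti_connect.ml`, and the URI is a placeholder):

```ocaml
(* Lower the per-connection use limit to 50; connections are closed and
   reallocated after 50 uses. Passing [None] would disable recycling. *)
let pool_result =
  Caqti_lwt.connect_pool
    ~max_use_count:(Some 50)
    (Uri.of_string "postgresql://user@localhost/mydb")
```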

smorimoto commented 1 year ago

It seems that this is not yet available in caqti-lwt 1.9.0, the version published in the opam repository. Can you release a new version?

https://github.com/paurkedal/ocaml-caqti/blob/55ab7a2851694f08c62cbb6438c681ebb565636d/caqti/lib/caqti_connect.ml#L107-L109

smorimoto commented 1 year ago

For now, I will use pin-depends to pin every package, but it would be better if this were properly re-packaged and fixed upstream.

pin-depends: [
  [
    "caqti.dev"
    "git+https://github.com/paurkedal/ocaml-caqti.git"
  ]
  [
    "caqti-driver-postgresql.dev"
    "git+https://github.com/paurkedal/ocaml-caqti.git"
  ]
  [
    "caqti-lwt.dev"
    "git+https://github.com/paurkedal/ocaml-caqti.git"
  ]
  [
    "caqti-type-calendar.dev"
    "git+https://github.com/paurkedal/ocaml-caqti.git"
  ]
]

paurkedal commented 1 year ago

@smorimoto Good that you found a workaround for now. The automatic draining feature is scheduled for the 2.0.0 release, as it's tied up with some internal restructuring. I am working on the release, but a few non-trivial bits remain.

smorimoto commented 1 year ago

Actually, it didn't work, so I decided to fix it at a different layer; at least it's not a big problem for me. In the process I saw the work you've been doing, and it is quite substantial. Thank you as always!