rails-on-services / apartment

Database multi-tenancy for Rack (and Rails) applications
387 stars 159 forks source link

Proposal: Connection Resetting Middleware for SET search_path #302

Open Amnesthesia opened 3 days ago

Amnesthesia commented 3 days ago

Steps to reproduce

Use PgBouncer or an equivalent (e.g AWS RDS Proxy) in front of your database with Rails

Expected behavior

PgBouncer to be able to reuse connections efficiently

Actual behavior

Near 1-1 connections because connections become session-pinned when SET is encountered

System configuration

Proposed solution

class Apartment::ResetConnection
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  ensure
    # Reset search path to allow connections to be unpinned if the SET pinned them
    Apartment.connection.execute('RESET search_path')
  end
end

Alternatively, this could be done in the postgres adapter:

  class PostgresqlAdapter < AbstractAdapter
      ....
      #   Reset schema search path to the default schema_search_path
      #
      #   @return {String} default schema search path
      #
      def reset
        @current = default_tenant
        Apartment.connection.schema_search_path = full_search_path
        Apartment.connection.execute('RESET search_path')
      end

      ....
  end

I'm not super familiar with how this works in Apartment, so thought I'd open an issue with the proposal rather than a PR. What do you think, would this be possible? Would it cause any issues?

mnovelo commented 3 days ago

hmmm, I'd think that a SET search path would be issued whenever a query was about to be sent through a connection, so it doesn't matter what the previous value of search path was. This is why I don't think there's been much done to issue an additional SQL statement to reset the search path on the connection. Can you tell me more about your use case? When are you specifying a tenant, i.e., the search path?

Amnesthesia commented 2 days ago

@mnovelo We set the tenant during authentication, i.e after parsing the JWT token and authenticating a user. So, I agree that it doesn't matter what the previous value of search path was, but when RDS Proxy comes across a SET command (SET search_path, SET timezone, doesn't matter). Here's some info on that: https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html

Amazon has a similar tool to PgBouncer called RDS Proxy, and it has a feature called “connection pinning”. If it detects a statement that is incompatible with transaction mode, it will automatically hold that connection for that client for the duration of their session.

This is both highly useful and simultaneously problematic. It means query behavior is consistent with your expectations (🙌🏼) but also that you can silently kill all concurrency benefits (😑). If enough queries are run that trigger connection pinning, all of a sudden you may throttle your throughput. But it does give you an escape hatch for safely running statements which are not transaction compatible without having to jump through any hoops.

The problem is that this leads to pretty much all connections being session-pinned because of Apartment setting the search path, and not resetting it

Screenshot 2024-11-19 at 11 00 25
mnovelo commented 2 days ago

Oh, I see. Then, something like what you're proposing would be helpful for people using either AWS RDS Proxy or PgBouncer. I'd prefer for this to be something that has to be enabled by configuration. This would execute an unnecessary query for those not using AWS RDS Proxy or PgBouncer, so I'd want this behavior disabled by default. I'm open to names for this configuration, like enforce_search_path_reset or similar.

Amnesthesia commented 2 days ago

@mnovelo I see what you're saying, on the other hand, wouldn't it make sense that the reset method actually resets things that it has set, such as search path? Although it executes an additional query, it's not a very heavy one, and it really just cleans up what it did by setting the search path in the first place. I think it seems like reasonable cleanup?

I think enforce_search_path_reset sounds like a good name for such a configuration, or reset_search_path_before_disconnect

mnovelo commented 1 day ago

Let's go with enforce_search_path_reset with a default of false. While I see what you're saying about cleanup, it's similar to the tenant_presence_check, which is also an arguably good practice to have enabled. What we've found in our production environment is that such an additional safeguard was not worth the overhead of an extra query on each request. At this time, I don't see this being a necessity outside of connection pooling solutions like RDS Proxy and PgBouncer, which is why I'm defaulting it to false. I'm happy to default it to true in the future if more people find it beneficial.