rails / rails

Ruby on Rails
https://rubyonrails.org
MIT License
56.07k stars 21.68k forks source link

ActiveRecord using extended query protocol even when prepared_statements is false #40207

Closed awlamb closed 3 years ago

awlamb commented 4 years ago

Steps to reproduce

In attempting to migrate from pgbouncer to AWS hosted RDS Proxy, we noticed that many/all of our connections to the proxy were pinned(https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy.html#rds-proxy-pinning) even though prepared_statements is set to false in database.yml

To recreate, I made a fresh rails install

rails new pgtest -d postgresql

with a database.yml file as such

development:
  <<: *default
  url: postgres://localhost:5432/pgdemo_development
  prepared_statements: false

and created a Post model with some columns and created a row via console

Post.create(first_name: 'asdf', last_name: 'asdf', message: 'asdf')

and examine the wireshark output to the server

Expected behavior

When prepared_statements=false, the SQL query sent to the PostgreSQL server should be using the simple query protocol.

Frame 9: 275 bytes on wire (2200 bits), 275 bytes captured (2200 bits) on interface lo0, id 0
Null/Loopback
Internet Protocol Version 6, Src: ::1, Dst: ::1
Transmission Control Protocol, Src Port: 63343, Dst Port: 5432, Seq: 12, Ack: 18, Len: 199
PostgreSQL
    Type: Simple query
    Length: 198
    Query: INSERT INTO "posts" ("first_name", "last_name", "message", "created_at", "updated_at") VALUES ('asdf', 'asdf', 'asdf', '2020-09-09 07:57:45.392053', '2020-09-09 07:57:45.392053') RETURNING "id"

Actual behavior

The extended query protocol is used

Frame 5: 315 bytes on wire (2520 bits), 315 bytes captured (2520 bits) on interface lo0, id 0
Null/Loopback
Internet Protocol Version 6, Src: ::1, Dst: ::1
Transmission Control Protocol, Src Port: 63087, Dst Port: 5432, Seq: 12, Ack: 18, Len: 239
PostgreSQL
    Type: Parse
    Length: 201
    Statement: 
    Query: INSERT INTO "posts" ("first_name", "last_name", "message", "created_at", "updated_at") VALUES ('asdf', 'asdf', 'asdf', '2020-09-09 07:42:50.441317', '2020-09-09 07:42:50.441317') RETURNING "id"
    Parameters: 0
PostgreSQL
    Type: Bind
    Length: 14
    Portal: 
    Statement: 
    Parameter formats: 0
    Parameter values: 0
    Result formats: 1
PostgreSQL
    Type: Describe
    Length: 6
    Portal: 
PostgreSQL
    Type: Execute
    Length: 9
    Portal: 
    Returns: all rows
PostgreSQL
    Type: Sync
    Length: 4

Investigation

In https://github.com/rails/rails/blob/6-0-stable/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L675 the exec_params method on the connection object is being called when prepared_statements is false, and that looks like it's being defined at the pg gem level https://github.com/ged/ruby-pg/blob/9cf9d8aabd4a1a95ef2bc7c61054fc2cc663072b/ext/pg_connection.c#L1298

Noticing that in that function is some code that preserves the simple query protocol functionality (the pgconn_exec function) https://github.com/ged/ruby-pg/blob/9cf9d8aabd4a1a95ef2bc7c61054fc2cc663072b/ext/pg_connection.c#L1312-L1319

This brings us back to exec_params, where it looks like the variable type_casted_binds is being set as an empty array. Since it's being set as an empty array, it is never nil and the function for the simple query protocol is never called. The helper method type_casted_binds(binds) looks to always return an array.

I'm not sure if this was on purpose, but it seems to be to be unintentional. I have a small fix for it, and that's how i got the query in the expected behavior section

System configuration

Rails version: 6.0.3.2

Ruby version: 2.6.5

eugeneius commented 4 years ago

This behaviour was introduced in https://github.com/rails/rails/commit/7df68a300c9395e3edf8c603b6fea3db9eaff003, and it looks like it was intentional. The code has since been refactored so that it now looks accidental, but given that it was originally part of a security fix, I think we need to understand why it happened.

@rafaelfranca, I don't suppose you remember why that line was changed?

awlamb commented 4 years ago

Thanks @eugeneius! The extended protocol does offer extra protection against SQL injection due to it only allowing one SQL statement at a time, so perhaps that was an added but unreferenced bonus here?

anton-kachan-dev commented 3 years ago

activerecord (= 5.0.6) pg (= 0.21.0)

I had the same problem. I monkey patched the ActiveRecord::ConnectionAdapters::PostgreSQLAdapter. And it was resolved with the code below.

require 'active_record/connection_adapters/postgresql_adapter'

class ActiveRecord::ConnectionAdapters::PostgreSQLAdapter
  def exec_no_cache(sql, name, binds)
    log(sql, name, binds) { @connection.async_exec(sql) }
  end

  protected

  def configure_connection
    if @config[:encoding]
      @connection.set_client_encoding(@config[:encoding])
    end
    # self.client_min_messages = @config[:min_messages] || 'warning'
    self.schema_search_path = @config[:schema_search_path] || @config[:schema_order]
    # Use standard-conforming strings so we don't have to do the E'...' dance.
    # set_standard_conforming_strings

    # If using Active Record's time zone support configure the connection to return
    # TIMESTAMP WITH ZONE types in UTC.
    # (SET TIME ZONE does not use an equals sign like other SET variables)
    # if ActiveRecord::Base.default_timezone == :utc
    #   execute("SET time zone 'UTC'", 'SCHEMA')
    # elsif @local_tz
    #   execute("SET time zone '#{@local_tz}'", 'SCHEMA')
    # end

    # SET statements from :variables config hash
    # http://www.postgresql.org/docs/current/static/sql-set.html
    variables = @config[:variables] || {}
    variables.map do |k, v|
      if v == ':default' || v == :default
        # Sets the value to the global or compile default
        execute("SET SESSION #{k} TO DEFAULT", 'SCHEMA')
      elsif !v.nil?
        execute("SET SESSION #{k} TO #{quote(v)}", 'SCHEMA')
      end
    end
  end
end

ActiveRecord::Base.establish_connection(
    adapter: "postgresql",
    database: 'YOUR_DATABASE',
    username: '',
    password: '',
    prepared_statements: false,
    host: 'YOUR_HOST',
    port: '5432')
awlamb commented 3 years ago

@KachanAnton350504 Interesting solution, thank you! Would be interested to hear from @rafaelfranca about the intent of the change. Ideally i'd prefer to not monkey patch anything 👍

awlamb commented 3 years ago

Just wanted to follow up here in case it was lost in the holiday shuffle. Any thoughts on this @rafaelfranca

rails-bot[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not been commented on for at least three months. The resources of the Rails team are limited, and so we are asking for your help. If you can still reproduce this error on the 6-1-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open. Thank you for all your contributions.

katyho commented 2 years ago

Just wanted to +1 this issue - we're also trying to put in a connection pooler and are running into the same issue, and would prefer not to monkey patch if possible

lobsterdore commented 2 years ago

Same here @katyho, it's very tricky to use RDS Proxy with Rails at the moment due to this issue.

stephenwiebe commented 1 year ago

@eugeneius Can this issue be re-opened? It doesn't seem possible to switch to simple query protocol at all right now.

Azdaroth commented 1 year ago

Has any had any luck in Rails 7 with at least monkey patching and running RDS Proxy? seems like even the suggested monkey patches don't help.

Edit: Eventually managed to get it working, but would be great to avoid such an excessive monkey patching.

cenavarro commented 11 months ago

@Azdaroth could you share the monkey patch(es) you implemented? We are on Rails 7 and also having the same issue with RDS Proxy.

Azdaroth commented 11 months ago

@cenavarro Haha, wanted to leave it for the upcoming blog post, but anyway, let me share it already, early public launch ;)

Works with Rails >= 6.0, put it in the initializer

# frozen_string_literal: true

return unless ENV.fetch("APPLY_CONFIG_FOR_RDS_PROXY", "false") == "true"

Encoding.default_internal = nil

class ActiveRecord::ConnectionAdapters::PostgreSQLAdapter
  private

  def exec_no_cache(sql, name, binds, async: false)
    materialize_transactions
    mark_transaction_written_if_write(sql)

    # make sure we carry over any changes to ActiveRecord.default_timezone that have been
    # made since we established the connection
    update_typemap_for_default_timezone

    type_casted_binds = type_casted_binds(binds)
    log(sql, name, binds, type_casted_binds, async: async) do
      ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
        # -- monkeypatch --
        # to use async_exec instead of exec_params if prepared statements are disabled

        if ActiveRecord::Base.connection_db_config.configuration_hash.fetch(:prepared_statements, "true").to_s == "true"
          @connection.exec_params(sql, type_casted_binds)
        else
          @connection.exec(sql)
        end
        # -- end of monkeypatch --
      end
    end
  end

  protected

  def configure_connection
    # if @config[:encoding]
    #   @connection.set_client_encoding(@config[:encoding])
    # end
    # self.client_min_messages = @config[:min_messages] || "warning"
    self.schema_search_path = @config[:schema_search_path] || @config[:schema_order]
    #
    # # Use standard-conforming strings so we don't have to do the E'...' dance.
    # set_standard_conforming_strings
    #
    # variables = @config.fetch(:variables, {}).stringify_keys
    #
    # # If using Active Record's time zone support configure the connection to return
    # # TIMESTAMP WITH ZONE types in UTC.
    # unless variables["timezone"]
    #   if ActiveRecord::Base.default_timezone == :utc
    #     variables["timezone"] = "UTC"
    #   elsif @local_tz
    #     variables["timezone"] = @local_tz
    #   end
    # end
    #
    # # Set interval output format to ISO 8601 for ease of parsing by ActiveSupport::Duration.parse
    # execute("SET intervalstyle = iso_8601", "SCHEMA")
    #
    # # SET statements from :variables config hash
    # # https://www.postgresql.org/docs/current/static/sql-set.html
    # variables.map do |k, v|
    #   if v == ":default" || v == :default
    #     # Sets the value to the global or compile default
    #     execute("SET SESSION #{k} TO DEFAULT", "SCHEMA")
    #   elsif !v.nil?
    #     execute("SET SESSION #{k} TO #{quote(v)}", "SCHEMA")
    #   end
    # end
  end
end

The stuff that got commented out needs to be moved to RDS proxy (conforming strings, timezone, encoding... good to review what got commented out or set it based on RDS logs regarding what caused session pinning). Once you are ready to switch, just switch the ENV var (APPLY_CONFIG_FOR_RDS_PROXY). And maker sure that you disable prepared statements in database.yml (prepared_statements: <%= ENV.fetch("ENABLE_PREPARED_STATEMENTS", true) %>)

cristielciu commented 2 months ago

@Azdaroth I'm curious how did the monkey patch survive over this period of time? My main concern is that sometime in the future ActiveRecord will decide to change the #configure_connection method and add/remove some code there. Are you checking on each activerecord gem upgrade that this code never changes?

Azdaroth commented 2 months ago

@cristielciu Unfortunately this is necessary to check before every Rails upgrade, as with any monkeypatch :/