Shopify / graphql-batch

A query batching executor for the graphql gem
MIT License
1.42k stars 107 forks source link

How do you paginate responses when using AssociationLoader? #114

Closed anujbiyani closed 4 years ago

anujbiyani commented 5 years ago

What's the recommended way to paginate a response when using AssociationLoader?

I've tried the following:

field :posts_connection, Types::PostType.connection_type, null: false
def posts_connection
  AssociationLoader.for(model: User, association_name: :posts).load(object)
end

but this results in two SQL queries being generated, one with my relay-defined pagination queries and one without.

I read through https://github.com/Shopify/graphql-batch/issues/26 and https://github.com/rmosolgo/graphql-ruby/pull/425 but am unclear on how to use the lazy execution API to solve this. Adding the following to my schema did not help:

lazy_resolve(AssociationLoader, :perform)
rmosolgo commented 4 years ago

@anujbiyani and I looked over this just now. My takeaway is that the example AssociationLoader would work great for belongs_to or has_one relationships, where load-by-id operations can be batched up, but it doesn't work for has_many relationships. Is that right?

That is, I can't think of how has_many loads could really be batched. What kind of SQL would it emit in that case?

eapache commented 4 years ago

If you're on a sufficiently recent version of MySQL/Postgres/whatever you can use window functions, roughly similar to the explanation at https://www.the-art-of-web.com/sql/partition-over/.

anujbiyani commented 4 years ago

Ah okay, thanks.

fabn commented 3 years ago

@anujbiyani how did you solved this? Could you provide an example of has_many with pagination?

anujbiyani commented 3 years ago

@fabn The solution is to simply not use AssociationLoader with a has_many. Just

field :posts_connection, Types::PostType.connection_type, null: false
def posts_connection
  object.posts
end

is what you want.

If you're loading many posts for many users, it does introduce an N+1 but there's no way around that. If you're loading many posts for one user, then there's nothing to batch up so AssociationLoader doesn't provide any benefit.

Let's walk through the two possible paths:

Loading paginated posts for many users

Here's a sample graphql query and corresponding sql for batch loading without pagination:

query {
  users {
    posts {
      id
    }
  }
}

select * from users
select * from posts where user_id in (1,2,...)

If you try and paginate that, the sql query doesn't really makes sense:

query {
  users {
    posts(first: 5) {
      nodes { id }
    }
  }
}

select * from users
select * from posts where user_id in (1,2,...) limit 5

The limit would get applied to the posts, but that would get you only 5 posts overall not 5 posts per user.

Loading paginated posts for one user

In this case, you could use AssociationLoader and paginate. If you do, you get the extra query that I mentioned in OP. BUT AssociationLoader isn't actually helpful because there's only one set of posts to fetch, so there's nothing to batch.

query {
  user { # singular, just loading one user here
    posts {
      id
    }
  }
}

select * from users where id = 1
select * from posts where user_id = 1

Now look at it paginated:

query {
  user { # still singular
    posts(first: 5) {
      nodes { id }
    }
  }
}

select * from users where user_id = 1
select * from posts where user_id = 1 # this is the extra query pointed out in my OP
select * from posts where user_id = 1 limit 5

There's no batching here since it's just loading posts for one user, so AssociationLoader doesn't get you anything.

I've rewritten this comment 5 times as I went through the scenarios over and over, so apologies if this isn't very clear. I went as far as proving this out in a sample app. There are tests you can run (setup the app with bundle install and rails db:setup) which, when you look at the logs, show that paginating and batch loading yields basically the same queries as paginating and not-batch loading, with the exception of the one errant extra query.

So the solution is to just not batch load a has_many association.

grjones commented 1 year ago

@anujbiyani

select * from posts where user_id in (1,2,...) limit 5

This should just be transformed into UNIONs for each user_id including a limit 5 for each select. One SQL query really ought to be possible.

letiesperon commented 6 months ago

I would also like to see an example of how to solve this. I couldn't figure it out so I've created this custom loader for this purpose, but I have the feeling it can be prettier and it currently does not support has_many through relations.

Sharing in case it helps anybody, and in case anybody spots anything that might be wrong with it:

# Loader for paginated associations without N+1s when called from a GraphQL field.
#
# DISCLAIMER: It does NOT work with polymorphic associations or has_many :through associations.
#
# Example usage in `BookType`:
#
# def kept_published_comments(page:, per_page:)
#   Loaders::PaginatedAssociationLoader.for(Book, :published_comments, :kept, order: 'created_at DESC', page: 2, per_page: 10).load(object)
# end

module Loaders
  class PaginatedAssociationLoader < GraphQL::Batch::Loader
    def initialize(model, association_name, *scope_names,
                   order:,
                   page: 1,
                   per_page: 25
      super()

      @model = model
      @association_name = association_name
      @scope_names = scope_names
      @order = order
      @page = page || 1
      @page = 1 if page <= 0
      @per_page = per_page

      validate_association_exists
      validate_scopes_exist
      validate_not_has_many_through
    end

    def perform(records)
      association_query = association_klass.where(join_field => records)

      association_query = association_query.merge(reflection.scope) if reflection.scope.present?

      association_query = apply_scopes(association_query)
      association_query = apply_pagination(association_query)

      association_query_grouped = association_query.group_by { |record| record[join_field] }

      records.each do |record|
        record_key = record[reflection.active_record_primary_key]
        associated_records = association_query_grouped[record_key] || []

        fulfill(record, associated_records)
      end
    end

    private

    attr_reader :model, :association_name, :scope_names, :order, :page, :per_page

    def reflection
      @reflection ||= model.reflect_on_association(association_name)
    end

    def validate_association_exists
      return if reflection

      raise ArgumentError, "No association #{association_name} on #{model}"
    end

    def validate_scopes_exist
      scope_names.each do |scope_name|
        unless reflection.klass.respond_to?(scope_name)
          raise ArgumentError, "The associated class does not respond to '#{scope_name}'"
        end
      end
    end

    def validate_not_has_many_through
      return unless reflection.is_a?(ActiveRecord::Reflection::ThroughReflection)

      raise ArgumentError, "The association #{association_name} on #{model} " \
                           "is a 'has_many :through' association which is not supported"
    end

    def apply_scopes(association_query)
      scope_names.each do |scope_name|
        association_query = association_query.merge(association_klass.send(scope_name))
      end

      association_query
    end

    def apply_pagination(association_query)
      records_sql = association_query
                    .select("#{association_table}.* AS #{association_table}, " \
                            "ROW_NUMBER() OVER (PARTITION BY #{join_field} " \
                            "ORDER BY #{order}) as row_number")
                    .to_sql

      min_record_index = ((page - 1) * per_page)
      max_record_index = min_record_index + per_page

      association_klass
        .from("(#{records_sql}) AS #{association_table}")
        .where('row_number > ? AND row_number <= ?', min_record_index, max_record_index)
    end

    def association_table
      reflection.klass.table_name
    end

    def association_klass
      reflection.klass
    end

    def join_field
      reflection.join_primary_key
    end
  end
end