liveh2o / active_remote

Active Remote provides Active Record-like object-relational mapping over RPC. It's Active Record for your platform.
MIT License

Add support for eager loading #94

Open liveh2o opened 2 years ago

liveh2o commented 2 years ago

Consider an API that publishes Transaction#user_identifer and Transaction#account_identifer. These fields are delegated to the associated user and account (respectively) because they are not part of the transaction model. This works, but searching for transactions then triggers one RPC call to get the user and another to get the account for each record returned (i.e., 2N+1 RPC calls: one for the search plus two per transaction). The cost is especially noticeable when searching for large sets of transactions (e.g., 1000), which this API also supports.

Potential solution

Active Record solves this problem with eager loading: after loading records, associations are eager loaded with a single query per association (e.g., Transaction#account). With two associations, this results in 3 queries rather than 2N+1 queries.
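To make the difference in call counts concrete, here is a minimal sketch in plain Ruby. It does not use Active Remote or Active Record: CountingService, build_transactions, lazy_call_count, and eager_call_count are hypothetical names invented for this illustration, and the Transaction struct is only a stand-in for the real model.

```ruby
# Hypothetical stand-in for a remote service that counts the calls it serves.
class CountingService
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # One RPC per record: what a delegated field triggers when nothing is preloaded.
  def find(guid)
    @calls += 1
    { guid: guid }
  end

  # One RPC for a whole batch of guids: what eager loading relies on.
  def search(guids)
    @calls += 1
    guids.map { |guid| { guid: guid } }
  end
end

# Stand-in for the transaction model (assumption: only the fields needed here).
Transaction = Struct.new(:guid, :user_guid, :account_guid)

def build_transactions(count)
  Array.new(count) { |i| Transaction.new("TRN-#{i}", "USR-1", "ACT-#{i % 5}") }
end

# The initial search plus two lookups per transaction: 2N + 1.
def lazy_call_count(transactions)
  users = CountingService.new
  accounts = CountingService.new
  transactions.each do |transaction|
    users.find(transaction.user_guid)
    accounts.find(transaction.account_guid)
  end
  1 + users.calls + accounts.calls
end

# The initial search plus one batched lookup per association: 3.
def eager_call_count(transactions)
  users = CountingService.new
  accounts = CountingService.new
  users.search(transactions.map(&:user_guid).uniq)
  accounts.search(transactions.map(&:account_guid).uniq)
  1 + users.calls + accounts.calls
end

transactions = build_transactions(1000)
puts "lazy: #{lazy_call_count(transactions)} calls, eager: #{eager_call_count(transactions)} calls"
# prints "lazy: 2001 calls, eager: 3 calls"
```

The eager count stays at 3 no matter how many transactions are returned, which is why the gap widens with larger result sets.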

For the API under consideration, we added an optimized search method to the transaction model that eagerly loads the user and accounts when searching for transactions:

def self.optimized_search(args = nil)
  # Eager load user and account to avoid 2N + 1 RPC calls
  search(args).tap do |records|
    # Nothing to eager load; tap still returns the (empty) records
    next if records.empty?

    # Requests must be scoped to a single user, so we can assume all transactions have the same :user_guid
    user = User.find(:guid => records.first.user_guid)

    # Fetch every referenced account with a single search
    account_guids = records.map(&:account_guid).uniq
    accounts = Account.search(:guid => account_guids, :user_guid => user.guid)
    accounts_by_guid = accounts.each_with_object({}) do |account, hash|
      hash[account[:guid]] = account
    end

    # Assign the preloaded associations so delegated fields make no further RPC calls
    records.each do |record|
      record.account = accounts_by_guid[record.account_guid]
      record.user = user
    end
  end
end

This is clearly specific to the API setup (e.g., it assumes all transactions belong to the same user), but the pattern of tapping the results and setting the association records this way should be possible to generalize.
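One possible shape for that generalization is sketched below. It is an illustration only: preload_association is a hypothetical helper, not part of Active Remote, and the Account and Transaction structs are stand-ins so the sketch runs on its own. The batched lookup is passed in as a block; in a real model it would presumably be a single search call.

```ruby
# Minimal stand-ins so the sketch runs without Active Remote (assumptions).
Account = Struct.new(:guid, :user_guid)
Transaction = Struct.new(:guid, :account_guid, :account)

# Hypothetical helper: sets `association` on each record from one batched lookup.
# The block receives the unique foreign key values and must return the
# associated records.
def preload_association(records, association, foreign_key:, primary_key: :guid)
  return records if records.empty?

  keys = records.map { |record| record.public_send(foreign_key) }.compact.uniq
  related_by_key = yield(keys).each_with_object({}) do |related, hash|
    hash[related.public_send(primary_key)] = related
  end

  records.each do |record|
    record.public_send("#{association}=", related_by_key[record.public_send(foreign_key)])
  end
end

accounts = [Account.new("ACT-1", "USR-1"), Account.new("ACT-2", "USR-1")]
transactions = [
  Transaction.new("TRN-1", "ACT-1"),
  Transaction.new("TRN-2", "ACT-2"),
  Transaction.new("TRN-3", "ACT-1")
]

preload_association(transactions, :account, foreign_key: :account_guid) do |guids|
  # Stands in for a single batched call such as Account.search(:guid => guids)
  accounts.select { |account| guids.include?(account.guid) }
end
```

Records whose foreign key has no match are assigned nil here; whether that is the right behavior (as opposed to falling back to a lazy lookup) is a design decision this sketch does not settle.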

For the API under consideration, using the optimized search is nearly 38 times faster than search:

Running transaction search benchmarks:
Warming up --------------------------------------
              search     1.000  i/100ms
    optimized_search     1.000  i/100ms
Calculating -------------------------------------
              search      0.046  (± 0.0%) i/s -      1.000  in  21.954028s
    optimized_search      1.712  (± 0.0%) i/s -     35.000  in  20.482257s

Comparison:
    optimized_search:        1.7 i/s
              search:        0.0 i/s - 37.60x  (± 0.00) slower

Considerations

Ideally, this would not require creating an additional method just to tap the results and eager load the associations. Something like Active Record's eager_load would be preferable:

::Transaction.search(:user_guid => "USR-123").eager_load(:user, :account)

That's not currently possible because the results are a plain Ruby array. Extending Array or creating a special result set that provides this eager loading could work.
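As a rough sketch of the "special result set" option, the wrapper below behaves like the array it wraps and adds a chainable eager_load. Everything here is hypothetical: ResultSet, its loaders argument, and the Transaction struct are invented for the illustration and are not existing Active Remote classes. It uses SimpleDelegator from Ruby's standard library so existing array calls keep working.

```ruby
require "delegate"

# Hypothetical result set: delegates to the wrapped array and adds #eager_load.
class ResultSet < SimpleDelegator
  # loaders maps an association name to a callable that receives the records
  # and assigns that association on each of them (ideally one batched RPC).
  def initialize(records, loaders = {})
    super(records)
    @loaders = loaders
  end

  def eager_load(*associations)
    records = __getobj__
    associations.each do |association|
      loader = @loaders.fetch(association) do
        raise ArgumentError, "no loader registered for #{association.inspect}"
      end
      loader.call(records) unless records.empty?
    end
    self
  end
end

# Stand-in for the transaction model (assumption).
Transaction = Struct.new(:guid, :account_guid, :account)

account_loader = lambda do |records|
  # Stands in for one batched account search followed by assignment.
  records.each { |record| record.account = "account #{record.account_guid}" }
end

results = ResultSet.new([Transaction.new("TRN-1", "ACT-1")], account: account_loader)
puts results.eager_load(:account).first.account
# prints "account ACT-1"
```

Wrapping rather than subclassing or monkey-patching Array keeps the change local to search results, at the cost that methods such as map return plain arrays again.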

Additionally, eager loading will need to respect the defined scope_keys of the association being eager loaded. This will be tricky if the scope key values are not available on the main object (i.e., Transaction would need both #account_guid and #user_guid fields in the above example).
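One way to handle this, sketched under assumptions, is to build the batched search arguments from the records and fail loudly when a scope key cannot be read from them. The eager_load_search_args helper and the two structs below are hypothetical; how Active Remote actually exposes an association's scope keys is not shown here.

```ruby
# Hypothetical helper: builds the arguments for one batched association search,
# adding every scope key and raising when the records cannot supply it.
def eager_load_search_args(records, foreign_key:, scope_keys: [], primary_key: :guid)
  args = { primary_key => records.map { |record| record.public_send(foreign_key) }.uniq }

  scope_keys.each do |scope_key|
    unless records.all? { |record| record.respond_to?(scope_key) }
      raise ArgumentError, "records do not expose scope key #{scope_key.inspect}"
    end
    args[scope_key] = records.map { |record| record.public_send(scope_key) }.uniq
  end

  args
end

# Stand-ins: one record type carries the scope key, the other does not.
ScopedTransaction = Struct.new(:account_guid, :user_guid)
UnscopedTransaction = Struct.new(:account_guid)

scoped = [ScopedTransaction.new("ACT-1", "USR-1"), ScopedTransaction.new("ACT-2", "USR-1")]
p eager_load_search_args(scoped, foreign_key: :account_guid, scope_keys: [:user_guid])
```

With the unscoped stand-in the same call raises ArgumentError, which mirrors the case where Transaction lacks a #user_guid field and eager loading cannot be scoped correctly.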