rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

Feature request: Intelligent prefetching #4988

Open jasonkuhrt opened 8 years ago

jasonkuhrt commented 8 years ago

Consider the following:

const getDevice = (r, id) => (
  r
  .table('devices')
  .get(id) /* 1 */
  .merge({
    relations: {
/* 2 */ observing: r.table('relations').getAll(id, { index: 'observer' }).coerceTo('array'),
/* 3 */ observers: r.table('relations').getAll(id, { index: 'subject' }).coerceTo('array'),
    }
  })
)

It is a join-based query that should be able to execute queries 1 / 2 / 3 in parallel (I believe), since all three depend only on the id argument. Say the document it returns is a device. If we want to batch-query devices, the RQL request might be:

const getDevices = (r, ids) => (
  r
  .table('devices')
  .getAll(r.args(ids)) /* 1 */
  .merge((device) => ({
    relations: {
/* 2 */ observing: r.table('relations').getAll(device('id'), { index: 'observer' }).coerceTo('array'),
/* 3 */ observers: r.table('relations').getAll(device('id'), { index: 'subject' }).coerceTo('array'),
    }
  }))
)

This new RQL has lost a degree of parallelism because each inner getAll waits for the outer getAll to complete its fetch. The dependency is arbitrary and not technically necessary in this case, since every id is already known before the outer getAll runs. To help communicate that, here is another version leading toward the same result:

const getDevicesFullyParallel = (r, ids) => (
  Promise.all([
/* 1 */ r.table('devices').getAll(r.args(ids)), 
/* 2 */ Promise.all(ids.map((id) => r.table('relations').getAll(id, { index: 'observer' }) )), 
/* 3 */ Promise.all(ids.map((id) => r.table('relations').getAll(id, { index: 'subject' }) )), 
  ])
  .then(/* ...ouch */)
)

Of course this variation is not satisfactory since:

  1. We are now making multiple client requests instead of one: 3 groups of queries, and as written 1 + 2 × ids.length individual requests.
  2. The logic in the .then(...) callback is going to be painful. We've lost the ability to use RQL to stitch our document schemas together.
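
To make the second point concrete, here is a minimal sketch of what that .then(...) callback would need to do. The helper name `stitchDevices` is hypothetical, and the sketch assumes each query has already been resolved to a plain array and that the per-id relation arrays line up with `ids` by position. It also overwrites any existing `relations` field rather than deep-merging it the way `.merge` does.

```javascript
// Hypothetical helper (not part of any RethinkDB driver): rebuilds the
// documents that the single .merge() query would have returned.
// Assumed input shape: [devices, observingPerId, observersPerId], where
// observingPerId[i] and observersPerId[i] belong to ids[i].
const stitchDevices = (ids, [devices, observingPerId, observersPerId]) => {
  // Index devices by id so the sketch does not rely on the order in which
  // the devices query returned them.
  const devicesById = new Map(devices.map((device) => [device.id, device]));

  const stitched = [];
  ids.forEach((id, i) => {
    const device = devicesById.get(id);
    if (device === undefined) return; // no device exists for this id

    stitched.push({
      ...device,
      // Assumption: replaces any existing `relations` field outright.
      relations: {
        observing: observingPerId[i],
        observers: observersPerId[i],
      },
    });
  });
  return stitched;
};
```

It would be wired in as `.then((results) => stitchDevices(ids, results))`. All of this bookkeeping is what the single-query version gets for free from `.merge`.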

So, assuming I've clearly explained the issue, the feature request is some sort of intelligent pre-fetching when possible or otherwise some new/other RQL function to explicitly do it.

Prior discussion with Michael Lucy / Henrik Andersson / Daniel Mewes that led to this issue can be traced back from here: https://groups.google.com/forum/?fromgroups=#!topic/rethinkdb/mvrMyqpyPPY.

Thanks!

mlucy commented 8 years ago

Thanks for the detailed issue!