mattkrick / cashay

:moneybag: Relay for the rest of us :moneybag:
MIT License

create cashay-server for realtime updates #149

Open mattkrick opened 7 years ago

mattkrick commented 7 years ago

The problem is not everyone uses RethinkDB for reactivity & even if they do, they're more or less limited to single-table subscriptions. This makes sense, since it can get pretty expensive to simulate a join & subscribe to it. Apollo offers something that's in the experimental phase, but it's, uh, not robust. Here are the blueprints for how to make something that can match (or exceed) RethinkDB performance while allowing for cross-table subs.

Problem:

Solution:

The magic of the bump function is that it contains really inexpensive logic (in this case, mutatedDoc.votes > minVotes). Without it, we'd have to re-run each original database function to determine if F replaced E. This is critical because every time upvote gets called, we're gonna have to run through every channel with the getTop5Posts topic. A single Float64 comparison should be cheap enough that JS will work at scale. SocketCluster already contains a message bus, but to save a function on each channel, we'll need a key/value store like redis to hold each channel's bumpFnVars.
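A minimal sketch of what such a bump function might look like. Only `mutatedDoc.votes > minVotes`, `upvote`, `getTop5Posts` and `bumpFnVars` come from the description above; `makeTop5BumpFn`, `channelVars`, `onUpvote`, the `bump` key and the channel names are made up for illustration, and a plain `Map` stands in for redis.

```javascript
// Hypothetical bump function factory for a getTop5Posts channel.
// `minVotes` is assumed to be the vote count of the lowest-ranked post
// the channel currently shows (the kind of value kept in bumpFnVars).
const makeTop5BumpFn = (minVotes) => (mutatedDoc) => {
  // One numeric comparison decides whether the doc could enter the top 5.
  if (mutatedDoc.votes > minVotes) {
    return { bump: mutatedDoc };
  }
  // undefined means "nothing to send on this channel".
  return undefined;
};

// Stand-in for the key/value store: channel name -> bumpFnVars.
const channelVars = new Map([
  ['getTop5Posts/clientA', { minVotes: 10 }],
  ['getTop5Posts/clientB', { minVotes: 50 }],
]);

// Called whenever the upvote mutation resolves with the mutated doc.
const onUpvote = (mutatedDoc) => {
  const outgoing = [];
  for (const [channel, vars] of channelVars) {
    const result = makeTop5BumpFn(vars.minVotes)(mutatedDoc);
    if (result) outgoing.push({ channel, ...result });
  }
  return outgoing;
};

const upvoteMessages = onUpvote({ id: 'F', votes: 20 });
```

Here only the first channel gets a message, since 20 votes beats its `minVotes` of 10 but not the second channel's 50. No database query is re-run for either channel.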

For the next example, let's try a form of CmRDT. Say we have hell world and we want to correct it. We send: updateContent(changes: {id: 'A', pos: 4, val: 'o'}) to make it hello world. Since it's a CmRDT (operation-based), we'll never have the full state, rather just a transform. That means our mutation will have to adjust the db with just this info. Then, we forward the operational transform on to the client & trust that the client knows how to apply it. Since the updateContent mutation can never change which docs are returned by getTop5Posts, our bumpFn is easy:

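// idArr: ids of the docs the channel currently holds
(idArr) => (mutatedDoc) => {
  // only forward the transform if the channel has this doc
  if (idArr.includes(mutatedDoc.id)) {
    return {
      transform: mutatedDoc
    }
  }
  // otherwise return undefined: nothing to send
}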
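To make the "trust that the client knows how to do it" part concrete, here is a rough sketch of the client side. It assumes `pos` is a zero-based insertion index and `val` is the text to insert; `applyInsert` and `localDocs` are hypothetical names, not part of cashay.

```javascript
// Hypothetical client-side handler: inserts `val` at index `pos`.
const applyInsert = (content, { pos, val }) =>
  content.slice(0, pos) + val + content.slice(pos);

// The bumpFn from above, bound to a name so it can be called.
const bumpFn = (idArr) => (mutatedDoc) => {
  if (idArr.includes(mutatedDoc.id)) {
    return { transform: mutatedDoc };
  }
};

// What the client has cached, keyed by doc id.
const localDocs = { A: 'hell world' };

// The server only ever sees the change, never the full content.
const change = { id: 'A', pos: 4, val: 'o' };
const transformMsg = bumpFn(['A', 'B'])(change);

// The client applies the forwarded transform to its own copy.
if (transformMsg) {
  const { id } = transformMsg.transform;
  localDocs[id] = applyInsert(localDocs[id], transformMsg.transform);
}
```

After this runs, `localDocs.A` is `'hello world'`. A channel whose `idArr` doesn't contain `'A'` gets `undefined` back and sends nothing.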

For super fine-grained performance tweaking, we could consider establishing a discrete channel just for that field: content/content123, but that would be very application specific & could result in a net performance loss.

A fringe benefit of all of these things is that we don't necessarily need a websocket between the client and the server. For example, I can take the return values of the bump functions and store them away in a key/value store under the JWT. Then, when the client long-polls for updates, I just send the array of changes. That means in 1 network request, they get a whole bunch of fresh new info without having to request it from each individual query.
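A rough sketch of that long-polling idea, with an in-memory `Map` standing in for the key/value store. `pendingChanges`, `queueChange`, `drainChanges` and the `'jwt-abc'` token are all hypothetical names for illustration.

```javascript
// In-memory stand-in for a key/value store such as redis.
// Keys are client tokens (e.g. the JWT), values are arrays of changes.
const pendingChanges = new Map();

// Store a bump function's return value under the client's token.
const queueChange = (token, change) => {
  const queue = pendingChanges.get(token) || [];
  queue.push(change);
  pendingChanges.set(token, queue);
};

// Long-poll handler: hand back everything queued, then clear the queue.
const drainChanges = (token) => {
  const queue = pendingChanges.get(token) || [];
  pendingChanges.delete(token);
  return queue;
};

queueChange('jwt-abc', { channel: 'getTop5Posts', bump: { id: 'F', votes: 20 } });
queueChange('jwt-abc', { channel: 'content', transform: { id: 'A', pos: 4, val: 'o' } });

const firstPoll = drainChanges('jwt-abc');  // both changes in one response
const secondPoll = drainChanges('jwt-abc'); // nothing new since the last poll
```

The client gets changes from two different queries in a single response, and a poll with nothing queued comes back empty.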

mattkrick commented 7 years ago

additional thought: suppose each query can take in 2 additional args (ids and updatedAt in the example below):

With these 2 things, we can greatly reduce the network payload. For example, I subscribe to team members. Then I unsubscribe, then I subscribe again like: teamMembers(teamId: 'team123', ids: ['A', 'B', 'C'], updatedAt: Yesterday) Now, I run the query. When it resolves from the DB, I get something like this:

const teamMembers = [
  {
    id: 'A',
    updatedAt: 'last week',
    name: 'matt'
  },
  {
    id: 'B',
    updatedAt: 'today',
    name: 'jordan'
  },
  {
    id: 'D',
    updatedAt: 'last week'
  }
]

First, we intersect the result with the ids. On the left side (only in the DB result), we have D. On the right side (only in the ids the client sent), we have C. In the intersection, we have A,B. Within that intersection, we see that A hasn't been updated for a week, so we exclude it. B has been updated since the client last saw it, so we need to include it. So, we return a result like:

return {
  removeDocId: 'C',
  addDoc: {
    id: 'D',
    updatedAt: 'last week'
  },
  updateDoc: {
    id: 'B',
    updatedAt: 'today',
    name: 'jordan'
  },
};
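One way the intersect-and-filter step could be written, as a sketch only. It assumes `updatedAt` is a numeric timestamp and returns arrays (`removeDocIds`, `addDocs`, `updateDocs`) rather than the single-value fields above, so it can handle more than one doc per bucket; `diffAgainstClient` and the dates are made up.

```javascript
// Hypothetical helper: compare the DB result with what the client says it has.
// docs: DB result; ids: doc ids the client holds; updatedAt: when it last synced.
const diffAgainstClient = (docs, ids, updatedAt) => {
  const known = new Set(ids);
  const returned = new Set(docs.map((doc) => doc.id));
  return {
    // client has it, DB no longer returns it
    removeDocIds: ids.filter((id) => !returned.has(id)),
    // DB returns it, client has never seen it
    addDocs: docs.filter((doc) => !known.has(doc.id)),
    // both sides have it, but it changed after the client's last sync
    updateDocs: docs.filter((doc) => known.has(doc.id) && doc.updatedAt > updatedAt),
  };
};

const DAY = 24 * 60 * 60 * 1000;
const today = Date.UTC(2017, 0, 8); // arbitrary "today" for the example
const teamMembersResult = [
  { id: 'A', updatedAt: today - 7 * DAY, name: 'matt' },
  { id: 'B', updatedAt: today, name: 'jordan' },
  { id: 'D', updatedAt: today - 7 * DAY },
];

// Client holds A, B, C and last synced yesterday.
const teamDiff = diffAgainstClient(teamMembersResult, ['A', 'B', 'C'], today - DAY);

// Same client after applying the diff: holds A, B, D and synced today.
const refreshDiff = diffAgainstClient(teamMembersResult, ['A', 'B', 'D'], today);
```

`teamDiff` removes C, adds D and updates B, matching the result above. `refreshDiff` has three empty arrays, so there is nothing to send.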

Now, let's assume we cache this locally & then the user refreshes the page. If nothing has changed in the meantime, the server doesn't even need to reply!

dustinfarris commented 7 years ago

This looks freakin awesome. I love the diffs.