Before/after-execute query callbacks

mikemintz commented 9 years ago

It would be nice if developers could add before-execute and after-execute callback functions to their queries in the whitelist. Some use cases:

Send push notifications after a row is inserted
Record analytics for a given action
Record some state in the user's session object
Modify a query before inserting into database (e.g. password hashing could happen in before-execute instead of separate ajax call).

Technically, before-execute is already possible using the .validate() method. Are there use cases where we'd want a separate before-execute callback with different semantics? Some thoughts:

The .validate() function must return a truthy value or the query will be rejected. Having non-validation callbacks end in return true might clutter things or lead to bugs.
Should we be able to modify the query before forwarding it to RethinkDB? Would there be issues if rethinkdb-websocket-client got results for a query different from what it thought it executed?

For after-execute, there are two possible variants: one that gets called right after the query is sent to RethinkDB, and another that gets called right after the query response is received from RethinkDB. The former is trivial to implement, and the latter requires parsing RethinkDB responses and tracking query tokens.

bkniffler commented 9 years ago

Hey, that would be a great addition. Maybe we could attach these callbacks to either one, multiple or all table calls.. some kind of express-like syntax:

wsListen.beforeInsert("*", function(req, next){ console.log(req.table, req.data); next(); })
wsListen.afterInsert("*", function(req, next){ console.log(req.table, req.data); next(); })

I understand that the current whitelist syntax replicates the queries from clients and makes for a rock-solid security, but maybe we could have the option to not only rely on a relatively strict whitelist, but also on middlewares. Its just an idea, I got to say I still like the idea of query whitelists, but I also feel it will not be able to handle the requirements of a bigger data-driven application.

wsListen.beforeInsert("*", function(req, next){ if(req.data.isNew) req.data.userId = req.session.userId; next(); })
wsListen.beforeFilter("*", function(req, next){ req.query.userId = req.session.userId; next(); })

I do not want to force the express syntax into your project, but it might illustrate my idea and it's gotten kind of a common way to handle queries. Maybe you have an idea how to implement it more elegantly?

I'd really hope for a more modular and flexible approach.

mikemintz commented 9 years ago

Definitely, I think it would be great to define behaviors for groups of queries. If large groups of queries share the same handling code, and the developer wants to be able to quickly change the handlers (e.g. add logging without changing 100 whitelist entries), it would be nice to do so globally.

Adding this middleware concept should probably happen after implementing callbacks, since users will want to define per-query callbacks too (e.g. push notifications).

You allude to some other interesting points too. In your example, you can determine query types (insert vs filter) and table name. Determining query types could be challenging since rethinkdb has so many different operators, and extracting the table name appears to be impossible for a before callback, e.g.:

r.table('items').forEach(function (x) {return r.table(x('name')).insert({foo: 'bar'})})

I think the only way to reliably provide this information is to restrict the allowed queries to those with a simple structure, which I think is meteor's approach. But then we lose a lot of the advantages of RethinkDB. This came up in https://github.com/mikemintz/rethinkdb-websocket-server/issues/1 too, so it's probably going to be an ongoing conversation.

Of course, the application developer could restrict allowed queries to a simple structure, and still generalize a lot of queries to one entry, like this:

RQ(RQ.INSERT(
  RQ.TABLE(RQ.ref('table')),
  RQ.ref('data')
))

Your example also uses next() instead of return true. I think something like next() would be nice if it's common for callbacks to modify the request object before passing it along, but the return feels more natural to me otherwise.

bkniffler commented 9 years ago

Well, return would be more natural indeed, as long as there isn't any async stuff going on (e.g. validate against other objects in database).

Unfortunately, intelligently designing a proper API isn't within my scope, I'm not yet experienced with rethinkDB. I hope to find the time to learn a bit more about it in the next couple of days to be able to give some valuable input.

My examples were kinda naive. What I would wish from a future API is:

create hooks/callbacks for specific tables and query types
access session infos
interrupt a query with errors
change a query before execution and/or change the results
allow for async operations
maybe attach and detach hooks/callbacks at any time

mikemintz commented 9 years ago

Well, return would be more natural indeed, as long as there isn't any async stuff going on (e.g. validate against other objects in database).

We could have query callbacks return either a boolean or a Promise for a boolean, like .validate() does now. That allows for async operations. I think that's nicer than passing the next callback around, unless there are any other benefits to it.

Those ideas for the API sound quite good. The "specific tables and query types" stuff will be complicated like I alluded to before, but they'd be nice to have if feasible.

Changing a query before execution shouldn't be too hard, but changing the results will be a much bigger refactor since we currently don't do any processing of results. Can you think of a use case where a developer would benefit from changing the results on the server-side that couldn't be resolved with a rethinkdb query transformation like map?

mikemintz / rethinkdb-websocket-server

Before/after-execute query callbacks #2