alakajam-team / alakajam

Website powering the Alakajam! game making community
https://alakajam.com/
Other
28 stars 13 forks source link

Enforce proper use of DB transactions / try using bookshelf-cls-transaction #269

Open mkalam-alami opened 6 years ago

mkalam-alami commented 6 years ago

~We're suffering a bit from node-sqlites limited transaction support, and it shouldn't steer us away from making good (and generous) use of transactions.~

~Let's see if we can integrate better-sqlite3 to solve this.~

~If this fails and we really can't find another solution... We might have to take a more radical decision.~


EDIT: Following @ttencate's post, the issue is actually more about enforcing proper use of transactions, ie. preventing to do non-transactional queries inside of a transaction. I'm renaming the issue accordingly.

ttencate commented 6 years ago

I don't know... generally, if you're inside a transaction, don't you want to do all your queries in that transaction, and anything that happens outside it is a bug? Of course it's a very sucky way to find out about such bugs (long timeout, no useful stack traces), but still, the bug needs to be fixed.

The root cause might be that db (the Bookshelf instance) is essentially mutable global state.

After thinking about this a bit, it seems to me we have three orthogonal issues to solve here:

1. Creating a transaction.

This is easy, using a small Express middleware.

2. Making the transaction available where it's needed.

We can publish it to controllers through an attribute of the req object, but how does it get to services (possibly several layers down)? At its simplest, we simply pass it into every service function, but this quickly gets tedious.

The Sequelize ORM has a solution, based on the continuation-local-storage package. We might be able to use that package as well.

I'm a bit unclear on how it works under the hood – how does it know, even from an asynchronous callback function, what the original context was in which the function was created? The most informative I could find was this presentation, and various suggestions that it uses Node magic whose documentation I can't find. Anyhow, if it works, it works.

3. Ensuring that the transaction is actually used everywhere.

This is the tricky part.

With plain Knex it would be straightforward. Transaction objects in Knex are full-featured Knex connection objects themselves. So we could exclusively use that transaction object, and forbid accessing db directly. The db.js module would not export the raw Knex connection anymore, but just a runInTransaction function, which accepts a function (transaction) callback.

With Bookshelf however, the model classes are created on the Bookshelf instance, and are not also attributes on the transaction objects (at least, no such thing is mentioned in the docs). In other words, I don't think you can write transaction.User.where(...).fetch(); you have to write models.User.where(...).fetch({ transacting: transaction }). If you forget the transacting bit, immediate SQLite mayhem and possible delayed PostgreSQL mayhem ensues.

This being JavaScript, I'm sure it's possible to monkey-patch that desired syntax in. This makes me slightly uneasy, but I think it's worth looking at.

A super big hammer alternative would be to switch from Knex + Bookshelf to Sequelize. It just seems more mature and well documented in general.

A lighter alternative is to drop Bookshelf and work with hand-rolled Knex queries instead.

mkalam-alami commented 6 years ago

I like the idea of a bookshelf plugin based on continuation-local-storage. It could simply handle automatically the passing of { transacting: transaction }. Migrating the existing code would be simple.

...There's actually an existing implementation, it's worth giving a try: pldin601/bookshelf-cls-transaction

ttencate commented 6 years ago

Awesome! Sounds like we can have our cake and eat it too :)

ttencate commented 6 years ago

As an experiment, I tried removing all transactioning code from the services, and wrap the entire samples insertion in a single transaction using bookshelf-cls-transaction. Yet I'm still getting this Timeout acquiring a connection error. I suspect that it's because we're doing raw knex stuff as well, and the plugin clearly doesn't handle that.

So I suppose we need an equivalent plugin or monkeypatch for Knex instead...

Edit: Or add the transaction manually to those cases. It's not officially exposed by bookshelf-cls-transaction, but we can find it on the CLS namespace bookshelf-sessions.

mkalam-alami commented 5 years ago

Following a Bookshelf upgrade, we encountered some 500 errors due to Bookshelf. The db.transaction((t) => ...) now requires the callback to return a promise that resolves when ALL the queries using it have completed.

If we don't it now closes the transaction and triggers a Transaction query already complete failure if some query still has to execute. That's something to be aware of if we implement this issue someday.

(Note: the fix was 159b7d25, we're now adding awaits everywhere. Not sure if that behaves differently performance-wise compared to trying to parallelize queries more)