postmanlabs / sails-mysql-transactions

sails/waterline ORM with mySQL transaction support
Apache License 2.0
59 stars 20 forks source link

[reflection] [idea] #26

Open mikermcneil opened 8 years ago

mikermcneil commented 8 years ago

First of all, this module is awesome; amazing work (particularly the depth you went to w/ making instance methods do their thing when amidst a transaction).

I wanted to leave a note here with some reflections on the history and future of transactions in WL core, and also share an idea for some syntactic sugar based on what I learned from WL2.

Transactions in WL core

Bad news first: We will probably never have true atomic, durable, consistent, and isolated cross-adapter transactions in Waterline core.

Aside: I've always dreamed of having cross-adapter transactions in WL core. I spent a bunch of time messing with this waaaaaay back when I was first building Waterline. In fact, that was almost exactly 3 years ago to the day (Christmas 2012). As it turns out, trying to implement cross-adapter transactions is a nightmare and basically like writing your own database. Nevertheless, for some reason I spent the time to get the "CI" in "ACID" working-- but I never got around to implementing inverse operations in order to be able to roll back-- which we would need to form a truly crash-resistant (read "durable") commit log). Long story short: it is an interesting exercise, but ultimately a bad idea.

The good news is that we will definitely implement transactions across the same connection to the same database. The question is just when-- and you guys have done an amazing job hastening that along.

An idea re: syntax

The #1 thing I've always been worried about w/ transactions is forgetting to rollback under rare error conditions (i.e. that never get tested). The #2 thing has always been forgetting to use the proper transaction instance and accidentally using models directly.

When I was actively working on WL2 last summer, the simplest possible usage I could come up with which also addressed both issues (at least to some degree) was something like this:

orm.transaction([User, Kitten], function (User, Kitten, cb) {
  Kitten.create({ name: req.param('kittenName') }).exec(function (err, babyKitten){
    if (err) {
      return cb(err); // << that means rollback
    }
    User.update({ id: req.session.userId }).set({ petKitten: babyKitten.id }).exec(function (err, updatedUsers) {
      if (err)  {
        return cb(err);  // << that means rollback
      }
      if (updatedUsers.length === 0) {
        err = new Error('Consistency violation: Could no longer find currently logged in user in database!');
        err.code = 'loggedInUserMissingFromDb';
        return cb(err);  // << that means rollback
      }
      return cb(undefined, updatedUsers[0]);  // << that means commit  (so would `cb()`)
      // (second arg is optional, but if you pass it in, it'll show up as the 2nd argument
      //  of the callback below)
    });
  });
}).exec(function (err, data) {
  if (err) {
    // if `err` is truthy, then there was an error, and the transaction was
    // already rolled back by the time we make it here.
    return res.negotiate(err);
  }

  // if not, there were no errors, and the transaction was safely committed
  // by the time we make it here!
  return res.json({
    username: data.username,
    kittenId: data.petKitten
  });
});

Below, I put together a hypothetical remapping of the example in the README of this module if it was wrapped up in something like this (in addition to .start(), not instead of it). Please take it with a grain of salt-- I'm not saying I think it should be this way in this module necessarily (it's more in-line with what we'd need in core Waterline anyway-- i.e. triggering a usage error if the models are not part of the same database connection). I just want to make sure and share this thinking with you guys in case its helpful or spurs any ideas for folding transaction support into Sails/Waterline core.

Here goes nothing:

var Transaction = require('sails-mysql-transactions').Transaction;

// Attempt a database transaction
Transaction.attempt([OneModel, AnotherModel], function (OneModel, AnotherModel, cb) {
  OneModel.create(req.params.all(), function (err, someRecord) {
    if (err) {
      return cb(err);
    }
    AnotherModel.findOne(req.param('id')).exec(function (err, anotherRecord) {
      if (err) {
        return cb(err);
      }
      someRecord.someAssociation.remove(req.param('remove_id'));
      someRecord.save(function (err, savedRecord){
        if (err) {
          return cb(err);
        }
        return cb({
          savedRecord: savedRecord
          anotherRecord: anotherRecord
        });
      });
    });
  });
}).exec(function (err, data) {
  if (err) {
    // This means that either something explicitly passed back an error in the provided function above,
    // or the function never ran because a transaction could not be started, or that the transaction was
    // otherwise successful but could not be committed. If we're here, we know the transaction has
    // been rolled back successfully.
    return res.serverError(err);
  }

  // Otherwise, we know everything worked and that the transaction was committed.
  return res.json({
    one: data.savedModel,
    another: data.anotherInstance
  });
});
shamasis commented 8 years ago

Really interesting thoughts. Personally, I am glad that we are finally discussing on these. We could really use your help and insight into taking this further forward.

These are a few thoughts from our notes:

  1. One use case that has been cropping up (the reason for making transaction as a reference object in our case) - doing transactions across multiple services and model functions. Otherwise, monolithic transactional operations become unmanageable.
  2. Having the knowledge of all models while starting a transactions inhibits reusability of code and at times, is not feasible if the decision to choose model depends on one of the db results within the same transaction.

Like you mentioned, there is a chance with the present API that developer forgets to commit/rollback. The only counter we have for this is auto-rollback after a time interval. :-/

PS: I am putting some time to implement this new syntax. I love the fact that the proposed syntaxt totally abstracts the concept if commit and rollback!!

shamasis commented 8 years ago

Thoughts on our present implementation of transaction.

The majority of transaction logic is written within this adapter. We wanted to have the API handled at the waterline level by adding additional methods for adapter interface. However, that became very tedious as it was super difficult to carry forward any information from userland to the adapterland due to the number of deep clones that happen during the information flow.

shamasis commented 8 years ago

Things we intend to do next:

  1. Support select query all over the place. This is becoming challenging with my limited understanding of waterline internals.
  2. Adding .countOf to procure the count of associations without making select queries.
  3. Adding support for transactions across multiple sources. This we are doing by keeping track of transaction per source level and when one transaction fails, rollback all others and once one is committed, commit all others. (downside, all transactions do not start at the same time, nevertheless, the intended behaviour is achieved.)
mikermcneil commented 8 years ago

Sam thanks for the detailed follow up. When you say "across multiple sources", are you taking about the same db?

e.g.

  1. Let's say you have models called Farmer, Farm, and Tree in a MySQL database called "farming" on a db server in AWS.
  2. You also have models called Wookie and Spaceship in a MySQL database called "starfleet" on the same db server.
  3. You have models called Picture and Gallery in a MySQL database called "media" running on a different db server in a customer's data center. Just for kicks, let's say this db server is running a different version of MySQL.
  4. You have models called Architect and Building in a Postgres db called "construction" living on a separate database server hosted in a different region on AWS.
  5. Finally, you have models called Activity, PersonalMessage, and UserDataExportManifest living in a mongo database called "misc" on a hosted MongoLabs server.

Here's what I think we shouldn't try to solve in core right now (based on my experience trying the last time): A. any transactions involving any models in #5 (since native transactions are not supported in Mongo, so it would involve implementing a commit log) B. any transactions involving models in #4 AND models in #1, #2, or #3 (since transactions across database vendors, so it would involve implementing a commit log) C. any transactions involving models in #3 AND models in #1 or #2 (since transactions across database servers and versions, even from the same vendor, have spotty or non-existent support, so it would involve implementing a commit log) D. any transactions involving models in #1 AND models in #2 (but I could be convinced otherwise on this one-- its just that the way different databases natively implement cross-db transactions differs a bit by vendor, so it'd be cleaner to avoid it at the beginning)

( As you can see, what I'm getting at is that we don't want to have to roll a durable commit log :p )

Those caveats aside, I think it would totally be doable to support the following: W. any transactions involving one or more models in #1 X. any transactions involving one or more models in #2 Y. any transactions involving one or more models in #3 Z. any transactions involving one or more models in #4

You guys have gotten a good start on W, X, and Y in this module. To get to Z, we'll want to define standard functions (e.g. commit, rollback, and startTransaction) at the adapter level.

(cc @particlebanana)

mikermcneil commented 8 years ago

(Oops, didn't mean to close was trying to type in my phone sorry)

shamasis commented 8 years ago

How about one or more models between #1 #2 and #3? Since its same vendor and multiple servers, all we need to track is end commit/rollback status and not depend on vendor's native capabilities.

mikermcneil commented 8 years ago

@shamasis please, tell me more

mikermcneil commented 8 years ago

I think I see what you're saying but to clarify:

  1. begin transaction on db1
  2. Do a query
  3. begin transaction on db2
  4. Do a query
  5. If something looks wrong, rollback all.
  6. Do more queries
  7. Finally at the end, if it all worked, commit all

That totally makes sense, hadn't thought of it. Great idea! I think that even means we could do B, C, and D (since PostgreSQL supports transactions)

mikermcneil commented 8 years ago

The only issue I see is that I don't think the actual commit calls would be truly atomic-- eg if db1 committed but db2 didn't. In other words an unsuccessful commit would mean inconsistent data, whereas in a single-db setup it just means the db thinks that particular transaction never happened. I think we'd just need to be clear about that in docs, and it would be good

shamasis commented 8 years ago

Yes. I agree to your commit failure related inconsistency. I was of the opinion that it is a very rare case.

mikermcneil commented 8 years ago

I think so too.

shamasis commented 8 years ago

@mikermcneil - in our 0.7.0 we bettered support for select in waterline for root context criteria as well as strategy 1 and 2 joins. The changes are in waterline defer.js as well as waterline-sequel.

https://github.com/postmanlabs/waterline-sequel/commit/ce232382c803f1dacead0cc0bb796019dc59e25e https://github.com/postmanlabs/waterline-sequel/commit/3be750d6b1290927c55051bd1e374393034a436f

Would it be an appropriate PR to waterline?

mikermcneil commented 8 years ago

I think so- let me loop in @particlebanana on that

mikermcneil commented 8 years ago

Only issue I'd see is that we'd need to clarify mongo support in the docs, but seeing as how projections aren't in the docs now anyways.. :p

shamasis commented 8 years ago

@particlebanana - what's your thought?

particlebanana commented 8 years ago

Hey @shamasis so the projection stuff that is there is completely by accident. I don't think anyone ever specifically added it, it works by overriding some stuff we do internally. The api is pretty nasty as well:

Model.find({ where: { age: { '<': 30 } }, select: ['name'] })

If you have stuff to bring this up to Deferred.js so it looks something like this:

User.find().select(['name', 'age']).exec(cb);

That would be cool. There is this we can target for 0.11.0 if you have anything to add in for joins.

Basically my goal is to re-write the "interchange" format between Waterline and the adapters so that it encodes the entire query rather than just the criteria. I have a start of what that may look like here. With that we can use something like Knex to build the query and pass it directly to mysql, postgresql, etc. So projections should be fully supported there. Chime in if you have anything specifically you would like to see.

mikermcneil commented 8 years ago

@shamasis we've got the beginnings of a plan hatched for this-- will follow up on Slack. In the mean time, I wanted to toss an idea for a tweak to the proposed usage out there to take into account what we talked about on Skype (not needing to know the models involved): :

// ...
foo: function (req, res){
  sails.transaction('prodMySQL', function (transaction, done){
    transaction.models.onemodel.find().exec(function(err, allRecords) {
      if (err) return done(err);
      return done(null, allRecords);
      // There would also be: `transaction.connection`
      // provided as the _actual_ raw database connection-- provided for lower-level usage
      // (but should not be closed, aborted, or committed manually-- calling `cb` does that)
    }, function (err, literallyAnything) {
     // (note that at this point, either way, we know the raw connection was released to the
     //   pool or never summoned in the first place)

      if (err) {
        // means the transaction was aborted or it never begun
        return res.negotiate(err);
      }

      // if `err` is falsy, then we know the transaction was committed. 

      // second arg is optional, but if you passed in a second arg to the callback,
      // its passed through here for convenience, e.g. so you can do
      return res.json(literallyAnything);
    });
  });
},
// ...

e.g. so basically just expose the atomic equivalent of the subset of sails.models that are associated with the specified "connection" (i.e. datastore) config. If our userland code tries to do anything naughty (e.g. specifies the identity of a Mongo datastore) then sails.transaction() triggers its callback with a pompous usage error lamenting the day it met us.

In addition to the above, summarized here:

...there'd be something more or less like the following methods:

So! The actual implementation of this is definitely one of the last steps in a multi-step project, but after talking to @particlebanana and @sgress454 about this on Friday, I'm starting to feel pretty good about how we get there. That said, before we're too far down this road, I want to check that what I posted above would be a workable interface for y'all's use case. (I realize you'd still need something above that for a ~transaction across multiple db connections, but I think, realistically given our timeline, that would need to live in a userland service for now)

shamasis commented 8 years ago

Apologies for this brief and delayed reply. I'm out of town till 27th with limited connectivity.

@particlebanana - with a specific .select in deferred.js how will one use projection within a .populate? This has come in very handy for us and we canNot imagine SailsJS in production without the projections.

Also, you are forcing the fluid style API instead of the callback model by doing this.

shamasis commented 8 years ago

@mikermcneil - this new proposal sounds super cool. Its future proof too!

mikermcneil commented 8 years ago

with a specific .select in deferred.js how will one use projection within a .populate?

That sounds doable for 1.0 to me unless Im missing something. @particlebanana I know it'd affect the operation builder (pretty sure it's covered in the integrator already, unless we ripped that out, cant remember offhand)

@shamasis awesome. We're all pumped about finally getting to a workable plan for this. Thanks for your help. Almost got that orm hook ready for you-- keep getting distracted with other documentation tasks.

sapnajindal commented 8 years ago

@shamasis I am having sails application interacting with the mysql database. I am using default ORM waterline and sails-mysql adapter. But i need mysql transactions support which sails-mysql doesnt support currently. I am planning to use sails-mysql-transactions instead of sails-mysql , Will it be a good idea to use it in prod env and is it in a stable form ? And also i have some time constraints, I wanted to know if it will be a lot of rework if i change the adapter?

shamasis commented 8 years ago

@sapnajindal - changing the adapter is simply changing the name of the adapter in connections.js, followed by updating package.json to install this adapter.

We're using this adapter in production under real heavy load. Also, note that SailsJS is also planning to introduce native transaction support in waterline very soon.

mikermcneil commented 8 years ago

Update:

// Fetches a preconfigured deferred object hooked up to the sails-mysql adapter
// (and consequently the appropriate driver)
sails.datastore('larrysDbCluster')

// Provided custom metadata is sent through (as a direct reference; i.e. no cloning)
// to ALL driver methods called as a result of the built-in transaction logic.
.meta({})

// This function will be run when the transaction begins.  If it breaks,
// the transaction will be rolled back automatically.
.transaction(function during (T, done) {
  Location.findOne({id: locationId})
    .usingConnection(T.connection)
    .exec(function (err, location) {
      if (err) {return done(err);}
      if (!location) {return done.notFound();}

      // Get all products at the location
      ProductOffering.find({location: locationId})
      .populate('productType')
      .usingConnection(T.connection)
      .exec(function(err, productOfferings) {
        if (err) {return done(err);}
        var mush = _.indexBy(productOfferings, 'id');
        return done(undefined, mush);
      });
    });
}).exec(/* … */);
mikermcneil commented 8 years ago

( btw for more on how this impacts MySQL in particular, see also https://github.com/mikermcneil/machinepack-mysql/blob/master/SYNTAX_NOTES.js )

sapnajindal commented 8 years ago

@shamasis Does this adapter create a transactionId field for all the models? And is it a mandatory field and we need to keep it in our database?

shamasis commented 8 years ago

Yes it auto creates the field. You can turn off auto creation and manually create the field by setting autoTK: false.

This is needed. Otherwise, there's now way to correlate queries that are generated during populates and other actions.

sapnajindal commented 8 years ago

@shamasis Is there any other change in sails-mysql-adapter with respect to sails-mysql ? We are currently using sails-mysql adapter, it would be great if you can share the impact of changing adapter from sails-mysql to sails-mysql-adapter.

shamasis commented 8 years ago

There are a number of changes. All of them are outlined in README. However there's no breaking change except the installation process.

sails-mysql-transactions does not work unless installed using the postinstall script. This is a limitation for which we have a fix and intend to work on it sometime next month.

sapnajindal commented 8 years ago

@shamasis I checked the Readme, which is saying we get an additional field when we do update for a model and it contains all the previous values. I am facing two problems here: -- Model.update({id:1},{name:'test'}).exec(function(err,updated,prev){})

If i run the above i get an array of json in updated updated : [ { id:1,name:'test' } ] prev : [{}]

but if i run Model.update({id:1},{name:'test'}).then(function(updated,prev){}).catch(function(err){})

updated :[ [ { id:1,name:'test' } ], [{}] ] prev : undefined

Why is this behavioural change if i use then instead of exec ? Its creating lot of complications for me.

-- and why is the prev always empty?

sapnajindal commented 8 years ago

@shamasis Is there any way i can use the transaction in models? I mean i am creating entry in one model using transactions, on the basis of this order another model is getting created, but there if some stuff which gets checked in before Create of the model.

As the transaction is not yet committed, in before create it is not able to find the linked other models id.

I want to use the same transaction in model which i started in controller. How do i do that?

sapnajindal commented 8 years ago

@shamasis I tried passing transaction object in values to the model like below

Transaction.start(function(err,transaction){
Model1.transact(transaction).create({name:'fd'},function(err, model1){
Model.transact(transaction).create({model1Id:, model1.id, transaction : transaction})
})
})

In Model.js

beforeCreate(values,callback){
Model1.transact(values.transaction).findOne({id: values.model1Id})
}

But by doing this i am getting an error in beforecreate

self._transaction.id is not a function at [object Object].util.extend.findOne (/home/vagrant/www/testProject/node_modules/sails-mysql-transactions/lib/mediator.js:139:36) at /home/vagrant/www/testProjecti/api/models/v1/Model.js:112:59 at bound (/home/vagrant/www/consumer-payments-api/node_modules/sails/node_modules/waterline/node_modules/lodash/dist/lodash.js:957:21)

shamasis commented 8 years ago

@sapnajindal - I will go through your comments in details and reply in a separate issue. This issue was not intended to be discussed what we are discussing here. The original issue creator might not be comfortable with the issue topic being hijacked.

sapnajindal commented 8 years ago

@shamasis Agreed! Hoping to get a reply soon as its blocking my work.

mikermcneil commented 7 years ago

Update! https://github.com/balderdashy/sails-docs/blob/7fbfaac9ee0d30f9e02384bf8bdee74e52b422f0/reference/waterline/datastores/transaction.md

Obviously I realize there's things you guys will need to do to be able to make the switch, but I just wanted to let you know the usage is in place and documented 👍

aptro commented 4 years ago

@shamasis is there a way we can support replica configuration for Postgres adapter.

@mikermcneil support for database replica configuration in sails/waterline's roadmap?