laravel / ideas

Issues board used for Laravel internals discussions.
939 stars 28 forks source link

Implement new trait that enforces version check upon saving modifications to a model #1493

Open denjaland opened 5 years ago

denjaland commented 5 years ago

Hi,

It's a typical use case, and I know how to solve it, but frankly, I have no idea how I can solve it in Laravel without having to hack too often. I've also asked in chatd and on stackoverflow, but there is no good solution for the following in my opinion.

Let's consider you do the following:

$model1 = App\SomeModel::find(1);
$model2 = App\SomeModel::find(1);

$model1->some_attribute = 'new value 1';
$model1->save();

$model2->some_attribute = 'new value 2';
$model2->save();

The save of $model2 is currently being accepted by Eloquent, but it is actually updating a version of the model that is in fact no longer valid, because the model was already updated by the save on $model1. (these objects fetched the same model).

I can hear you think yeah, but why would you ever do that? Well basically - as soon as you have two processes that can work on the same model, you run this risk. No matter whether these two processes are triggered by two users, or by jobs that have been queued, or... For an enterprise application, it would be unacceptable to allow $model2 to be saved, as it saves information based on an OLD version of the model.

Let's make a concrete example. Consider that we have a system where customers can order items on the website, but we don't want to send out an invoice everytime something is ordered. We might solve it by creating a model Invoice and another model InvoiceItem which are both linked to the customer through a customer_id.

When a customer orders a product or service, we are creating an invoice item but leave _invoiceid NULL. That way we know that every month, we just collect all the items where invoice_id IS NULL, we create a new Invoice for them, and update the invoice_id on the items. Makes perfect sense, right?

The way I usually solve something like this, is that I have a Scheduled Job that basically fetches all customers that have uninvoiced invoice items, and queue a job to invoice that customer. It may be a bad example, but let's say that that Scheduled job runs every hour or so.

If you have a LOT of data, it may be that by the time you run that schedule again, there might in fact already by a job for that particular customer, so it's perfectly possible that you are queueing a second job for that same customer.

There is no way that you can avoid that.

Anyway - it's just an example, but consider that these two jobs are being processed by parallel processes at the exact same time; you do not want the second process being able to process these invoice items when it has meanwhile already been linked to an invoice, because that would really screw up your data.

This is a common use case in many enterprise applications, and is easily solved, but it seems a bit hard to achieve in Laravel without hacking too much, hence the request to consider making this a framework feature.

All we need is a new attribute on any model, $version and check whether it's still consisten when we save it.

So basically, if we were to include the new trait requiresSameVersionUpdate, the $model->performUpdate() should do the following:

UPDATE table
SET version = version + 1, attr1 = value1, ...
WHERE version = $model->version 
AND id = $model->id

If the database returns 'no records updated', check if there is a record for the given ID, if so, we know the ersions did not match, and we should raise the exception 'ModelAlreadyUpdatedByOtherProcess`

After the successful update, make sure to set $model->version also to the new version, so we can make subsequent modifications, and saves.

I'm sure many of the readers here have ran into the same issues, and I simply haven't found an elegant solution in Laravel to implement this.

Working liks this is the only 'good' way to make sure that what you are updating corresponds to what the database (the only truth) currently holds; if the versions no longer match, it means another process already updated the model after you fetched it from the database. Since you current version is no longer actual, the changes you made might no longer apply; hence we want the exception to be thrown allowing us to ROLLBACK any transaction we were running.

I hope this makes sense.

For your reference, here is my SO question ;https://stackoverflow.com/questions/54260692/best-way-to-prevent-simultaneous-model-mutations-in-laravel

martinbean commented 5 years ago

@denjaland Sounds like a nice use case for a Versionable trait. It’s implementation could be as simple as adding the updated_at timestamp as a where clause, so only a matching row is updated. If no rows are touched, it could be assumed that the row was out of date.

The issues I see are:

If these points could be addressed, then I could quite easily see a naïve implementation make its way into the framework.

denjaland commented 5 years ago

Updated_at would be a nice starter, but I would prefer to have a seperate increasing attribute. i’ve seen bugs in banking systems because of simultaneous updates at the same timestamp, so the separate version would be a lot better.

I agree we need a separate; I’ve also been looking at the implementation, and as you said, updated_at is set in code and then saved, so I guess I’m a bit lost as to where I could plug this in.

I believe it would be a nice feature for “Laravel Enterprise”; if you build enterprise apps, you can’t live without such feature imo :-)

martinbean commented 5 years ago

@denjaland Using the updated_at timestamp would just be a simple, naïve solution. If the functionality is offered as a trait, then the relevant methods could be override to use a different column/strategy.

This can also be done at the client level. A client could GET a resource (and its ETag), then re-submit that ETag value when updating a resource. If the resource has a different ETag, then the client should receive an error and can assume the resource was modified in between fetching and updating that resource.

sisve commented 5 years ago

This is called optimistic locking. The optimistic part comes from the belief that everyone that interacts with the database will honor the locking system in place and promise to play nice. (There's also pessimistic locking that uses actual database locks.)

There are already packages that provides this in a trait form.

https://github.com/reshadman/laravel-optimistic-locking

denjaland commented 5 years ago

@sisve I just checked that package / trait, and that is exactly what I wanted to do. Thanks for that link. The only concern I have with that, and that is mainly the reason for raising the question everywhere, is that it feels like it's a real hack of Eloquent; they are overriding the performUpdatemethod, which in fact the only way that I saw possible as well...

I always try to avoid that to avoid risking issues when upgrading laravel framework, but yes - that is exactly what I was looking for, so maybe I shouldn't be that worried about framework upgrades...

deleugpn commented 5 years ago

Im sorry, maybe I didn't understand correctly, but I saw no reason to reach for a complex locking system. Your scheduler can easily perform an update on all invoice items that has been scheduled so they don't get picked up by the next scheduler. Just add a nullable scheduled_at column at your scheduler and you'll be able to distinguish what to update and what not to update.

denjaland commented 5 years ago

You didn't understand. You can have two processes runnig simultaneously. Or you could have a user manually process the invoices while your job is running, or... There are so many use cases where this is by definition going to go wrong...

Namoshek commented 5 years ago

@denjaland I agree that you shouldn't necessarily override framework functions if it is avoidable. In my opinion, this is one of the situations where it can be avoided but only with some downsides.

The suggested package adds a where clause to the update query which ensures we only update the record if it is still the version we retrieved. This seems like a good approach to me because it only yields one query.

An alternative is to use the Eloquent saving event to perform the version check independently. But this requires locking of the table and adds a second query to retrieve the current version. So from a performance perspective, it is worse. In a scenario where only users could update a resource concurrently, it seems fine though (because the amount of updates is quite low anyway). Conceptually, I'd still go with the suggested approach.

denjaland commented 5 years ago

@Namoshek yeah - I'm torn between choosing for a "potential performance hit" and a "potential bug with any new laravel release" which is basically my primary reason to raise it as an "idea" to be implemented withing the laravel eloquent core itself. To me, this is vital logic that any framework should - somehow - support.

I agree though that I will be implementing it very similar to the mentionned package, which is in fact what I came up with as well.

Thanks for your feedback! And still crossing my fingers it'll one day make it into the laravel core ;-)