France-ioi / AlgoreaBackend

Backend for the new Algorea platform
MIT License

Make sure the HTTP call is never made inside a database transaction. #1099

Open GeoffreyHuck opened 3 weeks ago

GeoffreyHuck commented 3 weeks ago

fixes #1081

This is because we want to keep transactions fast.

Now, we call the endpoint once the current transaction ends.

Tricky part: How to pass the config endpoint to the DataStore

That was the trickiest part.

The first question is: does the DataStore need to know how to schedule the propagation? I chose "yes", both because it was easier and because, since we run the propagation right after a transaction, it makes sense for transaction-aware code to handle it.

Also, InTransaction is called in many places, and it wouldn't make sense to pass the endpoint as a parameter to it, because most transactions don't involve propagation at all.

What was done: the request Context was already passed to the DataStore, but it wasn't used. So the endpoint is now stored in the request Context by a middleware and retrieved by the DataStore. It's the easiest way I found.

Alternatives

  1. Make the configuration available globally. It's always tempting, but if we start like this we might end up with more and more globals.
  2. Have another method called InTransactionWithPropagation, with the endpoint as a parameter, which does the same as InTransaction but also triggers a propagation. The downside is that we would have to know which one to use at the start of the transaction. It seems to me that the propagation is a notion that appears at a lower level than the service.
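To make alternative 2 concrete, here is a hypothetical sketch of what the two methods might look like side by side (these signatures are illustrative only, not the real DataStore API); the caller would have to pick the right variant up front:

```go
package main

import "fmt"

// DataStore is a stand-in for the real type.
type DataStore struct{}

// InTransaction runs fn inside a transaction (stubbed here).
func (s *DataStore) InTransaction(fn func(*DataStore) error) error {
	return fn(s)
}

// InTransactionWithPropagation would do the same, then call the given
// endpoint after the transaction ends. Every call site would need to
// decide in advance which of the two variants it needs, which is the
// downside noted above.
func (s *DataStore) InTransactionWithPropagation(endpoint string, fn func(*DataStore) error) error {
	if err := fn(s); err != nil {
		return err
	}
	fmt.Println("would call", endpoint, "after the transaction")
	return nil
}

func main() {
	s := &DataStore{}
	_ = s.InTransactionWithPropagation("https://example.com/propagate", func(*DataStore) error { return nil })
}
```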

Make sure the propagation types are unique when we call the endpoint

We also deduplicate the propagation types that are sent as a parameter when we call the propagation endpoint. This isn't strictly necessary, everything would work without it, but it seems cleaner.
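A minimal sketch of such a deduplication step (the function name and the use of plain strings are assumptions for illustration):

```go
package main

import (
	"fmt"
	"sort"
)

// uniquePropagationTypes removes duplicates from the list of propagation
// types before they are sent to the propagation endpoint.
func uniquePropagationTypes(types []string) []string {
	seen := make(map[string]struct{}, len(types))
	unique := make([]string, 0, len(types))
	for _, t := range types {
		if _, ok := seen[t]; !ok {
			seen[t] = struct{}{}
			unique = append(unique, t)
		}
	}
	sort.Strings(unique) // a stable order keeps the endpoint calls deterministic
	return unique
}

func main() {
	fmt.Println(uniquePropagationTypes([]string{"results", "permissions", "results"}))
	// → [permissions results]
}
```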

Review

Easier to review commit by commit. Details in the commit messages.

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 100.00%. Comparing base (4d1012c) to head (dd140bf). Report is 22 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##            master    #1099    +/-  ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files          239      239
  Lines        14462    14480       +18
=========================================
+ Hits         14462    14480       +18
```


smadbe commented 3 weeks ago

I would prefer that @zenovich reviews this one.

zenovich commented 3 weeks ago

> The first question is: does the DataStore need to know how to schedule the propagation? I chose "yes", both because it was easier and because, since we run the propagation right after a transaction, it makes sense for transaction-aware code to handle it.

I still believe that the answer is "no" as it was discussed days before on Slack. The reason is that DataStore is just a clever DB interface decorator. But probably we could rethink the meaning of DataStore.

And actually I need to study this a bit more from scratch. From my perspective (as it looked two years ago), mutating operations on the DB left it in an inconsistent state (because of the operations themselves and also because of MySQL triggers), and the propagations fixed that by bringing the DB back into a consistent state. That was the reason for calling the propagations in the same transaction as the mutating operations.

So, first I would like to study the current situation.

@smadbe, is it an urgent task?

zenovich commented 2 weeks ago

The main thing: if we schedule a propagation outside of a transaction, we cannot guarantee that data modifications already committed to the DB will be followed by a corresponding scheduled propagation. Indeed, consider a situation where, for instance, an AWS Lambda function processing a user request gets killed by a timeout right after committing the data modifications to the DB, but before it has called another AWS Lambda function to trigger the propagation.
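The failure window being described can be sketched as follows (the functions are stubs standing in for the real commit and the HTTP call, not actual project code):

```go
package main

import (
	"errors"
	"fmt"
)

// Stubs for illustration: commitTx stands in for the real DB commit,
// and callPropagationEndpoint for the HTTP call made afterwards. Here
// the call "fails" to simulate the process being killed.
func commitTx() error                { return nil }
func callPropagationEndpoint() error { return errors.New("process killed before the call") }

// handleRequest shows the ordering the comment warns about: once
// commitTx returns, the data change is durable, but nothing guarantees
// that callPropagationEndpoint ever runs (timeout, crash, OOM kill).
func handleRequest() error {
	if err := commitTx(); err != nil {
		return err // nothing committed, so nothing needs propagating: safe
	}
	// crash window: dying here leaves committed data with no propagation scheduled
	return callPropagationEndpoint()
}

func main() {
	fmt.Println(handleRequest())
}
```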

Such situations are really dangerous and hard to debug. They will definitely lead to production failures and DB inconsistency issues.

I would think 100500 times before doing this and would ultimately look for another approach.