plone / guillotina

Python AsyncIO data API to manage billions of resources
https://guillotina.readthedocs.io/en/latest/

Review current conflicts resolution logic #1159

Open masipcat opened 2 years ago

masipcat commented 2 years ago

I was reviewing this part and I don't understand why the resolve_readcommitted strategy (the default one) needs the tpc_vote logic from the resolve strategy.

If I'm not wrong, these are the cases we are trying to mitigate:

  1. Cache invalidation race condition: an object is modified, but the next request that wants to modify it reads the stale object. During the write to storage, we want to detect that the old txn id is no longer the latest one in postgres. This is already implemented in the UPDATE sql, using the WHERE tid = $7::bigint. If the WHERE clause doesn't match, the object is not updated and we raise the ConflictError.

  2. Two concurrent requests modify the same object "at the same time". "At the same time" could mean two things:

     a. Both requests read the same object; the first starts the pg transaction, writes its changes, and finishes. When the second transaction starts, it will try to update the object and hit the error from point 1: the old TID doesn't match and the ConflictError is raised.

     b. Both requests read the same object; the first request starts its transaction and writes the changes. At the same time, before the first pg transaction is committed, the second request starts a transaction and tries to write the same object. In that case two things could happen:

        1. the same thing as in the previous points: TID mismatch -> Conflict.
        2. a deadlock, and both requests fail.

My conclusion is that in all these situations postgres and the SQL queries already handle the conflicts, and there is no need for the extra logic. The only case where I can see it being useful is CockroachDB.
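
For illustration, here is a minimal sketch of the TID check described in cases 1 and 2a. It is not guillotina's actual storage code: it uses the stdlib sqlite3 module as a stand-in for postgres, and the table layout, the write_object helper and the ConflictError class are all hypothetical names. It only shows the compare-and-swap idea behind `WHERE tid = ...`; it does not reproduce the row locking or deadlock behaviour of case 2b.

```python
import sqlite3


class ConflictError(Exception):
    """Hypothetical error: the stored tid is not the tid the writer read."""


def write_object(conn, zoid, new_state, old_tid, new_tid):
    # Compare-and-swap: the row is only updated if its tid is still the
    # one this writer saw when it read the object.
    cur = conn.execute(
        "UPDATE objects SET state = ?, tid = ? WHERE zoid = ? AND tid = ?",
        (new_state, new_tid, zoid, old_tid),
    )
    if cur.rowcount == 0:
        # No row matched the WHERE clause: someone else wrote first.
        raise ConflictError(f"{zoid}: expected tid {old_tid}")


# sqlite3 in-memory database used only as a stand-in for postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (zoid TEXT PRIMARY KEY, tid INTEGER, state TEXT)")
conn.execute("INSERT INTO objects VALUES ('obj1', 1, 'v1')")

# Two requests read the same object while it is at tid 1.
tid_seen_by_a = 1
tid_seen_by_b = 1

# Request A writes first and moves the object to tid 2.
write_object(conn, "obj1", "from A", tid_seen_by_a, 2)

# Request B still believes the object is at tid 1, so its UPDATE matches
# no row and the conflict is detected by the query itself.
try:
    write_object(conn, "obj1", "from B", tid_seen_by_b, 3)
    conflict = False
except ConflictError:
    conflict = True

row = conn.execute("SELECT tid, state FROM objects WHERE zoid = 'obj1'").fetchone()
```

In this sketch the stale writer is rejected purely by the UPDATE matching zero rows, with no separate vote step.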

./cc @bloodbare @vangheem

_Originally posted by @masipcat in https://github.com/plone/guillotina/pull/1154#discussion_r748486685_