pkp / pkp-lib

The library used by PKP's applications OJS, OMP and OPS, open source software for scholarly publishing.
https://pkp.sfu.ca
GNU General Public License v3.0
304 stars 444 forks source link

Standardize transaction isolation #10275

Open jonasraoni opened 2 months ago

jonasraoni commented 2 months ago

Describe the bug Laravel doesn't setup a default transaction isolation, then things might behave differently depending on the default isolation set at the database configuration.

The default transaction isolation of MySQL is repeatable read, while PostgreSQL uses read committed. A sysadmin can also change the defaults, even though it's unlikely to happen, we shouldn't trust on it.

We should go with the read committed, it represents better what's happening at the database, is more performatic (the database won't have to manage too much data/issue extra locks), and by reducing the number of locks, the occurrence of deadlocks should also decrease.

The difference lies on the possibility of phantom reads, which is ok!

What application are you using? OJS 3.3

jonasraoni commented 2 months ago

While I was writing a discussion, I've noticed the isolation isn't stable. I'll update just the main branch. Anything against @asmecher?

asmecher commented 2 months ago

I'm ambivalent... I would prefer we hewed close to the defaults in general, and I don't think this solves a problem anyone's experienced.

jonasraoni commented 1 month ago

I think we need to enforce it for the sake of predictability. In theory changing the default of MySQL should improve the performance when under higher concurrency. The market standard is to use the read committed isolation (MySQL is an outlier).

In one of the discussions, I proposed an idea that would need a specific isolation to work properly (otherwise the idea would work fine in PostgreSQL, but not in MySQL).

asmecher commented 1 month ago

@jonasraoni, what's the feature that requires isolation?

jonasraoni commented 1 month ago

Ah, I've left a note on the "Drawbacks" of https://github.com/pkp/pkp-lib/discussions/10274#:~:text=on%20your%20installation%22).-,Drawbacks,-Linking%20to%20a. If the system is under a transaction, and you want to have another independent operation(s), then you need a separate transaction/connection. And if you want both to see each other committed changes, a read committed is needed.

IMHO the behavior of repeatable read produces a fake reality (the data is on the database, already committed, but you may or may not see it, depending on your previous queries) and it has extra costs/locks.

The other point that I care, is to have an environment as standard as possible. So, we should either use read committed or repeatable read, but not leave it to fate.

asmecher commented 1 month ago

We haven't started using transactions yet (and we're not ready to start just yet -- there's a relevant discussion in the Development channel in Mattermost from 2024-07-31) and https://github.com/pkp/pkp-lib/discussions/10274 isn't on the roadmap yet, so I don't think this change is motivated just yet. I am motivated by a reduction in locks, but that won't materialize until we start using transactions. Could you turn this into a discussion and scope it up to be a general capture on transactions?