JetBrains / Exposed

Kotlin SQL Framework
http://jetbrains.github.io/Exposed/
Apache License 2.0
8.37k stars 694 forks source link

`Select For Update` returns cached entity instead of updated entity. #1527

Open yangineer opened 2 years ago

yangineer commented 2 years ago

In Postgres where the transaction is in Read Committed Isolation Level, when Select For Update, the client should receive the updated version of the row if the row is locked.

In an Exposed transaction with Read Committed Isolation Level, when I query for an entity using DAO, and then do a select for update for the same entity later in the transaction, the select for update would be blocked until a concurrent transaction that updates the locked row is committed, as expected. However, select for update also returned the cached entity, instead of the updated version as I was expecting. Calling additional select for the same entity returned cached results as well. Only when calling .reload(entity) would fetch the updated version.

Shouldn't select for update skip the entity cache?

Also, in Read Committed isolation level, I would expect:

two successive SELECT commands can see different data, even though they are within a single transaction, if other transactions commit changes after the first SELECT starts and before the second SELECT starts.

However, with the entity cache, I would see the same data(ie. cached data) if I query for the same entity in a transaction. Is that correct?

nikunjy commented 1 year ago

Do you think this issue can be prioritized @Tapac
Would you accept a PR ?

yangineer commented 1 year ago

Ok, the above is happening because of wrapRow() in the select for update.

Ie. It returns the cached entity. Same root issue as: https://github.com/JetBrains/Exposed/issues/653

e5l commented 1 year ago

Hey @yangineer, @nikunjy. Thanks for the highlight and sorry for the delay.

I'm looking into the issue now and if you have any ideas how to fix it, please share it.

We're also open to PRs :)

yangineer commented 1 year ago

Thank you @e5l for looking into this.

Btw, I have a draft PR with a test to reproduce this bug in case anyone wish to take a stab.

My idea is that we should update wrapRow to not return cached values. Ie. If the entity is in the entity cache, then wrapRow should update that cache with the ResultRow (maybe as optimization, update only if it is different), and return the new entity.

@Tapac previously mentioned an edge case about what to do if the entity is dirty. (ie. has pending updates)

Then Exposed won't be able to track changes on entities and flush them and the same for the referencies.

How would you expect Exposed to behave in a case when you modified your current entity and then load them with wrap? Should an entity updated before select or all changes should be lost?

I don't think that's a problem in most cases because wrapRow is typically used to convert ResultRow from a DSL query, right after the DSL query executes. The DSL query would have flushed any changes to the entity from the beforeExecution hooks. It is unlikely that a user would update an entity while wrapRowing it.

If it does happen, I think we should throw an error. Ex. "Entity has pending changes and can't be wrapRowed". Ex. If there are duplicate rows(ie. rows with the same entityID) in the results, and we modify the entity as we wrapRow it

val updateEntities = query.map {
  val entity = Entity.wrapRow(it)
  entity.value = 'newValue'
  entity
}

Do you think this approach makes sense?