Open durban opened 2 years ago
Sorry @durban just saw this! Thanks for raising it! This definitely should not happen. I'll look into it ASAP
@durban sorry, I understand this now. This is currently by design (we allow potentially stale reads but then retry the transaction upon commit). AFAICT Haskell's current implementation does this as well. A cursory skim of the literature suggests that there is probably a reasonably significant performance tradeoff here.
The cats-stm docs should definitely make this clear though and should also spell out more clearly though that side effects (eg println
above) are not supported in transactions, as they may be retried a non-deterministic number of times.
If you're happy then I suggest we update the docs and close this issue as working as intended, but we could also open another issue to investigate the feasibility and performance of an alternative implementation based on TL2 or similar?
Well ... documenting it and doing nothing is definitely an option. I don't think it's a very good one, because:
println
in the example is just to make it very clear what happens.scala-stm
has opacity, but seems faster than cats-stm
, based on my very limited benchmarks. But that could be simply due to the monadic API... I don't know.)Having said that, if you do not want to provide opacity, that's definitely your decision.
@durban so sorry I just got round to looking at this again.
I don't think I agree that this can result in invariants being violated (although it certainly can result in a different branch of an orElse
being chosen based on a stale read). Can you give an example of what you are concerned about? Again I think that the problems are related to side effects allowing you to observe a transaction in an inconsistent state. But that transaction will be retried rather than committed and hence no invariants are broken.
The performance is interesting. I should definitely have a look at what scala-stm
is doing some time. I can definitely believe that it is faster as I have spent approximately zero time optimizing this 😅
Thanks for raising this issue! It's a very interesting one
@TimWSpence I'm talking about the fact, that while we have the invariant that r1 < r2
(and when writing, we uphold this invariant), we still can observe a case in a running transaction when r1 >= r2
. It's (probably) true, that such a transaction will never commit, but preferably transactions would commit after a finite number of steps. Here is a full example, where a transaction never commits due to reading "impossible" values: https://github.com/zio/zio/issues/6324 (the example is for ZSTM, but I think the same principle applies to cats-stm).
I was also thinking about that the JVM actually saves us from the most terrible things (e.g., it guarantees memory safety). The worst things I could come up are:
Opacity is a correctness property for STM algorithms. Here is a short description and pointers to more reading: https://nbronson.github.io/scala-stm/semantics.html#opacity. The really short version is that opacity guarantees that a running transaction will never read inconsistent values from transactional varables, even if that transaction will never commit or fail.
It seems that currently
cats-stm
does not provide opacity. We can see this with the following example: let's have twoTVar[Int]
s,r1
andr2
, with the invariant thatr1 < r2
. One transaction increments the values (transactionally), upholding this invariant:Another transaction reads from both
TVar
s, and compares the values; if it reads impossible (inconsistent) values, it printsERROR: ...
:Sometimes when
txn1
andtxn2
run concurrently,txn2
can observe the impossible (inconsistent) values (see below for a full reproduction). Such running instances oftxn2
seem to not commit/fail (they seem to be retried). However, observing such inconsistency in a running transaction can still be problematic; for example, we could havewhile (true) {}
instead ofprintln(...)
, which would cause the transaction to never commit or fail.Full code example:
Running this test results in output like this:
(The number of
ERROR
lines is non-deterministic, sometimes even zero.)