TimWSpence / cats-stm

A STM implementation for Cats Effect
https://timwspence.github.io/cats-stm/
Apache License 2.0
143 stars 17 forks source link

(Lack of) opacity #393

Open durban opened 2 years ago

durban commented 2 years ago

Opacity is a correctness property for STM algorithms. Here is a short description and pointers to more reading: https://nbronson.github.io/scala-stm/semantics.html#opacity. The really short version is that opacity guarantees that a running transaction will never read inconsistent values from transactional varables, even if that transaction will never commit or fail.

It seems that currently cats-stm does not provide opacity. We can see this with the following example: let's have two TVar[Int]s, r1 and r2, with the invariant that r1 < r2. One transaction increments the values (transactionally), upholding this invariant:

  def txn1(r1: TVar[Int], r2: TVar[Int]): Txn[Unit] = {
    r1.modify(_ + 1) *> r2.modify(_ + 1)
  }

Another transaction reads from both TVars, and compares the values; if it reads impossible (inconsistent) values, it prints ERROR: ...:

  def txn2(r1: TVar[Int], r2: TVar[Int]): Txn[Int] = {
    r1.get.flatMap { v1 =>
      r2.get.map { v2 =>
        if (v2 <= v1) {
          println(s"ERROR: (${v1}, ${v2})")
        }
        v2 - v1
      }
    }
  }

Sometimes when txn1 and txn2 run concurrently, txn2 can observe the impossible (inconsistent) values (see below for a full reproduction). Such running instances of txn2 seem to not commit/fail (they seem to be retried). However, observing such inconsistency in a running transaction can still be problematic; for example, we could have while (true) {} instead of println(...), which would cause the transaction to never commit or fail.

Full code example:

package com.example

import cats.effect.kernel.Concurrent
import cats.effect.IO
import cats.syntax.all._

import io.github.timwspence.cats.stm.STM

import munit.CatsEffectSuite

class StmExample[F[_]](stm: STM[F])(implicit F: Concurrent[F]) {

  import stm._

  private[this] def txn1(r1: TVar[Int], r2: TVar[Int]): Txn[Unit] = {
    r1.modify(_ + 1) *> r2.modify(_ + 1)
  }

  private[this] def txn2(r1: TVar[Int], r2: TVar[Int]): Txn[Int] = {
    r1.get.flatMap { v1 =>
      r2.get.map { v2 =>
        if (v2 <= v1) {
          println(s"ERROR: (${v1}, ${v2})")
        }
        v2 - v1
      }
    }
  }

  def example: F[Int] = for {
    r1 <- stm.commit(TVar.of(0))
    r2 <- stm.commit(TVar.of(1))
    t1 = stm.commit(txn1(r1, r2))
    t2 = F.parReplicateAN(32)(
      replicas = 64,
      ma = stm.commit(txn2(r1, r2)).flatTap {
        case 1 => F.unit
        case x => F.raiseError[Unit](new Exception(s"result was $x"))
      }
    )
    b = F.both(F.cede *> t1, F.cede *> t2)
    r <- b.replicateA(8192).map(_.last._2)
  } yield r.last
}

final class CatsStmTest extends CatsEffectSuite {

  test("Test STM") {
    STM.Make[IO].runtime(Long.MaxValue).flatMap { stm =>
      assertIO(new StmExample[IO](stm).example, 1)
    }
  }
}

Running this test results in output like this:

ERROR: (1388, 1388)
ERROR: (1388, 1388)
ERROR: (1388, 1388)
ERROR: (2205, 2205)
ERROR: (2205, 2205)

(The number of ERROR lines is non-deterministic, sometimes even zero.)

TimWSpence commented 2 years ago

Sorry @durban just saw this! Thanks for raising it! This definitely should not happen. I'll look into it ASAP

TimWSpence commented 2 years ago

@durban sorry, I understand this now. This is currently by design (we allow potentially stale reads but then retry the transaction upon commit). AFAICT Haskell's current implementation does this as well. A cursory skim of the literature suggests that there is probably a reasonably significant performance tradeoff here.

The cats-stm docs should definitely make this clear though and should also spell out more clearly though that side effects (eg println above) are not supported in transactions, as they may be retried a non-deterministic number of times.

If you're happy then I suggest we update the docs and close this issue as working as intended, but we could also open another issue to investigate the feasibility and performance of an alternative implementation based on TL2 or similar?

durban commented 2 years ago

Well ... documenting it and doing nothing is definitely an option. I don't think it's a very good one, because:

Having said that, if you do not want to provide opacity, that's definitely your decision.

TimWSpence commented 2 years ago

@durban so sorry I just got round to looking at this again.

I don't think I agree that this can result in invariants being violated (although it certainly can result in a different branch of an orElse being chosen based on a stale read). Can you give an example of what you are concerned about? Again I think that the problems are related to side effects allowing you to observe a transaction in an inconsistent state. But that transaction will be retried rather than committed and hence no invariants are broken.

The performance is interesting. I should definitely have a look at what scala-stm is doing some time. I can definitely believe that it is faster as I have spent approximately zero time optimizing this 😅

Thanks for raising this issue! It's a very interesting one

durban commented 2 years ago

@TimWSpence I'm talking about the fact, that while we have the invariant that r1 < r2 (and when writing, we uphold this invariant), we still can observe a case in a running transaction when r1 >= r2. It's (probably) true, that such a transaction will never commit, but preferably transactions would commit after a finite number of steps. Here is a full example, where a transaction never commits due to reading "impossible" values: https://github.com/zio/zio/issues/6324 (the example is for ZSTM, but I think the same principle applies to cats-stm).

I was also thinking about that the JVM actually saves us from the most terrible things (e.g., it guarantees memory safety). The worst things I could come up are: