Comments about "Limitations and Alternatives"

A few comments about the "Limitations and Alternatives" section:

Model checkers exhaustively enumerate executions looking for errors, but they're not proving anything about the correctness of your specification.

For the (finite) model, TLC actually checks the correctness of the specification for the stated safety and liveness properties. For simple algorithms - such as the one below (taken from here) - TLC can even check the complete algorithm, not just a model of it.

EXTENDS Integers
(*****
--algorithm SimpleMutex {
  variables flag = [i \in {0, 1} |-> FALSE] ;
    process (proc \in {0,1}) {
         s1: while (TRUE) {
                  flag[self] := TRUE ;
         s2:
                  await flag[1-self] = FALSE ;
         cs:
                  skip ;
         s3:
                  flag[self] := FALSE
    }
  }
}
*****)

If one needs higher assurances for infinite models, TLA+ comes with a proof system to write machine checked proofs. Usually people combine the powers of TLC and TLAPS (see Proving Safety Properties and Using TLC to Check Inductive Invariance).

If there are infinite executions, you cannot check all of them.

Infinite executions do not imply that the state space is infinite. By definition a behavior is always an infinite sequence of states (a terminating execution has a suffix of infinite stuttering). TLC can easily check specs that allow infinitely many behaviors:

VARIABLE x
Spec == x = 0 /\ [][x' \in {0,1}]_x

TLC checks this specification in less than a second. Do not replace {0,1} with Nat though! ;-)

Need to resort to automated theorem proving for really being confident that code is bug free.

True, even better if the implementation is correct-by-construction. In reality though, even machine checked proofs do not provide absolute confidence. Anyway, engineers are usually happy with less. SPIN for example supports model-driven verification which apparently is good enough for NASA.

TLC can be slow, really slow

I ran TLC on my BPaxos stuff on 64 cores for hours and I exhausted 16 GB of disk before it could finish. Could take days to simulate.

In comparison, how long did it take you to find the inductive invariant and write a machine checked proof (compare "Proof assistants are addictive and a huge time sink.")?

By the way, can you share your BPaxos spec to study it as part of our scalability tests?

Also, apalache might in the future scale to larger models.

Sometimes hard to check if your system is doing what you expect.

I'm inclined to argue that this problem is not unique for TLA+. ;-)

mwhittaker / tla_talk

Comments about "Limitations and Alternatives" #1