apalache-mc / apalache

APALACHE: symbolic model checker for TLA+ and Quint
https://apalache-mc.org/
Apache License 2.0
442 stars 40 forks source link

[FEATURE] Introduce a Take operator #623

Closed konnov closed 2 months ago

konnov commented 3 years ago

Time estimate: 1-2 days of work

When specifying consensus protocols, we often have to pick a subset of messages of fixed cardinality. For instance, here is a snippet from a safety specification of Tendermint:

       /\ LET PV == { m \in msgsPrevote[vr]: m.id = Id(v) } IN
          /\ Cardinality(PV) >= THRESHOLD2 \* line 28
          /\ evidence' = PV \union {msg} \union evidence

The problem with this code is that msgsPrevote[vr] can potentially be a large set. This is due to the possible behavior of Byzantine processes, which can send messages of all possible shapes. In our example, two Byzantine processes may produce 2 3 3 = 18 messages for just three rounds and three values (including the value Nil). If we add one more round field to the message type, e.g., validRound, then the number of potential messages grows up to 18 * 4 = 1296. As our SMT encoding is using graph overapproximations of the data structures, the overapproximation of the set PV has 1296 cells and thus produces 1296 membership SMT constraints for the set PV.

Surely, one way of dealing with this problem is to use a quantified encoding. However, this is a long-term effort planned in #377.

The above code has interesting features. The expression THRESHOLD2 is a constant that is equal to 3 in our example. So it is not clear at all, why we should carry around 1296 potential set elements, while we only care about 3 elements. A short-term solution is to introduce the operator Take that is similar to Set.take in Scala:

TakeSet(S, k) ==
   IF Cardinality(S) >= k
   THEN CHOOSE T \in SUBSET S: Cardinality(T) = k
   ELSE {}

Obviously, we don't want to use the above reference implementation. Actually, the above code would not solve the problem of the combinatorial explosion. Most likely, it will even make it worse at the SMT level. However, we can write a custom implementation that introduces k constants in the memory arena and SMT. This will dramatically reduce the number of constraints.

lemmy commented 3 years ago

The TLA+ definition of TakeSet could be added to https://github.com/tlaplus/CommunityModules/blob/57e173e41930ae4c8a3301d6a0f078717cc8d470/modules/FiniteSetsExt.tla#L43-L57

konnov commented 3 years ago

Yep. I thought about using kSubset but it produces a whole set of subsets. Do you like me to create a PR in Community modules?

konnov commented 3 years ago

On a second thought, I probably do not want to use CHOOSE there, as we need non-determinism. Maybe there is no way around using kSubset.

konnov commented 2 years ago

I think this could be easily implemented with Apalache.Gen:

TakeSet(S, k) ==
   IF Cardinality(S) >= k
   THEN LET Small == Gen(k) IN
        CHOOSE X \in { Small }: X \in SUBSET S 
   ELSE {}
konnov commented 2 months ago

Not really relevant