Refactor and improve tests for the builder library

The current tests use a lot of machinery around ScalaCheck that doesn't add any value afaict. On the contrary, it seems to make the tests considerably more difficult to read and understand than they would be if we used the usual PBT idioms advised by ScalaCheck. It is also difficult to know what has gone wrong when the tests fail, since each test case does too many things and has an undescriptive name.

Here is an example of a test case in the current approach:

https://github.com/informalsystems/apalache/blob/325f21c529307a0b9caa3d2311272ec9271d7a18/tlair/src/test/scala/at/forsyte/apalache/tla/typecomp/TestBaseBuilder.scala#L109-L159

Here are two cases in the approach I propose as an improvement:

https://github.com/informalsystems/apalache/blob/de453fe72282cab11ab3bfc78897ce16c75c0792/tlair/src/test/scala/at/forsyte/apalache/tla/typecomp/TestBaseBuilder.scala#L126-L154

Benefits of my proposed approach:

- Less than half the length (and that's after the addition of clarifying comments and imports that wouldn't be included in the actual code)
- Better coverage, since we test for arbitrary invalid types
- More readable expression of the property being tested for
- Clearer naming of the test cases, so it is clear what test case or property is failing

I'd like to see others weigh in on this too, but the tests you have proposed are not equivalent.

The following case is never addressed:

     ( 
         builder.name("x", tt), 
         builder.name("set", SetT1(tt)), 
         builder.name("p", InvalidTypeMethods.notBool), 
     )

`(nameType != elemType) ==> throwsTBuilderTypeException(instruction)` does not perform the same function as
```
(
    builder.name("x", InvalidTypeMethods.differentFrom(tt)),
    builder.name("set", SetT1(tt)),
    builder.name("p", BoolT1),
),
(
    builder.name("x", tt),
    builder.name("set", SetT1(InvalidTypeMethods.differentFrom(tt))),
    builder.name("p", BoolT1),
),
(
    builder.name("x", tt),
    builder.name("set", InvalidTypeMethods.notSet),
    builder.name("p", BoolT1),
),
```
The more verbose collection above is a decomposition of the equivalence classes of builder failure. They are all instances of `nameType != elemType`, and every instance of `nameType != elemType` falls into one of the above buckets. However, unlike the probabilistic approach, they guarantee that every one of these failure scenarios gets tested with 100% probability. Also, each of these sub-cases typically corresponds to a different way in which, e.g., the signature implementation could be wrong.
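
To make the distinction concrete, here is a minimal, self-contained sketch in plain Scala. It is a toy model, not Apalache's builder or test code: `Ty`, `Args`, `wellTyped`, `nameDiffersFromElem`, and the stand-ins for `InvalidTypeMethods` are all hypothetical, and the typing rule is my simplified reading of the `forall` signature discussed here.

```scala
// Toy model only: none of these names come from Apalache.
object ForallTypingSketch {
  sealed trait Ty
  case object IntT extends Ty
  case object BoolT extends Ty
  case object StrT extends Ty
  final case class SetT(elem: Ty) extends Ty

  // Types of the three arguments to a bounded quantifier.
  final case class Args(x: Ty, set: Ty, p: Ty)

  // Assumed toy rule: `set` must be a set of x's type and `p` must be Boolean.
  def wellTyped(a: Args): Boolean =
    a.set == SetT(a.x) && a.p == BoolT

  // Precondition of the implication-style property in this toy model:
  // does the type of `x` differ from the element type of `set`?
  def nameDiffersFromElem(a: Args): Boolean = a.set match {
    case SetT(elem) => elem != a.x
    case _          => true // a non-set has no element type that could match
  }

  // Hypothetical stand-ins for InvalidTypeMethods.differentFrom / notSet / notBool.
  def differentFrom(t: Ty): Ty = if (t == IntT) StrT else IntT
  val notSet: Ty = IntT
  val notBool: Ty = IntT

  // One representative per failure class, for a given base type tt.
  def illTypedCases(tt: Ty): Map[String, Args] = Map(
    "x differs from elem" -> Args(differentFrom(tt), SetT(tt), BoolT),
    "elem differs from x" -> Args(tt, SetT(differentFrom(tt)), BoolT),
    "set is not a set" -> Args(tt, notSet, BoolT),
    "p is not Bool" -> Args(tt, SetT(tt), notBool)
  )
}
```

In this model every listed case is ill-typed, but `"p is not Bool"` does not satisfy `nameDiffersFromElem`, so a property of the form `precondition ==> throws` holds vacuously for it and never exercises that failure class.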

`succeeds(instruction)` only checks whether a type failure was raised during execution (IIRC). Notably, it does nothing to check that the constructed expression has the correct syntax for the given operator. For instance, if `builder.forall` actually constructed a `TlaOper(TlaBoolOper.exists, ...)` expression, which has the exact same type signature, the new tests would not detect the bug.
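
The gap can be illustrated with a small plain-Scala sketch. Everything in it is hypothetical (`Oper`, `OperEx`, `buggyForall`, and both check functions are toy stand-ins, not Apalache's IR or test helpers), and it assumes, per the IIRC above, that the type-level check looks only at the resulting type.

```scala
// Toy model only: a stand-in IR, not Apalache's TlaEx hierarchy.
object StructureCheckSketch {
  sealed trait Oper
  case object Forall extends Oper
  case object Exists extends Oper

  // An operator applied to named arguments, tagged with its result type.
  final case class OperEx(oper: Oper, args: List[String], resultType: String)

  // A deliberately buggy builder method: it should build Forall but builds Exists.
  def buggyForall(x: String, set: String, p: String): OperEx =
    OperEx(Exists, List(x, set, p), "Bool")

  // Type-only check: cannot distinguish two operators with the same signature.
  def hasExpectedType(ex: OperEx): Boolean =
    ex.resultType == "Bool"

  // Structure check: compares the operator and the arguments as well.
  def hasExpectedShape(ex: OperEx, oper: Oper, args: List[String]): Boolean =
    ex.oper == oper && ex.args == args
}
```

Here `hasExpectedType(buggyForall("x", "S", "p"))` is true, while `hasExpectedShape(buggyForall("x", "S", "p"), Forall, List("x", "S", "p"))` is false, so only the structure check exposes the wrong operator.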

Ultimately, by my assessment:

> Less than half the length (and that's after the addition of clarifying comments and imports that wouldn't be included in the actual code)

Yes, because it skips structure checking, and because it collapses all failure scenarios into one, which I'm not sure you can even do for all operators.

> Better coverage, since we test for arbitrary invalid types

Untrue, since the `mkIllTyped` tests are just a decomposition of `(nameType != elemType)` (for the example above), and `Generators.singleTypeGen` generates arbitrary types.

> More readable expression of the property being tested for
>
> Clearer naming of the test cases, so it is clear what test case or property is failing

Subjective, but I'm not opposed to more verbose test names, or splintering away IllegalArgumentException tests.

Refactor and improve tests for the builder library #2580