informalsystems / modelator

Model-based testing tool
https://mbt.informal.systems
Apache License 2.0

MBT-core: architecture #2

Closed andrey-kuprianov closed 1 year ago

andrey-kuprianov commented 3 years ago

MBT-core implements the core, target-language-independent functionality for model-based testing.

Figure out the architecture for the MBT core:

andrey-kuprianov commented 3 years ago

User interaction

There should be a binary (modelator) that the user can invoke in order to create / check / process / run the tests. This is the main target-language-independent point of entry into the toolchain. The tool should provide access to these functions:

For many purposes, modelator can be considered an Apalache wrapper, offering users MBT-specific ways to interact with the tool.

andrey-kuprianov commented 3 years ago

Use case 1: Generate example execution

Focuses on the simplest complete user interaction, involving test configuration, instantiation, checking, and running Apalache.

Scenario: The user has a TLA+ model and a test assertion for which they want to produce an example execution. They provide the tool with a config file containing the test assertion, and possibly instantiation parameters, and ask it to execute the test and produce an example (abstract) execution.
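
As a rough illustration only (the exact config format is not specified here), such a config file might look like the following; all field names and values are hypothetical:

{
  // hypothetical config sketch; field names are not an actual modelator format
  "model": "LightClient.tla",      // TLA+ model to check
  "test": "TestFailure",           // test assertion to produce an example execution for
  "constants": {                   // optional instantiation parameters
    "MAX_HEIGHT": 5,
    "VALIDATORS": 3
  }
}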

Happy path:

Possible failures (to become unit tests):

andrey-kuprianov commented 3 years ago

Use case 2: conformance test suite generation

Focuses on the Cosmos ecosystem user; besides UC1, it involves the weights system and generating/running multiple tests.

A conformance test suite includes a (large) set of tests (~hundreds), which are generated from:

Scenario: The Cosmos developer wants to test their Tendermint/IBC/Cosmos component, for which there exists a TLA+ specification. They run the MBT-core toolchain with the given TLA+ spec and test configuration, and ask it to store the generated test executions in the specified location.

Happy path: as for UC1, and additionally

Possible failures (to become unit tests): as for UC1, and additionally

andrey-kuprianov commented 3 years ago

Test vectors, modules, artifacts, and metadata

This is an attempt to describe test processing in MBT as I see it now.

Design goals

To satisfy these goals, I propose to employ the following notions:

Test artifacts

Let's consider some examples of test artifacts:

Test vectors

A test vector is a compact description of how a particular test artifact was produced, i.e. of the input data and the sequence of modules that processed it. It is not the artifact itself; it is only enough information to reproduce it precisely. Test vectors are extensible: if the type of some test artifact is compatible with the input type of some test module, this artifact can be submitted to the module, and the test vector of any of the resulting artifacts will be an extension of the current test vector.
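
For illustration only (the concrete encoding is not fixed here), a test vector could be recorded roughly as follows, listing the original inputs and the chain of module methods that produced the artifact; module and method names are taken from the examples further below:

{
  // hypothetical test-vector sketch; the encoding is an assumption
  "inputs": {
    "model": "LightClient.tla",
    "tla-config": "MC_LightClient.cfg"
  },
  "modules": [
    { "module": "apalache", "method": "tla-to-json", "version": "0.1.0" },
    { "module": "apalache", "method": "test-json", "version": "0.1.0" }
  ]
}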

Test metadata

Test metadata is a key-value map associated with a test artifact. In particular, each test artifact should include its type in the metadata. Any additional information is up to the module that produces the artifact; it should be compact enough to be stored if needed, but useful either for automatic artifact processing or for user inspection.
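
As a sketch, metadata for an artifact produced by the Apalache test-json method might look like the following; only the type key is implied by the description above, the remaining keys are made up for illustration:

{
  // hypothetical metadata sketch; only "type" is required by the description above
  "type": "json-tla-counterexample",
  "produced-by": "apalache/test-json",
  "modelator-version": "0.1.0",
  "description": "example execution for the test assertion"
}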

Test modules

These are the processing units that accept artifacts as input, and produce artifacts as output. Here are some examples:

andrey-kuprianov commented 3 years ago

Test modules

A test module is a binary component that satisfies a certain input-output relation, allowing it to be integrated into the general MBT framework. Test modules are meant as primitive processing steps, transforming test artifacts into other artifacts.

Each module can provide a number of methods. The manifest method should always be present. When called without arguments, it should provide the manifest for the module; when called with the name of a method, it should provide the manifest for that method.
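
A minimal sketch of the two kinds of calls, using the JSON-RPC framing described in the Input and output section below; the shape of params is an assumption:

// manifest of the whole module (sketch)
{ "jsonrpc": "2.0", "id": 1, "method": "manifest" }

// manifest of a single method (sketch; the params shape is an assumption)
{ "jsonrpc": "2.0", "id": 2, "method": "manifest", "params": { "method": "test-json" } }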

Module manifest

Each test module has an associated manifest that includes:

E.g. for the module that runs Apalache, the manifest could look like this:

{
  "name": "apalache",
  "description": "This module provides functions for interacting with the Apalache model checker",
  "version": "0.1.0",
  "components": {
    "modelator": "0.1.0",
    "apalache": "0.10.0",
    "java": "openjdk-11.0.10"
  },
  "methods": [
    {
      "name": "check-tla",
      ....
    },
    // same for "check-json", "test-tla", "test-json", "tla-to-json", "json-to-tla"
  ]
}

Method manifest

Each module method has an associated manifest that includes:

Each element of inputs, results, or errors should come with its type. The types are just strings, with an associated meaning that is unified across the MBT toolchain.

E.g. for the Apalache test-json method, the manifest could look like this:

{
  "name": "test-json",
  "description": "This method runs Apalache on the given model and configuration parameters in JSON format, and produces an example execution", 
  "version": "0.1.0",
  "inputs": {
      "model": "json-tla",
      "tla-config": "json-tla-config",
      "apalache-config": "json-apalache-config"
  },
  "results": {
      "execution": "json-tla-counterexample",
      "output": "string"
  },
  "errors": {
      "output": "string",
      "log": "string"
  }
}

Input and output

For the input-output behaviour of test modules, we adopt the JSON-RPC protocol to guarantee future extensibility and the possibility of application in wider contexts.

Each module needs to accept JSON-RPC requests via STDIN. Each module should accept at least the manifest method.

As a result of an invocation, the module should produce a JSON-RPC response on STDOUT, with either the result or the error set, containing the corresponding components outlined in the manifest. STDERR output can carry arbitrary information, e.g. for debugging; it can be captured by the framework, but formally speaking it is ignored.
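
To make this concrete, here is a sketch of one test-json invocation and its successful response; the exact shapes of params and result are assumptions, modelled on the method manifest above:

// request on STDIN (sketch; params mirror the method manifest's inputs)
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "test-json",
  "params": {
    "model": {},              // json-tla model goes here
    "tla-config": {},         // json-tla-config
    "apalache-config": {}     // json-apalache-config
  }
}

// response on STDOUT (sketch; result mirrors the method manifest's results)
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "execution": {},          // json-tla-counterexample
    "output": "..."
  }
}

On failure, a JSON-RPC error would be returned instead, carrying the output and log components listed under errors in the manifest.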

andrey-kuprianov commented 3 years ago

User interaction

We want the test system's interactions with the user to be fun. Therefore we will organise the system in a way that allows the user to explore and experiment. Imagine the testing system as a quest game, or a universe, which the user explores in order to find the bugs hiding in their code. To implement this view we will have the following:

How this is going to be implemented:

andrey-kuprianov commented 3 years ago

Communication of core with plugins

In the current instantiations of MBT, one of the main limitations is the need to store and maintain a large number of static test files in the repo, as in the LightClient MBT tests. We want to eliminate that and generate the files on the fly.

For that, MBT-core will take care of configuring the toolchain and generating the tests. The language plugins, such as MBT-Rust, on the other hand, will only execute the tests in the target language. The interaction between the core and the plugins should be done via a simple communication protocol, executed e.g. via a socket connection:
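
Purely as an illustration (the message and field names below are assumptions, not a fixed protocol), one round of such an exchange might look like:

// plugin -> core: request the next test to execute (hypothetical message names)
{ "msg": "next-test" }

// core -> plugin: a generated abstract execution to run against the implementation
{ "msg": "test", "name": "TestFailure", "execution": {} }

// plugin -> core: report the outcome back to the core
{ "msg": "result", "name": "TestFailure", "status": "passed" }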

Such a structure simplifies MBT in several aspects:

andrey-kuprianov commented 3 years ago

Module requirements

Modules are the minimal processing steps that can be integrated into MBT. Here we list some requirements that MBT modules should satisfy (inspired by resistance to #14):

Single-purpose module methods

A module should have methods that are single-purposed. E.g. in #14 there is a RunMode option which offers a choice of whether to run a test or to run an exploration. I believe that there should instead be two different methods of the tlc module, namely test and explore. This will allow referring to them individually, and integrating them unambiguously into chains of modules.
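
Following the module manifest format sketched above, the tlc module could then advertise the two methods separately (a sketch, not an agreed manifest):

{
  // hypothetical manifest sketch for a tlc module with single-purpose methods
  "name": "tlc",
  "description": "This module provides functions for interacting with the TLC model checker",
  "version": "0.1.0",
  "methods": [
    { "name": "test", .... },
    { "name": "explore", .... }
  ]
}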

Method determinism

See the Reproducibility requirement above. I am not sure whether TLC with multiple workers is deterministic.

What we want to achieve is that module methods are like mathematical functions -- given the same version and the same inputs, they will produce the same results. This will allow us not to store tests, but to generate them on the fly, storing only the TLA+ models and the minimal config describing which chain of modules to run to produce the tests.

Module calling conventions

#14 replaces modelator-cli with modelator, which is in essence only an interface to run a model checker. I believe that instead TLC should be a module, integrated into modelator-core, and callable via a generic module interface, dispatched from modelator-cli. It is still vague, but what we need to devise is approximately this: