tlaplus / rfcs

RFCs for changes to the TLA+ specification language
MIT License
13 stars 0 forks source link

Proposal to add module package management & versioning to TLA+ #8

Closed ahelwer closed 4 months ago

ahelwer commented 1 year ago

Currently there are four sources of external TLA+ specifications that can be imported:

  1. The standard modules (Sequences, FiniteSets, Naturals, etc.)
  2. Modules distributed with the toolbox (we might call these the extended standard modules, like Randomization.tla)
  3. The community modules
  4. The TLAPS modules

Recent experience validating the TLA+ examples repo has shown that changes in these dependencies are a source of bitrot. We should do what every other language has done and come up with a package management scheme where:

  1. Specs can declare a dependency on specific versions of each of these categories, independently
    • In the case of the community modules, every single spec should be independently versioned & addressable
  2. Specs included as dependencies can themselves recursively have specific versions of dependencies

There are a number of possible ways this could be achieved. The simplest would be to have an optional tlaplus.config file in the spec root directory encoding key/value pairs of module name and version number in some standard config format (json, toml, yaml, whatever). Tools would then be responsible for acquiring and using the correct module when they are run. This will have a number of challenges:

  1. Currently releases of the standard & extended-standard modules are coupled to releases of TLC. They should be decoupled, and whatever version of TLC is run should be able to use specified versions of the standard & extended-standard modules. It would also be necessary for TLAPS to implement this for its included modules.
  2. Diamond-shaped dependency version conflicts. There are a number of ways of handling this like namespace unification (C# style) where one of the dependencies is forced to use the version of the other and errors might manifest at runtime, or the node.js method where multiple versions of the same dependency can coexist.
  3. Hosting & update of community modules. This is currently done with a git repository. We can continue using this until the number of specifications become unwieldy or we start hitting github traffic limits (both unlikely to occur for quite a while). The easiest method of bolting versioning onto the community modules would be to add a top-level versioning.json file associating some version number of each module with a specific git commit hash. This has the drawback that updates to community modules require two commits, one to update the module and another to update the versioning.json file (since the commit hash of the first can't be known ahead of time). The versioning.json file would also need to pin dependency versions.
lemmy commented 1 year ago
  1. The standard modules (Sequences, FiniteSets, Naturals, etc.)
  2. Modules distributed with the toolbox (we might call these the extended standard modules, like Randomization.tla)

There is no technical distinction between options 1 and 2. Randomization.tla is a component of tla2tools.jar, just like all other standard modules. The sole difference is that option 1 includes a list of modules discussed in Specifying Systems.

Hosting & update of community modules. This is currently done with a git repository. We can continue using this until the number of specifications become unwieldy or we start hitting github traffic limits (both unlikely to occur for quite a while). The easiest method of bolting versioning onto the community modules would be to add a top-level versioning.json file associating some version number of each module with a specific git commit hash. This has the drawback that updates to community modules require two commits, one to update the module and another to update the versioning.json file (since the commit hash of the first can't be known ahead of time). The versioning.json file would also need to pin dependency versions.

Each module can define _MODULENAME_VER (convention), allowing the importing (extending or instantiating) module to assume a version. If we want to go more sophisticated than just a timestamps, a suitable "semver" operator could be defined.

Generally, the majority of issues we face today could be addressed by coordinating releases of the tools such as TLC, TLAPS, Apalache, .... Also, remember that in the case of TLC, several modules come with Java module overrides. Apalache also has a bunch of modules that are coupled with it.

ahelwer commented 4 months ago

Basically thinking we should just go the route of onboarding TLA+ to the nix/guix ecosystem for reproducible research instead of inventing our own packaging and versioning system.