GaloisInc / crucible

Crucible is a library for symbolic simulation of imperative programs
677 stars 44 forks source link

`crucible-mir`: Support multiple `mir-json` JSON schema versions (within reasonable limits) #1253

Open RyanGlScott opened 1 month ago

RyanGlScott commented 1 month ago

Currently, crucible-mir is only able to parse and reason about a single snapshot of the mir-json JSON schema. This is a somewhat extreme limitation from a user perspective, however, as it means that tools which depend on crucible-mir (i.e., crux-mir and SAW) must know exactly which JSON schema their tool supports and install a matching version of mir-json. This gets even more unwieldy if you try to install nightly versions of crux-mir and SAW simultaneously, as they may have different, incompatible JSON schema requirements. (See https://github.com/GaloisInc/saw-script/issues/2111 for an example where this has bitten in the wild.)

(Note that at the moment, mir-json's JSON schema is implicitly versioned. Starting with https://github.com/GaloisInc/mir-json/pull/75, I hope to make this versioning explicit.)

Note that crucible-llvm is more flexible: it uses llvm-pretty-bc-parser, which is capable of parsing multiple LLVM versions and reifying them all into a single, consistent AST. While this does require more work on our end to maintain support for multiple versions, the payoff is worth it for users, as it gives them much more flexibility in picking LLVM versions. It would be great if we had similar flexibility with mir-json, especially given that it may change more frequently than LLVM does.

That being said, it is unclear whether we want to support all mir-json schema versions that have ever existed, especially given that mir-json may change versions more frequently. As such, I propose that we only support mir-json versions within a certain window—let's say the previous three mir-json schema versions. This should strike a nice balance of flexibility while not adding an especially undue amount of maintenance burden.

Once we have a mir-json JSON schema support window established, we can teach crucible-mir to check the schema version in a MIR JSON file, and if the version falls outside the support window, it should throw a proper error message rather than giving a confusing error message later down the line (as seen in https://github.com/GaloisInc/saw-script/issues/2111).

sauclovian-g commented 1 month ago

My suggestion would be to tie it to saw and/or crux releases: support the versions produced by the mir-json included with (or implicitly included with) supported saw and crux releases. Possibly with a proviso that schema versions that never appeared in a release (and thus only appeared in current/nightly builds) can be dropped after some fixed period of time, maybe a month.

This addresses two concerns that are bound to arise eventually: (1) not needing to recompile except on full updates (or at a reasonable interval when running nightly), and (2) not needing to keep every version for the long term if (realistically: when) at some point we discover that we blundered or made a mess.

RyanGlScott commented 1 month ago

My suggestion would be to tie it to saw and/or crux releases

That sounds reasonable to me! Perhaps we should support the previous two Crux/SAW releases at minimum, and we would be able to drop support for older schema versions whenever it's convenient.