thoughtpolice opened 4 years ago
Thanks for this carefully reasoned analysis. I think I mostly agree with you. In particular we really do need to fix our open source license compliance, and we absolutely do need to get a test suite up and running for this. Thankfully there's work being done in the background to open source Bluespec's test suite.
Someone else mentioned sbv to me as the 'standard Haskell SMT solver' before, so, dependencies and build issues allowing, I think we should use it.
I'll take a look at what sbv's transitive dependency graph is like. I do notice that sbv 8.1 and later (current is 8.6) require base 4.11 which means GHC 8.4 or later. That might be a little too new. However older versions of sbv (including the 7.12 in Debian) only require base 4.9 and GHC 8.0.
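If we do need to support those older GHCs, pinning sbv accordingly in the build metadata might look roughly like this (an illustrative `.cabal` fragment; the bounds are derived from the base/GHC requirements mentioned above, not from a tested configuration):

```cabal
build-depends:
    base >=4.9  && <4.12
  , sbv  >=7.12 && <8
```

Widening the upper bounds to `base <4.13` and `sbv <8.7` would be the corresponding move once GHC 8.4+ is the floor.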
Also, @quark17, how often does bsc use the SAT solver? If it issues many queries over the course of a compilation, spawning an external executable for each one might be a significant performance overhead (process startup is particularly costly on macOS and Windows).
The thing about SBV is that it often ships support for new GHC versions in tandem with new features. For instance, I wouldn't be surprised if sbv 7.12 didn't work on GHC 8.4 or later. So if you want any features from the sbv 8.x series, or you simply want to allow users to use GHC 8.4, then you're out of luck. (In particular, Nixpkgs uses GHC 8.6 to build Bluespec, and I plan on submitting patches here to do Nix CI builds with GHC 8.6 as well.)
More generally, I think requiring GHC 8.4, or even GHC 8.6, or any "relatively recent" compiler, isn't really problematic (as I noted in some other ticket). The biggest sticking point is "can the user install that compiler?", but most distros follow a much more aggressive release schedule than LTS-oriented distros like Debian/Ubuntu/RHEL, where this would be an issue, and on those platforms there are often (supported) alternative ways of acquiring the needed compiler anyway. (Frankly, Ubuntu LTS is probably the easiest platform on which to install various Haskell compilers!)
Personally, I'd say the smart thing to do from a QA/maintenance-burden POV is exactly that: set up CI and restrict "official" support to one or two compilers (say, GHC 8.4 and 8.6 for now), accepting tentative patches for other configurations on an as-needed basis only. Bluespec (as a project) can bump the supported GHC version(s) once every 6-8 months, as needed. This is roughly the cadence at which the traditional Haskell community moves, and it makes dependencies like sbv pretty easy to handle.
Also, I can say that once the user does have a supported compiler installed, Haskell tooling these days is generally quite good about reliable, repeatable builds. In particular, `cabal new-build` (aka "cabal 3.0") is fairly game-changing: anyone who can get the right GHC version can basically do an "instant" build with a lockfile and get reproducible results. So I think the costs of constraining support are well worth it here, because in return we get pretty reliable and flexible builds for developer needs.
I'll also note that newer GHC features like "environment files" mean we do not have to opt into Cabal for building the `bsc` binary itself, so the migration can be progressive. Rather, we can use Cabal only to populate the needed (fixed, locked) dependencies for `bsc`, then run `ghc` ourselves like we do now. So we don't need to mess with the build system much.
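Concretely, that progressive migration might look something like the following (the `cabal` flags and the GHC environment-file behavior are real; the source layout and output name are hypothetical placeholders):

```shell
# Have cabal solve and build only the dependencies, and write a GHC
# package environment file (.ghc.environment.*) describing them.
cabal build --only-dependencies --write-ghc-environment-files=always

# Pin exact dependency versions in cabal.project.freeze (the "lockfile").
cabal freeze

# GHC (8.4.4+) picks up the environment file automatically, so the
# existing hand-rolled ghc invocation keeps working unchanged.
ghc --make src/Main.hs -o bsc
```

The key property is that Cabal is used purely as a dependency provisioner here; the existing build system still drives `ghc` directly.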
Setup
Right now, Bluespec links directly against Yices (or optionally STP) to solve various bitvector-like constraints, or something along those lines. Yices is the default, STP is optional, and the important part is that we link their object files directly into the compiler.
However, doing this has a number of downsides:
- We must vendor `yices` and possibly other packages like `stp`. We should be moving away from this: it requires tooling like submodules, makes build integration more complicated, and ultimately third-party systems already package these solvers anyway.
- More substantially, linking to the Yices API means the Bluespec compiler is a derived work under the GPLv3, and any codebase derived from `bsc` will be, too. It's important to note that the code of `bsc` itself can fully be BSD2; it is the linking that actually produces the derived work (and, under the GPL's derivative-work clause, imposes a source-availability requirement on the resulting binaries). IANAL, but this is my understanding.

The first is problematic for a number of technical reasons (submodules, etc.), but the second really hurts. People are also prone to get strange about licensing issues and may find this "deceptive" (because the codebase is advertised as BSD3, but some results are under the GPLv3 by operation of law).
Proposed solution
I think it should be possible to avoid all this by doing something else instead: most modern SMT solvers (including Yices, Z3, STP from what I can tell, Boolector, and more) support formats like SMT-Lib. This is a project-neutral standard format you can feed a solver and have it produce answers for you.
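For illustration, here is a minimal hand-written SMT-LIB 2 script (not something generated by bsc) that any of these solvers can consume: it asks whether an 8-bit `x` satisfies `x + 1 = 0`.

```smt2
(set-logic QF_BV)
(declare-const x (_ BitVec 8))
(assert (= (bvadd x #x01) #x00))
(check-sat)   ; a conforming solver answers: sat
(get-model)   ; with x = #xff, since 0xFF + 1 wraps to 0
```

Because the format is solver-neutral, the same file can be piped to `z3`, `yices-smt2`, `stp`, or `boolector` without changes.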
The most prominent library for this is sbv, in my opinion. Among other things, it can run solvers as external executables (`yices.exe`, `z3.exe`) in a cross-platform manner, feeding them problem sets and returning the results encoded as Haskell values. This combination of features, IMO, makes it the most worthwhile route to investigate; out-of-process solving in particular not only breaks the GPLv3-by-linking problem, but opens the door to other solvers (`z3`, `boolector`, etc.) as well.

For anyone who wants to tackle this, I suggest looking for uses of Yices in the compiler codebase. The actual number of uses of the Yices FFI library is small, so I'd suggest trying to surgically replace what we have with SBV, to the extent possible. We can then just fix the SBV solve function to use Yices for now.

Upsides
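As a rough reference point for what the sbv side of such a replacement looks like, here is a hypothetical sketch (not code from bsc; `satWith`, `yices`, `sWord8`, and `constrain` are real sbv names) that dispatches the same 8-bit constraint to Yices running as an external process:

```haskell
import Data.SBV

-- Ask Yices (spawned as an external process, speaking SMT-Lib)
-- whether an 8-bit x satisfies x + 1 == 0.
main :: IO ()
main = do
  result <- satWith yices $ do
    x <- sWord8 "x"
    constrain $ x + 1 .== 0
  print result   -- satisfiable, with x = 255
```

Swapping `yices` for `z3` (or any other `SMTConfig` sbv ships) is the only change needed to target a different backend, which is exactly the flexibility we'd want.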
Problems and risks
- Problem: this change could potentially alter behavior, and requires testing. Solution: we MUST have the testsuite available and working in CI. I feel this is really important and can't be worked around. Note: perhaps if the change is small and easy enough, Julie could run tests for us to let it go through.
- Problem: SBV has several dependencies. These aren't too bad, but in general `comp` has a very spartan set of dependencies right now. Going forward, though, that seems unlikely to continue to the same degree (~10 years of GHC support is a bit much. :) Solution: this will probably require us to move to a new mechanism for integrating Haskell dependencies, for example using `cabal new-build` with Cabal 3.0 to build packages, which will probably have impacts on the CI and build process as a result. We'll need to actually pick a supported compiler range, and show people how to install it and the tools. This can be done pretty easily for Debian with Herbert's PPA, though RHEL needs some review since I'm not familiar with it. (This will also pave the way to producing `.rpm` and `.deb` packages, so it should happen before that work.)
- Problem: we probably should vet solvers a bit. Joe Schmo's random SMT solver from GitHub probably isn't worth doing QA or testing for; Yices, Z3, etc. all seem reasonable. Solution: the CI could reasonably test multiple solvers if it's fast enough, though that remains to be seen. Solver edge cases can definitely be covered as regression tests. We should pick a minimum set of solvers and tell users to stick with them.