kino-mc / rsmt2

A generic library to interact with SMT-LIB 2 compliant solvers running in a separate system process, such as Z3 and CVC4.
Apache License 2.0
65 stars 14 forks source link

Decouple supported SMT solvers #8

Closed Robbepop closed 5 years ago

Robbepop commented 6 years ago

As far as I understood the crate, it enables SMT solver front-ends to use a diverse set of SMT solvers implementing its interfaces. This is awesome and I would love to implement this for my SMT solver to use its SMTLibv2 parsing support so that I do not have to reinvent the wheel another time.

However, it seems that the SMT solvers that implement the interfaces of rsmt2 are hard-wired into the library. Would it be possible to separate implementors (Z3, CVC4) from the actual interfaces?

The end result would be like that:

A client would then simply depend on one of the rsmt2-* crates for a direct support of the associated SMT solver.

I think you got the idea. :) So the question: Is this even possible?

AdrienChampion commented 6 years ago

Thank you for your interest in rsmt2!

The short answer is that what you're asking for is the very goal of this library, and this is already supported in a more streamlined fashion than what you describe, at least in theory.

However, it seems that the SMT solvers that implement the interfaces of rsmt2 are hard-wired into the library.

They're actually not hard-wired at all, rsmt2 simply spawns a solver process using whatever options you give it and then communicates through system pipes, printing/parsing SMT-LIB 2.5~2.6 commands and answers.

So in theory, as long as your solver respects the SMT-LIB 2.* standard and can read from stdin, you don't have anything to do. The idea is that since as a rsmt2 user you don't really have to worry about which solver is actually used under the hood, you can let your end-users chose it. Which in my experience is a pretty convenient feature.

Anyway, in practice things are a bit different.

Options

The first difference is that different solver will want different options to do different things. CVC4 wants the -m option for producing models and --incremental for push/pop while Z3 needs to be passed -in to read from stdin (which rsmt2 needs). One way to deal with this is to ask users to create an SmtConf completely with the right options for stdin and models and whatever they want to use.

While users can specify whatever options they want when creating the SmtConf, the current (internal) workflow is solver-aware to some extent (see below) and knows/should know about most of these special options.

Commands

The second problem is that different solvers tend to implement the SMT-LIB standard slightly differently. Until recently for instance Z3 only supported the check-sat-assuming command through the usual check-sat command. That is, the SMT-LIB

(check-sat-assuming ( actlit_1 actlit_2 ))

would yield an error while the non-SMT-LIB

(check-sat ( actlit_1 actlit_2))

would work.

Also, CVC4 used to return models differently from Z3, without the define-fun and without giving the type of the symbol being printed. Or that was MathSAT and not CVC4 I'm not sure anymore.

(Current) Solution

The current approach is to use a SolverStyle enum (invisible to users ATM, because only Z3 is officially supported) to differentiate between solvers.

(Eventually SolverStyle should be visible, but I was waiting requests for supporting other solvers before making it public. Thanks again by the way!)

The idea is that whatever special option or special syntax (for models, check-sat-assuming, etc.) would be attached to this enum and rsmt2 would use that at runtime to know how to spawn and interact with the solver.

The annoying thing is that as a user you need to know what kind of solver you're using, but as a result you don't need to worry about the basic options too much. The idea is that when users create a binary with rsmt2, I expect they will provide a way for end users to use different solvers, as I do in my Horn clause solver. To me, you just want end users (of the binary) to pass something like --cvc4 my_cvc4_version or --z3 my_z3_version and expect it to work. As opposed to

your_super_binary --cvc4 "my_cvc4_version -m --incremental" ...
# Users of the binary have to know/discover you use get-model and push/pop

and

your_super_binary --z3 "my_z3_version -smt2 -in"
# Users of the binary have to know/discover z3 does not read from stdin by default
# Also, (I'm pretty sure) older version of z3 don't read their input in SMT-LIB 2 by default

I think having a uniform API at SmtConf-level with conf.incremental() and conf.models() setters is more straightforward and requires less from end-users.

This is just my vision though, if you disagree with this usage do let me know and let's talk about it :)

Long story short there should be no need to create specific crates for each solver, just enrich SolverStyle with whatever solver you might want --- PRs are welcomed :D --- and whatever special options/syntax they use.

I'm not saying this workflow is perfect, and I'm completely open for better ideas.

AdrienChampion commented 6 years ago

I will try my best to add support for CVC4 soon-ish since (thanks to you) I just realized it's now much closer to Z3 commands/answers-wise than it used to be.

As a result SolverStyle will be exposed and this will serve as an example for adding support for new solvers.

Robbepop commented 6 years ago

Thank you very much for your fast and comprehensive response!

I am not sure if I understand enough about the SMT interfaces of concrete solvers to summarize my ideas of my visions. So please take all my suggestions with a grain of salt. :)

You told that all solvers behave differently with their respective CLI, and rsmt2 implements the way each solver behaves according to the SMTLib2 and rsmt2 provided interfaces. My only suggestion was to encapsulate this solver specific implementation detail in sub-crates instead of this library directly. This way I can add support for another SMT solver without touching rsmt2. This is important for the ecosystem to prevent rsmt2 having to update after another SMT solver support was added.

So in the end result rsmt2 would not know anything about any specific solver details, not even an enum that knows what solvers are actually supported. It would only have parsing support for SMTLib2 and provides the interfaces that solvers have to implement to be used through rsmt2. All the implementation details would then be located in the specific rsmt2-z3, rsmt2-cvc4 and rsmt-stevia for example.

I hope you got the idea and I hope I got it all right. :)

AdrienChampion commented 6 years ago

Okay, I see what you're saying, I'll think about it and get back to you :)

Robbepop commented 6 years ago

No problem, and please take everything I said with a grain of salt since I am not sure if I have understood the problem you want to solve with rsmt2, either. ;)

As far as I understood it my suggestion isn't very far away from the current project. It just pin points to a slight different direction in how to achieve the same things.

With my suggested decoupling the ecosystem would be more stable.

Adding a new SMT solver implementation or changing the implementation of another one would require no updates to rsmt2 and thus no dependency updates to any of its clients. This prevents dependency hell.

AdrienChampion commented 6 years ago

I haven't forgotten about you :)

I going to give it a try, by creating a different crate, probably named rsmt. rsmt2 is not a good name anyway, I don't want to have to create another library when SMT-LIB 3 comes out...

I'm still worried about a few things regarding your suggestion. I understand having to update rsmt2 whenever a new solver is added can be tedious to some (I personally don't mind, especially since there's not that many solvers around). But with this new workflow, changes in the main crate's API will potentially break all other crates. Arguably, that's only a problem until some stable API is reached so it's probably worth it in the long run.

But, because of the lack of feedback, I'm not sure how convenient/natural the current workflow is. So cutting the crate in several crates is a bit dangerous as the rsmt API will probably change, potentially in huge ways...

Anyway, I'll try to find something generic enough that makes sense.

Robbepop commented 6 years ago

You might be right that this might become a huge project and you also might be right about the fact that the SMT solver area isn't very big. I do not want to enforce this idea since it involves a lot of work especially on your side and especially if done right. There are countless theories that should be supported in the end. For my personal project I simply want access to an efficient SMTLib2 parser that is flexible enough and has enough customization points available to let me integrate it efficiently with my own data structures or at least without performance penalties upon using them.

It is up to you whether you want to actually start this project. I think the suggested format has advantages over the current state, yet it may not be worth the work as you have mentioned already.

I like having standardized and ecosystem wide APIs - so I favor your library over me implementing my own stuff as all other SMT solver did in the past. The existence of SMTLib2 is a good thing and so is the existence of a flexible ecosystem wide wrapper around it. :)

AdrienChampion commented 5 years ago

Let me go ahead and close this for a mix of

(Mostly the second reason though.)

The OP and newcomers are welcome to re-open this issue or create a new one to +1 this idea.