Closed PhilippWendler closed 8 years ago
The visitor that exist in CPAchecker do not handle quantifiers, this needs to be added.
Also would it be hard to change the constructor of BooleanFormulaTransformationManager
to require a BooleanFormulaManager
instead of FormulaManager
? Latter offers no advantages, and with that the class can not be called from BooleanFormulaManager
(as it does not have FormulaManager
instance handy)
For handling Quantifiers, we need a QuantifierFormulaManager, and instead of using two parameters BooleanFormulaManager and QuantifierFormulaManager, we can access both via FormulaManager. I prefer keeping the whole FormulaManager.
Ah true, that's a good point. Should the traversal expose quantified formulas somehow then? E.g. visitExists(bound, formula)
EDIT: disregard, saw Philipp's comment above.
Question: why does the comment say that no guarantees are made on the iteration order? Surely it just traverses the whole formula using DFS?
We have Visitors, that cache results.
Additionally, we have no guaranteed order for the operands of AND(op1,op2,op3)
, as the solver can simplify the term if one of the args is true
.
So we can not guarantee DFS.
I think it's partly a terminology issue. Maybe "topological" order would be better?
that cache results
That doesn't change the traversal order, they just have a very short traversal: check cache, return.
we have no guaranteed order for the operands of AND
I would still call it DFS for any traversal order of AND- or OR- arguments.
In the end of the day, at least to me, an "arbitrary order" means that the traversal can pick up any argument at any point from the "worklist". This is an important difference, as in our case if the solver starts processing the node, it will finish processing its children before jumping to other nodes. This allows state-full methods (keeping traversal state in class fields) In general case, no such guarantee is possible.
For the portfolio solver (#8), even a guarantee for a topological iteration would be problematic, though.
I would think that a stateful visitor that relies on the iteration order should be avoided if possible anyway, as it seems very error prone. I don't think we even need it in CPAchecker, or do we? Visitors that extract things like variables and atoms are stateful but do not rely on iteration order, and all kinds of transformation like renaming variable are handled by the BooleanFormulaTransformationVisitor, which is also safe.
even a guarantee for a topological iteration would be problematic, though.
Would it? Just apply it to all formulas one by one.
We need to stop code duplication with CPAchecker visitors...
One problem is that JavaSMT visitors require a FormulaManager
,
while CPAchecker ones take in a FormulaManagerView
.
We can do the following:
create*Environment
out of FormulaManager
.FormulaManagerView
implement FormulaManager
UnsupportedOperation
for getUnsafeFormulaManager
if we really want to stop users from accessing it (though I would let it through for now until we can get rid of all of its usages...)We should not expose UnsafeFormulaManager
from FormulaManagerView
, otherwise people will start using it.
Why are the create*Environment
methods related to this?
The following cases should be handled in the visitors:
@PhilippWendler
FormulaManagerView
on getUnsafeFormulaManager
and mark it deprecated for IDE support.create*Environment
methods are related because FormulaManager
interface implements them.
If we make FormulaManagerView
implement FormulaManager
(like other views do), we'll have to implement them as well.Otherwise we are stuck with code duplication in visitors (the ones in CPAchecker want FormulaManagerView
and the ones in JavaSMT
want FormulaManager
).
There are a few usages of UnsafeFormulaManager
and raw introspection methods in CPAchecker code, and I don't think they can be changed to visitors nicely.
Solver
class there is a helper method getUnsatCore
I have used many times; It finds an UNSAT core, but first it tries to "unravel" the formula as much as it can -- eg. AND(AND(A, B), AND(C, D))
gets turned into AND(A, B, C, D)
and the unsat core is searched among the subsets of A, B, C, D
. Is visitor suitable here? if (bfmgr.isAnd(constraint)) {
for (int k = 0; k < ufmgr.getArity(constraint); k++) {
addConstraint((BooleanFormula)ufmgr.getArg(constraint, k),
prover, ufmgr);
UF
s and manually perform Ackermannization on them --- how that can be done without isUF
calls?In the Solver class there is a helper method getUnsatCore I have used many times; It finds an UNSAT core, but first it tries to "unravel" the formula as much as it can -- eg. AND(AND(A, B), AND(C, D)) gets turned into AND(A, B, C, D) and the unsat core is searched among the subsets of A, B, C, D. Is visitor suitable here?
Should be doable with a visitor that collects or pushes all non-And terms. Probably this code should even be moved to JavaSMT.
In my policy iteration code I find UFs and manually perform Ackermannization on them --- how that can be done without isUF calls?
Ackermannization should be moved to JavaSMT as well?
BooleanFormulaVisitor does not replace isUF
, but a new FormulaVisitor
with methods for UFs, variables, and quantifiers, should replace it.
BTW why do visitX
methods are public? Shouldn't they be protected?
Yeah, protected would be nice, but then we cannot access them from BooleanFormulaManager, as it is another package.
@kfriedberger yes you're right, at least visitDefault
can be protected though.
@PhilippWendler
Probably this code should even be moved to JavaSMT.
It's not feasible to add as a tactic every possible transformation we can think of. And users will think of many others.
splitEqualityIfPossible
.
But on the other hand the list of all possible operations is huge and not even well-defined.It's not feasible to add as a tactic every possible transformation we can think of. And users will think of many others.
I agree, though I would think that not this tactic but this getUnsatCore
method will be helpful and generally useful.
For visitors which are allowed to go inside boolean atoms, would we supply the client with any way to know what operation this is? This would be required to implement things like splitEqualityIfPossible. But on the other hand the list of all possible operations is huge and not even well-defined.
That's the big open question.
Probably the list of operators is well defined (by SMTLIB), but still too large to be useful. One option would be to have a visitOperator(String name, List<Formula> operands)
method (with name being the SMTLIB-compliant name).
I think it would be good if we can distinguish between operators and UFs, but syntactically they look the same and I do not know whether every solver supports distinguishing between them without hard-coding a list of all operators.
Thus my current preference for this visitor interface looks like this:
public interface FormulaVisitor<R> {
R visitFreeVariable(String name, FormulaType type);
R visitBoundVariable(String name, FormulaType type);
R visitLiteral(??? literal, FormulaType type);
R visitUF(String name, List<Formula> args, FormulaType type);
R visitOperator(String name, List<Formula> args, FormulaType type);
R visitForallQuantifier(List<Formula> variables, BooleanFormula body);
R visitExistsQuantifier(List<Formula> variables, BooleanFormula body);
}
This should allow to do everything the current API allows, right?
Maybe there is a valid reason to add <T extends Formula> visitIfThenElse(BooleanFormula cond, T thenBranch, T elseBranch);
.
For recreating the same formula while traversing one would have to add a corresponding "makeOperator(String name, ...)" method though.
Suggestions for a better interface are welcome.
Probably the list of operators is well defined (by SMT-LIB)
I think the problem is that solvers have their own extensions. Check out http://smtlib.cs.uiowa.edu/theories.shtml. It doesn't have strings or datastructures, but Z3 has both of them (though we don't support those). The question is whether Java-SMT should support only a subset of SMT-LIB, but e.g. we support interpolation, which isn't actually defined by SMT-LIB.
I think it would be good if we can distinguish between operators and UFs
Should we? For practical purposes, what is the difference between an operator you don't know and a UF? Applying an operator creates a stronger formula, but it has all the properties UF has. And I think the name is "function application" and not an "operator".
visitLiteral
Do you mean things like numerals, eg "42"?
visitOperator
I would prefer visitFunction
visitBoundVariable
Maybe should provide a pointer to the binding quantifier? Or some way to get that access?
visitOperator(String name, List
args, FormulaType type)
How do we make sure names are consistent across different solvers?
I think typecasting between different theories can become a pain point (all the integer-to-rational
functions).
visitFreeVariable
On one hand this shortcut makes sense, but on the other hand a variable is just a function with no arguments.
Yes, with literal I mean what you call numeral. Not sure what is more common, in programming context I think it is literal. MathSAT just calls this "number".
visitFreeVariable could maybe be useful, I guess many implementors will do something different than for UFs. The default visitor could map this call to visitUF. It is also nice for symmetry to visitBoundVariable.
For visitOperator name consistency, we cannot do better than SMTLib I guess. This is essentially a replacement for UnsafeFormulaManager.getName()
, which suffers from the same problem. I hope people will not use this method and actually check the given name, but use it for more generic things like re-creating the same operator inside a FormulaTransformationVisitor. Or should we add explicit support for this use case instead?
For UFs, I think the distinction between operators and UFs can be quite convenient, we have a lot of callers to isUF
I think. For example visitors that apply a renaming of UFs for example should never change operators, thus it can be useful to separate them.
Yes, function application is the correct term, but if we separate UFs and operators, we cannot use it.
but use it for more generic things like re-creating the same operator inside a FormulaTransformationVisitor
How can one re-create an operator given only a String (its name)?
For now I would just do an implementation which does the same as UnsafeFormulaManager#getName
as we really can't do much better without investing too much time...
Yes, function application is the correct term, but if we separate UFs and operators, we cannot use it.
That's a bit pedantic, I think if there is a method visitUF
it's obvious that "function" means "interpreted function".
For a literal we can use a String
serialization in place of ????
.
It is always fairly cheap to serialize bitvector/rational/floating point/integer to and from strings.
Actually I've meant for the numeral :) I'm very much -1 on using the term "literal" as in logic context it has a well defined meaning: an atom or its negation.
How can one re-create an operator given only a String (its name)?
I mentioned that this probably needs to be added. I do not like it but I see no other way (we have it currently in UnsafeFormulaManager
).
For now I would just do an implementation which does the same as UnsafeFormulaManager#getName as we really can't do much better without investing too much time...
How would you implement a visitor that for example renames all variables then? There needs to be a way to re-create the same higher-level AST node with changed arguments.
For a literal we can use a String serialization in place of ????.
String can be very inconvenient if you actually want to do something with this number. Types like bitvectors and floating point have complex number formats that cannot be trivially converted into something usable and users would need to write their own parsing for this.
How would you implement a visitor that for example renames all variables then?
We can pass an extra argument Function<List<Formula>, Formula>
which would create a function application from the provided arguments.
How would you implement a visitor that for example renames all variables then?
We can pass an extra argument
Function<List<Formula>, Formula>
which would create a function application from the provided arguments.
Nice idea.
and users would need to write their own parsing for this
Not if we provide the corresponding classes with a convenient of(String s)
method.
Currently we can construct a floating point number from Rational/double/BigDecimal classes.
I don't think that any of those have 1-1 mapping with SMT floats (floats represented in a given theory may be very different from the double
in Java), and a custom class may be needed.
I think the same applies to bitvectors.
Coming back to the parent's comment:
R visitExistsQuantifier(List
variables, BooleanFormula body);
Why are these methods here? Shouldn't they be in BooleanFormulaVisitor
and shouldn't a FormulaVisitor
inherit from it? Otherwise, where are all the other methods for exploring the boolean structure of the formula?
Nice idea.
That might make writing constructors even more verbose than they are currently are.
Other approach would be providing constructors on request from function names through eg FormulaManager
something like
public Function<List<Formula>, Formula> getConstructorForFunction(String functionName) {};
Thus my version currently is:
public abstract class FormulaVisitor<R> extends BooleanFormulaVisitor<R> {
abstract R visitFreeVariable(String name, FormulaType type);
abstract R visitBoundVariable(String name, FormulaType type, Formula boundingAST);
abstract R visitNumeral(String numeral, FormulaType type);
abstract R visitUF(String functionName, List<Formula> args, FormulaType type);
abstract R visitFunctionApplication(
String functionName,
List<Formula> args,
FormulaType type,
Function<List<Formula>, Formula> newApplicationConstructor);
}
with constructors optionally moved to FormulaManager
.
I do not understand what you mean by "constructor".
A FormulaManager.getConstructorForFunction
method would be the same as a method that allows to create any function application by giving a string.
I did not intend to have the two visitors extend from each other. One allows to traverse the boolean part, with semantics of function applications, and the other allows to traverse all formulas, but without any semantics of function applications. This matches the use cases we have so far.
visitBoundVariable
Maybe should provide a pointer to the binding quantifier? Or some way to get that access?
I do not know, but this could be quite difficult to support for some solvers.
I did not intend to have the two visitors extend from each other
If the visitor is only meant to be applied on non-boolean ASTs then it should not see any quantifiers.
But if we want it to apply it on any kind of ASTs I don't see any reason not to extend it from the boolean visitor. The fact that you had to duplicate the functions from it suggests that we should.
It is certainly intended to be applicable to all kinds of ASTs. I do not see any reason for inheritance. Why should the boolean functions be handled by different methods than the non-boolean one? Either I want to iterate syntactically over the AST without caring for what functions I see, then I do not need special methods, or I care about specific operators, then I cannot traverse non-boolean formulas. Quantifiers are different because they are syntactically different (they would not be handled by visitFunctionApplication
).
Implementations of the visitor would be considerably easier as they have a lot less methods.
The existence of visitAtom
in BooleanFormulaVisitor
would be a problem if FormulaVisitor
inherits from BooleanFormulaVisitor
. What would its semantics be? When would it be called?
OK point taken on inheritance.
Where from should this visitor be invoked? FormulaManager itself? I suppose not the UnsafeFormulaManager since that's what we are trying to hide.
Then the interface becomes
public abstract class FormulaVisitor<R> {
abstract R visitFreeVariable(String name, FormulaType type);
abstract R visitBoundVariable(String name, FormulaType type, Formula boundingAST);
abstract R visitNumeral(String numeral, FormulaType type);
abstract R visitUF(String functionName, List<Formula> args, FormulaType type);
abstract R visitFunctionApplication(
String functionName,
List<Formula> args,
FormulaType type,
Function<List<Formula>, Formula> newApplicationConstructor);
abstract R visitForallQuantifier(List<Formula> variables, BooleanFormula body);
abstract R visitExistsQuantifier(List<Formula> variables, BooleanFormula body);
}
The place of the visit method depends on #13 I guess.
The visitUF
method should maybe get an UninterpretedFunctionDeclaration
instance (with name field added to that class?)?
I've hacked together a quick call to visitor from Z3 to get a feel for what the API would look like. I have noticed we are missing the renaming functionality, which probably can be integrated into the constructor.
The code looks like this (inside Z3UnsafeFormulaManager
):
public FormulaType getFormulaType(Long f) {
// TODO
// TODO: do we support the parametrization of FormulaType?
return null;
}
public <R> R visit(FormulaVisitor<R> visitor, final Long f) {
String name = getName(f);
final FormulaType type = getFormulaType(f);
List<Formula> args = new ArrayList<>();
for (int i=0; i<getArity(f); i++) {
// TODO: arguments might have a different typ
long arg = getArg(f, i);
FormulaType argumentType = getFormulaType(arg);
args.add(getFormulaCreator().encapsulate(argumentType, arg));
}
if (isFreeVariable(f)) {
return visitor.visitFreeVariable(name, type);
} else if (isBoundVariable(f)) {
// TODO: is it feasible to get the bounding AST?
return visitor.visitBoundVariable(name, type, null);
} else if (isNumber(f)) {
return visitor.visitNumeral(ast_to_string(z3context, f), type);
} else if (isUF(f)) {
return visitor.visitUF(name, args, type);
} else if (get_ast_kind(z3context, f) == Z3_QUANTIFIER_AST) {
// TODO: get bound arguments.
if (is_quantifier_forall(z3context, f)) {
return visitor.visitForAllQuantifier(args,
getFormulaCreator().encapsulateBoolean(get_quantifier_body(z3context, f))
);
} else {
return visitor.visitExistsQuantifier(args,
getFormulaCreator().encapsulateBoolean(get_quantifier_body(z3context, f))
);
}
} else {
// Any function application.
Function<List<Formula>, Formula> constructor = new Function<List<Formula>, Formula>() {
@Override
public Formula apply(List<Formula> formulas) {
return replaceArgs(getFormulaCreator().encapsulate(type, f),
formulas);
}
};
return visitor.visitFunctionApplication(
name, args, type, constructor);
}
}
The visitUF method should maybe get an UninterpretedFunctionDeclaration instance
I don't know, TBH I have never used this class... I guess yes if it's easy to do so.
Regarding the formula type: we don't have facilities to provide the formula type for all arguments of the function, without making the API too hard to use.
Maybe we shouldn't provide FormulaType
at all and instead provide a function Formula
-> FormulaType
?
Regarding the formula type: we don't have facilities to provide the formula type for all arguments of the function, without making the API too hard to use.
I do not understand? What is "the function"? What exactly is hard to use?
Maybe we shouldn't provide FormulaType at all and instead provide a function Formula -> FormulaType?
We have that function already, but inside the visitor one does not have the current formula instance.
We have that function already, but inside the visitor one does not have the current formula instance.
Right!
Okay, here is an updated code:
public <R> R visit(FormulaVisitor<R> visitor, final Long f) {
String name = getName(f);
switch (get_ast_kind(z3context, f)) {
case Z3_NUMERAL_AST:
return visitor.visitNumeral(
ast_to_string(z3context, f),
getFormulaCreator().getFormulaType(f)
);
case Z3_APP_AST:
final FormulaType type = getFormulaCreator().getFormulaType(f);
List<Formula> args = new ArrayList<>();
for (int i=0; i<getArity(f); i++) {
long arg = getArg(f, i);
FormulaType argumentType = getFormulaCreator().getFormulaType(arg);
args.add(getFormulaCreator().encapsulate(argumentType, arg));
}
if (isUF(f)) {
// Special casing for UFs.
return visitor.visitUF(name, args, type);
} else {
// Any function application.
Function<List<Formula>, Formula> constructor =
new Function<List<Formula>, Formula>() {
@Override
public Formula apply(List<Formula> formulas) {
return replaceArgs(getFormulaCreator().encapsulate(type, f),
formulas);
}
};
return visitor.visitFunctionApplication(
name, args, type, constructor);
}
case Z3_VAR_AST:
return visitor.visitFreeVariable(name,
getFormulaCreator().getFormulaType(f));
case Z3_QUANTIFIER_AST:
BooleanFormula body = getFormulaCreator().encapsulateBoolean(
get_quantifier_body(z3context, f));
List<Formula> qargs = new ArrayList<>();
for (int i=0; i<get_quantifier_num_bound(z3context, f); i++) {
long arg = get_quantifier_pattern_ast(z3context, f, i);
FormulaType argumentType = getFormulaCreator().getFormulaType(arg);
qargs.add(getFormulaCreator().encapsulate(argumentType, arg));
}
if (is_quantifier_forall(z3context, f)) {
return visitor.visitForAllQuantifier(qargs, body);
} else {
return visitor.visitExistsQuantifier(qargs, body);
}
case Z3_SORT_AST:
case Z3_FUNC_DECL_AST:
case Z3_UNKNOWN_AST:
default:
throw new UnsupportedOperationException("Input should be a formula AST, " +
"got unexpected type instead");
}
}
I think I'll push it after I'll add some tests.
Oh right, the renaming point:
replaceArgsAndName
)?For a bound variable there definitely should be a way to identify a quantifier AST, otherwise it would not be possible to e.g. rename bound variables, as the quantifier AST which binds them is also affected by the rename.
What about theory for arrays? Strings? Custom datatypes? Should we delegate all unknown cases to visitFunctionApplication
?
Also in the spirit of cutting the boilerplate, I would make the method names shorter.
visitExistsQuantifier
-> visitExists
, visitForallQuantifier
-> visitForAll
. I think that "exists" and "for all" are pretty obvious in the logic context and we don't need to additionally specify what those are.visitFunctionApplication
-> visitFunction
. Again I think that is obvious that we are visiting the application, and also more consistent with visitUF
(otherwise why one is application and one isn't?).And yet another issue:
If we talk about FormulaTransformationVisitor
how to handle classes which would return a different output for a quantified variable? They should be either untouched or renamed altogether.
The following cases should be handled in the visitors
I would actually close the feature set on boolean visitors now. We don't need to handle every possible case. After all, SMT atoms are not really atomic, and all kinds of "logical" things can happen inside, and if we miss a few operators it is not a problem. Thus I'm against supporting XOR, distinct and boolean variables as special cases, as it leads to even more boilerplate.
These exist already in CPAchecker and should be moved to JavaSMT.