Closed alexandernutz closed 5 years ago
A related issue is that we do not support initialization of variable length arrays with our current method. This should be dealt with by all three proposed approaches.
If no one intervenes I will start by introducing a special function for array initialization. The first implementation of that function will be approach 1 (with while loops).
We discussed that switching between the three approaches should be easy. Therefore I am all for implementing the one that is easiest right now.
Update after discussion with @Heizmann :
I am currently looking into this ticket. I think I know a way but it requires some changes to memory model and solver infrastructure that I want to discuss it here.
Situation:
int a[3];
we will generate Boogie code along these lines for this:
#a := malloc(3 * sizeofInt);
writeInt(0, {#a!base, 0 * sizeofInt});
writeInt(0, {#a!base, 1 * sizeofInt});
writeInt(0, {#a!base, 2 * sizeofInt});
I would like to generate the following Boogie code instead:
#a := malloc(3 * sizeofInt);
#memory_int := #memory_int[#a!base := const-Array(0)]
(i.e. inside the two-dimensional array #memory_int
, I would like to assign a constant array to position #a!base
)
What needs to be done:
const-Array
This was discussed in #329, so conceptually I know which Boogie code to insert.
However currently there is only a hack. (See SolverBuilder#ENABLE_Z3_CONSTANT_ARRAYS
)
I will need to do changes to the C translation (probably during declaration of memory model infrastructure) and in SolverBuilder
.
I have rough ideas how to do this, but comments and caveats are welcome.#memory_int
, namely
var #memory_int : [$Pointer$]int;
is not compatible with the assignment as written above.
The effective type of #memory_int
is now [int, int] int
, but for the assignment to work, we need [int][int] int
.
Any good ideas how this could be tackled? Or am I missing something?The effective type of
#memory_int
is now[int, int] int
, but for the assignment to work, we need[int][int] int
.
In my opinion, this is the main problem for the constant array optimization.
I agree.
Also, note that it gets even messier for #memory_Pointer
, which is of type [Pointer] Pointer
at the moment. I'm not even sure what the solution is for that. Two arrays, one for offset, one for base?
In any case, one difficulty are lines like
requires value[ptr] == 0;
This would have to be replaced by
requires value[ptr!base][ptr!offset] == 0;
--> perhaps an alternative would be to have the BoogiePreprocessor do that translation.
I think there is a misunderstanding.
I do NOT propose to replace [int, int] int
by [int][int] int
. I just wanted to agree that the type [int, int] int
prevents us from implementing a very simple solution.
I would propose the following.
I do not understand why there is a hack that already enables this functionality with z3 but no solution for the incompatible signatures. How does the hack work?
Ok, here is my proposal after some offline discussions with @danieldietsch and @jhoenicke : I am not totally sure, which of @Heizmann 's letters it corresponds to, if any. Perhaps B?
Solution for Problem 1:
SMTPrelude
. SolverBuilder.buildAndInitializeSolver
takes an SMTPrelude
object as an optional argument.SMTPrelude
contains SMT commands of the type declare-fun
and define-fun
SolverBuilder.buildAndInitializeSolver
enters the commands from SMTPrelude
in the newly created script (as the hack did before)Solution for Problem 2:
mem_int : [Pointer] int
do the following,
function { :builtin initializeToZero_mem_int} (array : [Pointer] int, baseAddress : int) returns (res : [Pointer] : int);
(define-fun initializeToZero_mem_int
((array (Array Int (Array Int Int))) (baseAddress Int)) (Array Int (Array Int Int))
(store array baseAddress ((as const (Array Int Int)) 0)))
Overall this seems to be the simplest solution to me. One benefit is that the current builtin directive should suffice and there have to be no Boogie infrastructure or the memory model of the translation.
You can easily pass the prelude object via the toolchain storage. I would try and ensure that there is only one prelude object for a toolchain, so that if other plugins also want to add something, they add it to this object.
One example for a toolchain storage is CodeBlockFactory
. It
IStorable
interfacegetFactory(IToolchainStorage)
and storeFactory(IToolchainStorage)
You can just replicate this pattern in the SMTPrelude
object.
If you want to allow multiple SMTPrelude
instances, I would instead build an SMTPreludeStorage
object that can hold these, and make the SMTPreludeStorage
an IStorable
For the record: there is a certain ugly aspect of the current solution:
Say, we want to initialize an array of pointers. In Boogie, we will use this function:
function { :builtin "initToZero_Pointer" } init(a : [$Pointer$] $Pointer$, p : int)
returns (res : [$Pointer$] $Pointer$);
However, the BoogiePreprocessor will convert this function into the following two functions (because the return value is a struct):
function { :builtin "initToZero_Pointer" } init.base(a.base : [int,int]int, a.offset : [int,int]int, p : int)
returns (res : [int,int]int);
function { :builtin "initToZero_Pointer" } init.offset(a.base : [int,int]int, a.offset : [int,int]int, p : int)
returns (res : [int,int]int);
Now in our case this seems acceptable, because we can use the same smt function for both Boogie functions, but in general this does not seem very nice to me.
If anyone thinks we should use another solution, because of this or something else, let me know.
Hm, I did not remember that. I thought that functions can also have multiple return parameters.
One fix could be to also allow pure procedures without implementation instead of the function. Also with the builtin
attribute.
Then we would possibly get
procedure { :builtin "initToZero_Pointer" } init(a.base : [int,int]int, a.offset : [int,int]int, p : int)
returns (res.base : [int,int]int, res.offset : [int,int]int );
modifies()
Had a discussion with @Heizmann :
SMTPrelude
object that is stored separately from the Boogie program is problematic.function { :smtdefined "(store array baseAddress ((as const (Array Int Int)) 0))" }
initializeToZero_mem_int(array : [Pointer] int, baseAddress : int)
returns (res : [Pointer] : int);
We would have a new attribute called "smtdefined", whose argument is an SMT definition that should be put where the Boogie function is used. Probably a good convention is that the argument names in the Boogie function and the SMT code must match. The SMT code could either be used to add a ``(define-fun ..)" command or perhaps it could be inlined directly when translating Boogie to SMT.
Note that both attributes, builtin
and smtdefined
might need an additional solution for dealing with Boogie functions that have a struct return type and thus are split up by the StructExpander.
Questions? Comments? Also note that, Boogie and the transition from Boogie to SMT is not really my expertise..
I also think having self-contained Boogie programs is important. But I think the problem does not change by much if we move the SMT prelude to the Boogie program.
How do we create the term for :smtdefined
?
toStringDirect()
(do we create a new script only for this or do we move the creation of the auxiliary solver to an earlier point in the toolchain?) How do we recover it from the Boogie code?
TermParseUtils
we currently use in our unit testsWhat is the solution for struct return types?
The only other idea I have is extending Boogie s.t. functions can have multiple return arguments (semantically a tuple)
I can help with the Boogie and Boogie to SMT part.
Regarding self-containedness: The benefit I see is that we have a chance of recovering the semantics when the SMT code is in the Boogie program. In my previous proposal it would have been gone after dumping the program to a text file.
Regarding SMT type checking:
Regarding recovering from Boogie:
TermParseUtils
sounds like a good idea. Note that some matching of arguments probably needs to happen. I suppose the arguments are parsed as constants but might need to be variables, maybe some renaming has to happen.Regarding struct return types: The main problem seems to me that for a general solution we might want to have different SMT functions for each entry of the returned struct. So we would need an attribute that allows us to specify several SMT functions. Something like this:
function { :smtdefined "0" "p.offset" } resetbase(p : $Pointer$)
returns (res : $Pointer$);
(The example is just to get the idea across, I do not have a general solution. It might not be correct Boogie syntax either.)
resetbase
would be the function that takes a pointer and returns a pointer whose base hase been set to 0 while the offset is left as is.
Our usecase looks even somewhat harder because the return type is [Pointer] Pointer
.
If anyone has a proposal or comments, please let me know.
Regarding the struct return type problem, the following would be a more complete proposal:
smtdefined
and builtin
directives take as many Strings as arguments as required by the return type (NamedAttribute takes an array of Strings)Example:
function { :smtdefined "(+ p!base 1)" "p!offset" } incrementBase(p : Pointer)
returns (res : Pointer)
expands to
function { :smtdefined "(+ p!base 1)" } incrementBase.base(Pointer p) returns (res : int)
function { :smtdefined "p!offset" } incrementBase.offset(Pointer p) returns (res : int)
If there is not the right number of arguments in the attribute, the Structexpander should probably crash. Typechecking at creation could check the argument count, too.
No one commented on my "lets use procedures without implementation instead" proposal. Is it a bad idea? Why?
I think that would work too, but I cannot see the problem that this solves. In doubt, Functions seem to be more natural since functions may occur in expressions and in SMT-LIB functions also have only one return value.
Regarding the struct problem. I think the proposed solution seems reasonable. An alternative might be to wrap the two annotations explicitly in an annotation that is a directive for the struct expander. Maybe only by using a different keyword that does explicitly mention the struct expander.
But I also have to say that I cannot overlook the whole situation at the moment. How many functions do we need?
Do we need more functions?
Functions that have use cases in the InitializationHandler:
#memory_int
:
function { :smtdefined "(store in0 in1 ((as const (Array Int Int)) 0))" }
~initToZeroAtPointerBaseAddress~int(in0 : [{ base : int, offset : int }]int, in1 : int)
returns (out : [{ base : int, offset : int }]int);
function { :smtdefined "((as const (Array Int (Array Int Int))) ((as const (Array Int Int)) 0))" }
~const-array-int~int~int() returns (out : [int][int]int);
I just commited changes that have code that generates Boogie with smtdefined
as proposed above. Some cases might not work, yet. All of the new functionality should still be deactivated and will not work until changes in Boogie and Boogie to SMT infrastructure have been made. Changes to the use of attributes should still be easy to incorporate, so the discussion is still open..
Ok, after our talk a short reminder for myself.
New attribute :expand_struct "<struct-field-id>"
:expand_struct "<struct-field-id>"
attribute until the next :expand_struct
attribute or no more attributes are attached to the function that is created for the struct field specified by <struct-field-id>
New attribute :smtdefined "<Term>"
function { :smtdefined "<Term>" } <FunId> (...) returns(...)
<FunId>
that takes parameters with matching sorts as the Boogie function, is of sort matching the boogie type, and has <Term>
as definition. function { :smtdefined "(ite (< x y) y x)" } max(x:int, y:int) returns(out:int)
creates a new SMT function symbol (define-fun max ((x Int) (y Int)) Int (ite (< x y) y x))
expand_struct
, the final name of the SMT function is defined by the initial name and the field names of the return type, as the struct expander will create new functions with name <FunId>(.<FieldId>)+
@alexandernutz the attributes should now work and transfer correctly. Have a look at examples\BoogiePL\structexpander\StructFunctionAttributesTest.bpl
for examples.
We have made some progress. Now, the C translation, the StructExpander, and Boogie2Smt should work correctly (according to a few manual tests).
Problem right now: Quantifier elimination for arrays crashes during verification. This might be a job for @Heizmann :-)..
The error message:
java.lang.AssertionError: var is still there: |v_#memory_int_15| term size 5 at de.uni_freiburg.informatik.ultimate.modelcheckerutils.smt.Elim1Store.elim1(Elim1Store.java:452)
The term which it is trying to eliminate from:
(= |#memory_int| (~initToZeroAtPointerBaseAddress~int |v_#memory_int_15| |main_~#a~0.base|))
The function ~initToZeroAtPointerBaseAddress~int
is defined like this:
(store in0 in1 ((as const (Array Int Int)) 0))
The problem can be reproduced as follows:
examples/CToBoogieTranslation/regression/variableInitialization_array_onedim_fixedlength_01.c
Update: to avoid the quantifiers from trace interpolation, I tried out settings using only SMTInterpol. In this case the assertion that checks inductiveness of interpolants fails. I don not know why, yet.
@alexandernutz Can you dump a script for the SMTInterpol case? Perhaps this is a job for @jhoenicke and/or @tanjaschindler
@alexandernutz Could you add a test to the QuantifierEliminationTest
class?
Proceed like the more recent tests and use TermParseUtils.parseTerm(...)
instead of constructing the formula manually.
You mean TermParseUtils.parseTerm(...)
right?
@Heizmann I added two tests as suggested. They are named constantArrayTest01
and constantArrayTest02
. Not 100% sure if they make sense like this but they should... -- feel free to let me know..
@danieldietsch @jhoenicke @tanjaschindler here you go, the below file produces
(error "generated interpolants did not pass sanity check")
when I run SMTInterpol (current master branch on github) on it
debug.zip
@alexandernutz in this class is a special new method for quantifier elimination that does not make sense to me. You were using that method. I modified that call. https://github.com/ultimate-pa/ultimate/commit/c854ea4e6ba0faf6819f8844583af41d0f757886 Now the first test passes and the second test fails because the quantified variable is the argument of a function application. My partial quantifier elimination does not support that case. (And I am not willing to change that unless the change helps)
I think, if we would replace ~initToZeroAtPointerBaseAddress~int
by the function body, my partial quantifier elimination would eliminate the quantifier.
Question to @alexandernutz and @danieldietsch: Could we inline this (use defined) SMT function immediately?
Currently, we call defineFunction
for a new script instance and define all smtdefined function symbols that way.
You can access all functions defined that way via CfgSmtToolkit
and its SmtSymbols getSmtSymbols()
method.
I do not know how feasible it would be to require an SmtSymbols
instance for the quantifier elimination, but if this is no issue you could substitute all the function applications with their bodies.
Although, I am not sure if we have some kind of SMT function inliner. The normal substitution will probably not work, or at least be somewhat inefficient.
If you think that this would be a good way forward, I could look into it.
If possible, we should inline the function immediately.
We could do it during the construction of the Icfg. But the backtranslation will be difficult. Do you have an idea how we could do it there? Then you could explain it to me and I can implement it.
I see, that is a problem. But I have no idea for the backtranslation. I think we have to look at individual expressions that occur in practise and think about solutions.
I updated SMTInterpol in 5d08986750682c75e39ad1f92180c9582722463d -- this should fix the error you reported @alexandernutz
I think this feature is now implemented and all follow-ups should get their own ticket.
Improve the Boogie code that is generated from an array initialization in the C program (by default or via initializer).
Example: int a[10][10][10] = {1, 2}; will be translated to 1000 boogie statements each one initializing one array cell. 998 of them initialize to 0.
Plan for initializing an array:
I believe we should improve step 1. Several approaches come to mind:
(I did not find a corresponding issue, let me know if this is a duplicate)