Open vprevosto opened 3 years ago
I have the feeling that another program point is important, which is before lval = retval
and after closing the block that allocated the formals. I have the feeling that the real choice for Post is before affecting the target lval
which is actually unknown when proving the function in a modular setting.
What distinguish this point from Post2
is the allocation of formals, although \result
still have an address.
// Pre1
{
rettype retvar;
// Pre2
{
prms = args;
// Pre3
{
body;
retvar = r;
// Post1
}
// Post2
}
// Post2bis
lval = retvar;
// Post 3
}
Pre3
: formals are allocated and initialized and result is allocated but not initializedPost2
: formals and result are allocated (and initialized)Post2bis
: formals are not allocated and result is allocatedPost 3
: returned value is stored in caller's contextStrictly in terms of what is allocated, Post2bis
and Post3
are equivalent (they're in the exact same block). But indeed it makes more sense to put the label right after closing the inner block.
that said, I think there's a typo in the code with respect to the position of the Post*
labels: in the current setting, in both Post1
and Post2
everything is allocated, while in Post3
retvar
is still allocated. The original code template should read:
// Pre1
{
rettype retvar;
// Pre2
{
prms = args;
// Pre3
{
body;
retvar = r;
}
// Post1
}
// Post2
lval = retvar;
}
// Post3
Nice !
However, I'm afraid that, even if you place Post at Post3
(in either snippets) instead of Post2
or Post2bis
, you shall not replace \result
with lval
in the post-conditions : what if lval
is aliased in the post ?
//@ ensures \result == \old(*p) && *p == 0; assigns *p;
int f(int *p) { int v = *p ; *p = 0; return v; }
}
void main(void) {
int x = 1;
A: x = f(&x);
//@ check Subst: x == \at(*&x,A) && *&x == 0;
}
And more generally, if you put Place at Post3
, I don't know how you can prouve f
in a modular way.
Indeed, we can't do this substitution. On the other hand, this is strictly a problem of the caller. Hence, I do not understand the issue with f
. With main
, on the other hand, we indeed have a problem. But I'm afraid it is in fact deeply ingrained in the C standard, and more precisely 6.5.2.2§10:
There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function
There's no sequence point in main
between A
and the ;
after the call to f
. It could thus be argued that a compiler smart enough to detect that f
will return the value contained in x
could choose to make the assignment in main
before the call to f
, potentially optimize it away since it would amount to x=x
, and only after that perform the side effects of f
itself, putting 0
into x
.
Yes indeed, there is actually an undefined sequence hidden behind the scene with a temporary local variable to hold the result. Something like the following pseudo-code (assuming Post is in Post2):
int tmp ;
__undefined_sequence__ {
tmp = f(prms);
//@ assert Ensures( args <- prms , \result <- tmp );
lval = tmp;
}
Regarding the issue with f
, I just mean that if Post
is placed in Post3
, when proving the ensures
of f
you can not figure out what \result
would be unless you precisely know what lval
is, hence you can not perform modular proofs.
Actually it's not "undefined" (in the dreaded "undefined behavior" :fearful: sense), but "indeterminately sequenced", meaning that after the ;
we can only end up in one of the two situations, either x == \at(x,A)
or x==0
.
But this is still an issue of course 😁
In 3 of your previous code schemes, there is the assigment prms = args;
. But, in order to remove any ambiguity, where do you place the declaration prmstype prms;
?
at the same place (it was meant to represent an initialization rather than a simple assignment).
A short example illustrating most of the problems to be discussed :
/*@
requires InValid: \valid(&p);
requires InInit: \initialized(&p);
ensures OutValid: \valid(&p);
ensures OutInit: \initialized(&p);
ensures RetValid: \valid(\result);
ensures RetInit: \initialized(\result);
*/
struct S *callee(struct S p) { return &p; }
void caller(struct S p) {
struct S * r ;
r = callee(p);
//@ assert WRONG: \valid(r) && \initialized(r);
}
If we agree on Post
meaning Post2
(in the sense of this comment), OutValid
and RetValid
do not hold (I won't commit to the status of OutInit
and RetInit
, as the initialization status of an invalid location is dubious at best).
@claudemarche for information
I'm not a great fan of that choice because the ensuresclause wouldn't give an external view of the specified function. How can you specify
free` function ?
Did I miss something?
I guess the reason why @vprevosto notifies me this thread is the oral discussion we had last Wednesday about the ACSL chapter of the FC book, where the understanding of the meaning of ACSL specifications is a key. I'm planning to give a semi-formal explanation of the intended meaning of the validity of a contract (in terms of a blocking semantics a la Paolo Herms), that should at least say some non trivial and important things about the parameters of a function. In essence, one should have in mind that the parameters have a meaning in the pre-state of a call, but not in the post-state. This choice seems good to me, and is consistent that even if such a parameter is mutated in the function's body, this mutation is not visible to the caller, and the "final" value is inaccessible to it. It also explains well why the kernel adds \old around all parameters when they are mentioned in post-conditions.
Your example adds some difficulty due to the dreadful address operator (the general advice being: use it only when you are sure you know what you are doing...). Taking the address of a parameter means that the allocation and deallocation of the memory cells where the parameter values are stored become "visible" in the ACSL specifications... Too bad for the good old principle of abstraction, sigh... In that case it seems to me that we should say that the pre is supposed to hold after allocation of the parameters, and the post is supposed to hold after their deallocation. I'm not sure your presentation as a ``program transformation'' is really appropriate but let me try:
rettype retvar; // allocates retvar
... prms; // allocates parameters
prms = args; // initialize parameters
// pre holds here I think
body;
lval = retvar;
// deallocates retvar and prms
// post holds here I think
so in an example like
//@ ensures \valid(\result);
struct S *callee(struct S p) {
struct S *tmp = &p;
//@ assert \valid(tmp);
return tmp;
}
the assert is valid but not the ensures.
Hope it is quite clear and that it may helps...
Thank @claudemarche, I now see what I missed; the proposed choice is Indeed a good one.
ACSL manual states that (section 2.3.1)
However, this leaves out some important questions, notably whether we can speak about the address of such formal parameters. Basically, given a function
rettype f(prms) { body; return r; }
, a calllval = f(args)
can be decomposed as such (we assume that the body off
is normalized and there's a singlereturn
statement)EDIT as mentioned in a comment below, the initial
// Post
labels were incorrectly placed.In this context, there are three places where the
Pre
(as in\at(.,Pre)
) state off
could be located.Pre2
is not really useful: it only give access to the location where the returned value will be temporarily placed, and by definitionretvar
is uninitialized until thePost
state. EquatingPre
withPre1
implies that we cannot take the address of the formals in the pre state, and notably that it is not possible to state something like\initialized(&prm)
. Basically, therequires
off
would be translated atPre1
asassert R{prms <- args}
where{prms <- args}
denotes the substitution ofprms
byargs
. EquatingPre
withPre3
implies that we have addresses attached to the formals.Similarly, equating
Post
withPost1
implies that we can speak about the addresses of the formals, and ofretvar
. In particular, ifp
holds the address of a formal,ensures \valid(p);
will be true. This is not the case if we equatePost
withPost2
orPost3
, but in the former case it might be possible to refer to the adress of\result
(so that it would be possible to stateensures \initialized(&\result)
, e.g. if we return astruct
and want to state explicitly that each field has been initialized. Again, theensures
off
would be translated asassert E{prms<-args, \result<- retvar}
(or\result <- lval
forPost3
).At a quick glance, it would seem that equating
Pre
withPre3
andPost
withPost2
orPost3
would be preferable, but there may be some other subtleties than\valid
and\initialized
lying around.