Closed claudemarche closed 2 years ago
I spent some time to think about all of this, and I am afraid that it may not be as simple as we expected. In particular:
The vast majority of examples in circulation would remain correct
I am not sure of this anymore, because of the assigns
clause. For example:
struct X { int x; };
/*@ requires \valid(x);
@ assigns *x ; */
void function(struct X* x);
int main(void){
struct X x = { 0 };
function(&x);
//@ assert A: \valid(&x);
}
Here, everyone expects &x
to remain valid after calling function
, thus to prove A
. But with the new semantics, it might not be the case because *x
is assigned (and since it is a structure it is possible to "uninitialize" it). And the reason is that \valid(p)
now talks about both p
and the initialization status of the pointed value (thus a property of *p
). Thus an assigns
on *p
might change a property related to p
and I do not know if I like that.
Thus for a quite a lot of existing contracts, we should now specify in post-condition that some pointers remain \valid
after the execution of the function.
(for now let us say that reading an uninitialized value is undefined behavior)
The main problem is that in the majority of examples that exist so far, e.g. in the ACSL by example document, Allan Blanchard tutorial, or the gallery of verified programs of Toccata, the \initialized predicate is not used, and \valid is used as it was meaning \valid(p) && \initialized. In a strict interpretation of \valid all these examples are in fact incorrect, even if they were proved correct by Jessie or by WP.
In fact, it may not be a problem as long as analysis hypotheses are clear. For me it tends to say that we can treat functional correctness and absence of runtime errors separately. Let us see for example the swap
function:
/*@ ensures *a == \old(*b) && *b == \old(*a); */
void swap(int* a, int* b){
int tmp = *a ;
*a = *b ;
*b = tmp ;
}
For me, without proving absence of runtime errors, this ensures
must be verified correct by a deductive verification tool. But for now the absence of runtime errors has not been verified. The property is true with the hypothesis that the program does not produce any runtime error. Thus, here, we assume that:
If one now needs to be sure that the program does the right job according to C, we have to add the verification of absence of runtime errors. Thus, with the following requires
:
/*@ requires \valid(a) && \valid(b) ;
We know that the program is verified, still assuming that:
Thus current examples aren't incorrect, they are verified assuming the hypothesis.
Quite a lot of predicates are in fact just there to explicit hypotheses that were made in the past. This is somewhat the case of object_pointer
. That could be the case for a potential \aligned_access
or \right_object_type
(strict aliasing), etc. Adding these predicates would not make examples incorrect: their existence just highlight existing hypotheses.
Because in fact the example assumes that:
Thus for me the question is more on how we should consider unspecified behaviors/values. And I tend to say that a C value in ACSL annotations is equivalent to its actual value in the program if this value is not an indeterminate value. And I believe that it is indeed ab assumption that all analyzers have.
Concerning you compatibility point: I believe my proposal should be refined: my naive interpretation of \valid_read(p)
when p
is of type struct S *
is that it is permitted to read and that all fields are initialized. So to be precise, my proposal should not be that \valid_read(p)
means \read_permitted(p) && \initialized(p)
but \read_permitted(p) && \initialized(&(p->f1)) && ... && \initialized(&(p->fk))
enumerating all fields of S
.
But that's a minor technical point.
Concerning the other point: so your proposal means to me that the conformity of a code w.r.t. an ACSL specification should not include the safety of its executions. In other words, I should write in the ACSL chapter that a function conforms to its contract when for any program state S1 satisfying the pre-condition, for any safe execution of its body starting from state S1, and that terminates, the resulting state S2 satisfy the post-condition.
That's quite a weak definition to me, but that's a meaningful choice to make.
One funny consequence though, is that we don't need to put anymore the \valid_read
or \valid
preconditions we usually state. Disturbing at least. That would need to clearly say proving conformity of the code to a contract and proving safety of execution are two clearly distinguished objectives.
I'll need to think more about it, but it makes sense to me.
Concerning you compatibility point: I believe my proposal should be refined: my naive interpretation of \valid_read(p) when p is of type struct S * is that it is permitted to read and that all fields are initialized. So to be precise [...]
I am not sure that I understand. Here:
struct X { int field ; };
/*@ requires \valid(x);
@ assigns *x ; */
void foo(struct X* x){
struct X tmp[1] ;
*x = tmp ;
}
There is no RTE, still *x
is not "initialized" anymore (nor x->field
that now holds an indeterminate value). Thus with the proposed semantics, x
would not be valid (nor valid read) anymore.
In other words, I should write in the ACSL chapter that a function conforms to its contract when for any program state S1 satisfying the pre-condition, for any safe execution of its body starting from state S1, and that terminates, the resulting state S2 satisfy the post-condition.
That's quite a weak definition to me, but that's a meaningful choice to make.
It is the definition of the Weakest Liberal Precondition and I don't see a better choice.
But for the terminates
clause, I can't accept the following result: the C function complies with the terminates
clause of its contract if the execution of that C function terminates.
I would like to prove a save termination.
There is no RTE, still
*x
is not "initialized" anymore (norx->field
that now holds an indeterminate value). Thus with the proposed semantics,x
would not be valid (nor valid read) anymore.
OK, I missed your point. only \write_permitted would be preserved, not \valid.
It is the definition of the Weakest Liberal Precondition and I don't see a better choice.
It is indeed the WLP. But there is a better choice to me, which is the blocking semantics. WLP is an old defintion that is not well-suited for defining the validity of code assertions. For example in a code like
int f() {
... some code ...
//@ assert P;
while (1) {};
}
the WLP is unable to define what would it mean for the assertion to be valid.
So for me a better definition is : in a state where the precondition is valid, the body executes without blocking (which implies that any code annotation met is valid) and if it terminates the post is valid (which is indeed also covered by "non blocking")
The subtility that I didn't want to consider is the the distinction between blocking on a code annotation and blocking on a C statement (like division by zero of non-permitted memory access). I believe I'll need to distinguish those in my part of the ACSL chapter.
Summary : I understand that my proposal goes to far since it would require, for a code to be conforming with its annotations, to be free of run-time errors and undefined behaviors. Although this would be an ideal situation, and is usually what one proves when proving code in Java, Ada, Rust, etc., it is too much to ask for C code.
I still believe that the names \valid
or \valid_reads
are poor choice names, something like \write_permitted
and \read_pernitted
would have been better, but difficult to change.
I'll close this ticket.
In short: I propose that
\valid(p)
and\valid_read(p)
include in particular\initialized(p)
.This issue is related to issues #74 and #81.
In much more details: historically, the predicate
\valid(p)
is inherited from Caduceus. Its informal meaning, or informal intention, was to be the natural pre-condition for accessing to the value pointed byp
, either by reading (*p
) or writing (*p = ...
). It was included as is in ACSL from the beginning. Later on, the variant\valid_read(p)
was introduced, with the intention to be the natural pre-condition for access but not allowing writing, which is the typical situation when the pointer points toconst
data. Much later on, the predicate\initialized
was introduced because of a need to distinguish between data that is allocated but not yet initialized, and initialized data. reading non-initialized data is not forbidden by the C norm, but the result is unspecified, and in particular there is no guarantee that reading twice the same non-initialized memory location will give the same value.The main problem is that in the majority of examples that exist so far, e.g. in the ACSL by example document, Allan Blanchard tutorial, or the gallery of verified programs of Toccata, the
\initialized
predicate is not used, and\valid
is used as it was meaning\valid(p) && \initialized
. In a strict interpretation of\valid
all these examples are in fact incorrect, even if they were proved correct by Jessie or by WP.This annoying situation is hot nowadays, not only because of the issues #74 and #81, but also because of the writing of the ACSL chapter of the Frama-C book, in which a clear meaning for
\valid
is expected.The proposal is then:
to rename the current predicate
\valid
and\valid_read
into new names such as\write_permitted
and\read_permitted
.to give to
\valid(p)
the new meaning\write_permitted(p) && \initialized(p)
and tovalid_read(p)
the meaning\read_permitted(p) && \initialized(p)
.Pros
Cons
\valid
into\write_permitted
, and\valid_read
intoread_permitted
. I assumed it should be only a renaming, feel free to correct me.\write_permitted
, whereas for memory access it should probably remain\valid_read
.\initialized
could be removed because now subsumed by a\valid
.This was discusses longly this morning with Virgile and Allan, I probably forgot other arguments discussed, let's what comes in the upcoming discussion. Any comments welcome! @all