rems-project / cerberus

Cerberus C semantics
https://www.cl.cam.ac.uk/~pes20/cerberus/
Other
49 stars 25 forks source link

[CN] Misleading warning on necessary use of `extract` #498

Open septract opened 1 month ago

septract commented 1 month ago

Problem: CN generates warnings implying that extract is redundant, when in fact the extract is necessary for the proof.

Example code (from a discussion with our friends at BAE):

struct QueueStruct {
  int front;
  int back;
  int* queue;
};

struct QueueStruct push_var(int var, struct QueueStruct q)
/*@ 
requires 
  0i32 < q.back; q.back < 100i32; 
  take QueueStart = each (i32 j; 0i32 <= j && j < q.back) {Owned<int>(q.queue + j)}; 
  take QueueEnd   = each (i32 j; q.back <= j && j < 100i32) {Block<int>(q.queue + j)}; 
ensures 
  take QueueStart_ = each (i32 j; 0i32 <= j && j < return.back) {Owned<int>(q.queue + j)};
  take QueueEnd_   = each (i32 j; return.back <= j && j < 100i32) {Block<int>(q.queue + j)};
  q.back < return.back;  
@*/
{
  /*@ extract Block<int>, q.back; @*/
  q.queue[q.back] = var; 
  /*@ extract Owned<int>, q.back; @*/ // <-- Proof fails without this line 
  q.back++; 
  return q; 
}

CN produces this warning:

src/example-archive/simple-examples/inprogress/bae-queue.c:58:7: warning: extract: index added, no resources (yet) extracted.
  /*@ extract Owned<int>, q.back; @*/
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~ 

This is very misleading!

I think the reason this happens is that extract keyword does two things in CN:

  1. It tries to instantiate a resource at a particular offset, in this case at offset q.back
  2. It tries to cast it to the correct type, in this case Owned<int>

If you try to prove this code, the 1st step does not occur, but the 2nd step does. That is, extract takes the Block<int> resource, and casts it to the Owned<int> resource.

Suggested resolution:

CN should correctly log that extract has up-cast a resource, and suppress the misleading warning.

(Longer term, we should automatically infer most or all uses of extract).

dc-mak commented 1 month ago

The way that extract works is that it marks a particular index as valid to move in or out of the iterated resources of the matching type in the context. I thought "movable" would have been clearer a while back but I failed to convince others.

The actual issue with the warning here is a bit silly. Usually we say extract because we want to break an element off an array. But in this case we need to say extract because we need to glue it back on, to a different array & type it came from.

AFAIR, the warning checks for the first case but not the second, and this confusing example should disappear if it also checked for the second. There is no up/down casting of resources going on - it's literally just that "Block" means "may be uninitiatialised, so only write, no reads" and once you write to a block you know you are definitely initialised and so you may read and write "Owned".

We used to infer indexes, and it lead to a complicated and slow inference scheme. I agree we can and should infer more, but it's an area of research. Having used both the old and current, I can say that the predictability and speed of this is much more preferable especially in terms of explaining (I can show you a formalisation of the old inference rules!).

It was also the old scheme of inferring indices which required the first argument of every predicate to be a pointer.

TL;DR

Extract does not move resources, but marks them as movable in or out of iterated resources. Currently the code is "if resource not moved out then warn", but it should be "if neither resource moved out OR moved in then warn".

cp526 commented 1 month ago

The way that extract works is that it marks a particular index as valid to move in or out of the iterated resources of the matching type in the context. I thought "movable" would have been clearer a while back but I failed to convince others.

...

The actual issue with the warning here is a bit silly. Usually we say extract because we want to break an element off an array. But in this case we need to say extract because we need to glue it back on, to a different array & type it came from.

AFAIR, the warning checks for the first case but not the second, and this confusing example should disappear if it also checked for the second. There is no up/down casting of resources going on - it's literally just that "Block" means "may be uninitiatialised, so only write, no reads" and once you write to a block you know you are definitely initialised and so you may read and write "Owned".

We used to infer indexes, and it lead to a complicated and slow inference scheme. I agree we can and should infer more, but it's an area of research. Having used both the old and current, I can say that the predictability and speed of this is much more preferable especially in terms of explaining (I can show you a formalisation of the old inference rules!).

It was also the old scheme of inferring indices which required the first argument of every predicate to be a pointer.

TL;DR

Extract does not move resources, but marks them as movable in or out of iterated resources. Currently the code is "if resource not moved out then warn", but it should be "if neither resource moved out OR moved in then warn".

The CN tutorial has an issue where we've been discussing renaming extract to something more useful ( https://github.com/rems-project/cn-tutorial/issues/51 ). Benjamin suggested focus, which I think we should go with

I recorded that as a new issue also in the Cerberus repository ( https://github.com/rems-project/cerberus/issues/499 ).

Regarding the warning: the warning ocasionally provides useful output, but I'm tempted to remove it altogether, because it doesn't reflect well how extract actually works. One can put extract at the start of a function body to indicate that CN should feel free to extract and re-inject resources at that index any time from now on. So it's not a one-time action (that can easily be warned about) but a hint to CN to consider this index interesting.

The reason for making extract work like this, rather than having extract be a one-time action, is that this can save a lot of annotations when indices are repeatedly extracted and re-injected.

lwli11 commented 1 month ago

Thanks for looking at this. I was also puzzled with the second extract Owned was after the indexing statement.

septract commented 1 month ago

Thanks @dc-mak and @cp526 - this is indeed rather subtle, and I wish we could avoid the need for it in proofs.

Regarding the warning: the warning ocasionally provides useful output, but I'm tempted to remove it altogether, because it doesn't reflect well how extract actually works. One can put extract at the start of a function body to indicate that CN should feel free to extract and re-inject resources at that index any time from now on. So it's not a one-time action (that can easily be warned about) but a hint to CN to consider this index interesting.

I am not sure about this. I often make mistakes when writing extract, eg. passing a value of the wrong type / index. It's useful to know that a particular extract did nothing in the proof, although figuring out why is often fiendishly difficult (to me). So I would really like a better warning message that helps me get this right, rather than just silently doing nothing.

septract commented 1 month ago

It seems to me like often the correct argument to extract is available as a term in the proof context, but it's just not obvious what should be used. So perhaps there is a need for a failure recovery tool that says something like:

Error: Resource not available, but it would have been if you added `extract <whatever>`

This seems better than adding additional proof search to figure out extract on the fly, because it only triggers once the proof has failed.

cp526 commented 1 month ago

@septract One very easy change that would obviate extract in typical examples is to automatically add an index to the list of "interesting" indices whenever it's used in a Core array-shift operation, which is whenever the user writes p+i or p[i] in C. The rest of the machinery could continue to work unchanged.

As I recall, one concern you previously mentioned about this scheme was that, if users are not used to regularly doing extract, not even on easy examples, then figuring out the extract statements in those harder examples where it's not automatable is even harder, which is something worth considering.

I haven't thought much about what the gaps in the automation would be, though.

septract commented 1 month ago

@cp526 Yes, that's a good point too. But now that we are seeing that extract is a significant pain point, perhaps it's worth implementing this as an optional mode?

cp526 commented 1 month ago

I've looked into this briefly. There's a two-line change that would implement what I described, but it creates an awkward gap between Owned+Block (automatically inferred extract in the typical case) and iterated user-defined predicates (no automation).

The right way to do it, I think, is to link it with your earlier suggestion of giving extract the base pointer as an extra argument. Once we do that, extract doesn't need to take in the predicate name as an argument and can work uniformly for Owned and user-defined predicates. That's not super complicated to implement either (but a bit more work than the above), so we should try that.

dc-mak commented 3 weeks ago

This is the issue: https://github.com/rems-project/cerberus/issues/311 Should I close this one?

cp526 commented 3 weeks ago

This is the issue: #311 Should I close this one?

The original issue here is about the warning, not the same as #311, so let's leave this one open.

dc-mak commented 3 weeks ago

Also related: #471, #361, #357, #320, #311, #256