rems-project / cerberus

Cerberus C semantics
https://www.cl.cam.ac.uk/~pes20/cerberus/
Other
48 stars 25 forks source link

[CN] Meta-issue: syntax overhaul #304

Open dc-mak opened 3 months ago

dc-mak commented 3 months ago

A few existing issues have touched upon the current syntax. The solutions for those bring function and predicate a lot closer together.

  1. Remove if-restriction (https://github.com/rems-project/cerberus/issues/266)
  2. Remove predicate pointer first restriction (https://github.com/rems-project/cerberus/issues/303)
  3. Unify namespaces for functions and predicates (https://github.com/rems-project/cerberus/issues/288)
  4. Type annotations for expressions (https://github.com/rems-project/cerberus/issues/305)
  5. Smarter unfolidng for functions #483

Though not strictly dependent on the above, it may be sensible to wait until they are finished for the next tasks (which are closer to nice-to-haves).

  1. Add base type heap and disallow that to be used in arguments or in data types 5a. Add new unified syntax and map to existing backend 5b. Optional but good engineering: refactor backend to match new sytnax

As mentioned in https://github.com/rems-project/cerberus/issues/288, at the end of this we could write things like

i32 add_one(i32 x) {
  let y = x + 1;
  y;
}

heap<i32> intQueueAux(i32 x) {
  return add_one(x);
 }

heap<i32> intQueue0(i32 x) {
  intQueueAux(0);
}

heap<i32> intQueue(i32 x) {
  let y = add_one(x);
  take q = intQueueAux(y);
  return q;
}

Pros:

Cons:

i32 add_one(i32 x) {
  return  x + 1;
}

heap<i32> intQueueAux(i32 x) {
  return heap(add_one(x));
}

heap<i32> intQueue(i32 x) {
  return intQueueAux(0);
}

heap<i32> intQueue(i32 x) {
  let y = add_one(x);
  let q = *intQueueAux(y); // This is redundant and for illustration only; desugars to --
  return heap(q);          // take _genvar = intQueueAux(y); let q = _genvar; return heap(q)
}
bcpierce00 commented 3 months ago

One more:

On Wed, Jun 5, 2024 at 8:01 AM Dhruv Makwana @.***> wrote:

A few existing issues have touched upon the current syntax. The solutions for those bring function and `predicate a lot closer together.

  1. Remove if-restriction (#266 https://urldefense.com/v3/__https://github.com/rems-project/cerberus/issues/266__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4Oo2pOtPdI$ )
  2. Remove predicate pointer first restriction (#303 https://urldefense.com/v3/__https://github.com/rems-project/cerberus/issues/303__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4Oo-UGJ1EW$ )
  3. Unify namespaces for functions and predicates (#288 https://urldefense.com/v3/__https://github.com/rems-project/cerberus/issues/288__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4Oo1tr5Hzx$ )

Though not strictly dependent on the above 3, it may be sensible to wait until they are finished for the next tasks (which are closer to nice-to-haves).

  1. Add base type heap and disallow that to be used in arguments or in data types 5a. Add new unified syntax and map to existing backend 5b. Optional but good engineering: refactor backend to match new sytnax

As mentioned in #288 https://urldefense.com/v3/__https://github.com/rems-project/cerberus/issues/288__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4Oo1tr5Hzx$, at the end of this we could write things like

i32 add_one(i32 x) { let y = x + 1; y; }

heap intQueueAux(i32 x) { return (add_one(x)); }

heap intQueue0(i32 x) { intQueueAux(0); }

heap intQueue(i32 x) { let y = add_one(x); let* q = intQueueAux(y); return y; }

Pros:

  • Concise.
  • Generics are familiar to many (and monads familiar to theorists).
  • Make polymorphism easier to add and explain later.
  • Makes the type of locks easier to add and explain later.

Cons:

heap intQueueAux(i32 x) { return (heap(add_one(x))); }

heap intQueue(i32 x) { return intQueueAux(0); }

heap intQueue(i32 x) { let y = add_one(x); let q = *intQueueAux(y); return heap(q); }

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/rems-project/cerberus/issues/304__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4Oo1dJoef8$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/ABVQQC2V6YY3QNIU2OANNALZF342HAVCNFSM6AAAAABI2R66LOVHI2DSMVQWIX3LMV43ASLTON2WKOZSGMZTKNZRG42TQMI__;!!IBzWLUs!WrxvcyM3niw6bzUcqm6-Ie4GOUQSynOR3z6wU7H4tQ3kLFlYzo0A3OgxEQikoeBN1lwPcMBP3UKDw47Z9u4OoxPdRZbJ$ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

dc-mak commented 3 months ago

Added, thanks

peterohanley commented 1 month ago

Are the type annotations in declaration specs ever not fully determined by the declaration? If not, why have a separate form for the declaration spec at all? Concretely:

//before
int foo(int x, const char *y, struct bar **z);
/*@ spec foo(i32 x, pointer y, pointer z);
    requires ...; // hardcoded in grammar
    ensures ...; // hardcoded in grammar
@*/

//after
int foo(int x, const char *y, struct bar **z);
/*@ spec foo(x, y, z); // added to function_spec_item grammar node
    requires ...; // just another function_spec_item
    ensures ...; // just another function_spec_item
    trusted; // parses for free now
    accesses qux; // also parses for free
@*/

The arguments could also be removed. I like concretely bringing them into CN scope in the same block.

cp526 commented 1 month ago

Are the type annotations in declaration specs ever not fully determined by the declaration? If not, why have a separate form for the declaration spec at all? Concretely:

//before
int foo(int x, const char *y, struct bar **z);
/*@ spec foo(i32 x, pointer y, pointer z);
    requires ...; // hardcoded in grammar
    ensures ...; // hardcoded in grammar
@*/

//after
int foo(int x, const char *y, struct bar **z);
/*@ spec foo(x, y, z); // added to function_spec_item grammar node
    requires ...; // just another function_spec_item
    ensures ...; // just another function_spec_item
    trusted; // parses for free now
    accesses qux; // also parses for free
@*/

The arguments could also be removed. I like concretely bringing them into CN scope in the same block.

The reason for this aspect of the spec syntax is that C does not require a function prototype to name all the arguments; and if it does, the Cerberus frontend (IIRC) forgets them. So spec has syntax to introduce names for the arguments. AIUI, originally the idea of spec was also to be able to give specifications to functions in a different source location from their declaration.

thatplguy commented 1 week ago

There are enough spec-related tickets (you can see a list of them here) that I've lost track of the broader language design issues we're considering. Here's a rough grouping of the current tickets.

Predicates

Taken together, these issues address two broad points:

  1. Conceptually, the term "resource predicate" is confusing, both to those familiar with program verification and newcomers. We should find a name for this concept that's more intuitive and use it in the docs/syntax/etc.
  2. The syntax around defining and using resource predicates currently imposes certain restrictions that we think can be alleviated. If we make all the changes above, we'll end up with less distinction between resource predicates and logical functions – which is good, because it means there are fewer special cases to learn and remember.

Syntactic sugar

Local vs global reasoning (and library models)

These issues are related to how specifications are associated with functions, and how/when specs are trusted vs. verified by CN. There are two use cases to keep in mind:

  1. Defining library models by creating trusted function specifications for functions without definitions.
  2. Defining specifications for exported functions (i.e. declared in a .h file) without leaking internal state into the external spec (e.g. about non-exported static variables).

Solver hints

Types

Datatypes