Closed nikomatsakis closed 1 year ago
cc @rod-chapman
cc @xldenis
Meeting tentatively scheduled for Dec 7
I'd love to participate, let me know if I can help with prep work at all.
@xldenis I'll send you a meeting invite. What is your availability?
In December I'm pretty available right now, except for the 14th, I'm in Paris (GMT+1) but I can make things work up to ~19h local time.
I'd like to participate as well. I'm in Paris as well (GMT+1): any time up to 7pm should be ok as well.
December 7th is fine for me. I should be available any working day between December 1st and 16th, except for the 2nd and the 14th.
I'd be very happy to participate; I'm sure others from the Prusti team would be very interested too!
Super - very pleased to have you join in.
We have a draft proposal on hackmd https://hackmd.io/w8AS2N09R76aXDFK9ogHBQ?view
Any comments/edits/feedback greatly appreciated.
We have a draft proposal on hackmd https://hackmd.io/w8AS2N09R76aXDFK9ogHBQ?view
Any comments/edits/feedback greatly appreciated.
Thanks for starting this! I've been thinking about this topic for a while and would love to be involved. Is it possible to attend the meeting?
Here are a few comments about the current proposal (HackMD's signup currently isn't working for me):
To improve usability for the average user, I believe the contract language should be a subset of Rust (and its scoping). Each verification tool can decide what subset of Rust it considers valid. The proposal could add operators like implication (==>
) to Rust to make writing contracts easier. You could also define contract functionality as normal Rust functions that are known to have special semantics. For example, you could define old
at the Rust level as:
fn old<T>(_:&mut T) -> &mut T {
panic!{"`old` may only be used in contracts."}
}
In Liquid Haskell, you can lift pure, terminating Haskell functions to the refinement/contract language using reflection. You could potentially indicate this using an annotation like #[reflect]
. Liquid Haskell also checks termination on all functions by default. The user can manually disable the termination check on a function with the lazy
annotation. It might make sense to include discussion about reflection and termination in the document.
To statically check "Pointer Validity Intrinsics", you could potentially use ghost variables.
Are you thinking that the contract language would be implemented as library/compiler plugin? Then each verification tool can use the library/compiler plugin to parse the contract language and receive the AST for the contract language and any additional metadata. I'd be happy to contribute to the implementation of this.
It'd be great if there is a standard mechanism to ship verification results with packages. This would probably include information like the contracts checked, the verification tool name, any tool specific proof certificates, and whether it was statically verified, fuzzed, dynamically checked, etc. For example, Flux might want to ship the SMT LFSC proof so that it can quickly recheck the validity of package functions.
Thanks to everyone that has jumped in with comments and ideas so far! Keep 'em coming!
I gave this comment to Rod directly, but he suggested I provide it here.
I see you are using the term "invariant" in the context of "predicated subtypes". I think that this might create some confusion, in that "invariants" are normally only guaranteed true on entry and exit to the operations that define the semantics of (all values of) an abstract data type, while "predicates" represent a subsetting condition that identify a subset of the values of a given type that a given object or parameter is expected to hold. Confusing these two notions can cause trouble later, based on my experience with Ada 2012.
For example, you can define notions like "range subtypes" (e.g. Positive) using predicates, whereas you would use an invariant with an abstract data type to define properties that, if assumed on entry to an operation that mutates a variable of a given ADT, can be proven true again upon exit. Quite frequently, invariants are used as part of proving that a data structure is implemented correctly, and need not even be visible to the clients.
For example, if an array is used to implement a stack, an invariant might be that, if the logical stack is currently N items deep, then the first N elements of the array have been initialized. Essentially, a "type invariant' is really just a shorthand for sprinkling a matching pre- and postcondition on every operation that creates or modifies values of the type, while a "predicate" is a way of subsetting a type by specifying a property that typically some, but not all, values of the type might have. And as mentioned, an invariant might often be internal to the implementation of the abstract data type, while a predicate is generally defined by some visible property that values of a given type might have.
What you describe as 'invariant' matches exactly the 'safety invariant' in Rust. (Though when you consider concurrency, as we have to in Rust, just requiring these invariants to hold at the entry and exit of methods is not always enough. But that is a separate discussion.)
We also have a 'validity invariant' though, reflecting what the compiler assumes to always be true about elements of a given type. 'invariant' typically means 'something that is always true' (in my area of PL, anyway), so I think this is an apt name. I don't think I have quite understood if that is the same as what you call 'predicated subtype' or not, but I don't think it is.
@sttaft, @RalfJung - thanks for those comments. Ada (and therefore SPARK) has both forms, but uses terminology that might indeed cause confusion for others. I'm sure Tucker will correct me if I get this wrong, but in short:
More generally, the term "invariant" in programming language contexts is frequently associated with boolean conditions that are expected to be true at certain points of execution, or across certain operations, but not necessarily continuously true. Loop invariants are an example of such a condition, in that they might be violated at various points within the body of the loop, but are expected to be true at a certain point within the loop, and generally on loop exit.
The other key point is that "predicates" in Ada apply to "subtypes" which represent a subset of all of the values of a given type. The term "subtype" doesn't have this same meaning in all languages, but in Ada it represents the subset of values of a type that satisfy some "subtype constraint" and/or some "subtype predicate." Predicates can be particularly useful when specified to be the subtype of some input or output from an operation, as they effectively "bundle in" an implicit precondition, for an input, or an implicit postcondition, for an output, such as being nonzero, or in some range.
I'm still going over the draft and formulating a response, but I think another key piece of prior art to consider is the ongoing GOSPEL project for OCaml, where they designed a specification language which is agnostic to the verification technique used. It's likely that we would need something like that for Rust, to both enable researchers to continue to explore the best verification techniques and for engineers to avoid 'proof-vendor' lock-in.
Of particular importance will be the choice of primitive logical operations, particularly those concerning mutable borrows, and their semantics. Additionally, thought will need to be given to unsafe code from the start, even if tools like Creusot don't support it as we need to explain how the interaction between safe and unsafe verification will work on a logical basis.
Hi, this issue has recently been brought to my attention. Great initiative! I would also love to attend the meeting if that's possible :)
In the meantime, I want to jump into the discussion of predicated subtypes vs invariants as it's directly crossing into my turf. My hot take is that they are more related than you might think, as predicated subtypes let you encode some kinds of invariants (understood as predicates that must hold at certain points but can be broken in between)
In Flux, a refinement type can be used to specify a subset of the values denoted by a (rust) type. As such, it specifies something that is always true about the values of that refinement type. I think this matches the definition you've been giving for predicated subtypes. However, in a language with a substructural type system like Rust, the type of a value can change over time allowing you to break the invariant encoded in a refinement type. The points where you reestablish the invariant are exactly the points where you (implicitly) require the value to have the original refinement type. As an example consider a definition of a pair
struct Pair {
x: i32,
y: i32,
}
Then imagine some syntax that allows you to define ranges as pairs where y >= x
type Range = Pair with y >= x;
If you have a value v
of type Range
you can assume v.y >= v.x
, but if you ever mutate it, then v
may not have type Range
anymore. Then, whenever you use v
as a Range
(e.g., as an argument to a function taking a Range
), you have to prove it has type Range
.
fn ipa(mut p: Range) {
assert!(p.y - p.x >= 0); // safe because p has type Range
p.y -= 1; // p has no longer type Range (or rather we cannot prove it has type Range for all inputs)
p.y += 1;
ris(p); // we need to prove that p has type Range again
}
fn ris(p: Range) {}
There are several places where you implicitly assume that values have some type, e.g., if you are mutating a range inside a loop you may want to assume that the value has type Range
at the entry of the loop. And of course, things get more interesting when you have mutable references as you also assume types at the point a lifetime ends.
I don't think Rust's "validity invariants" are the same thing as predicate subtypes or refinement types. But it does seem like Ada's predicate subtypes are indeed very close to refinement types.
Validity invariants don't have to do with subtyping or refining an existing type. Even the basic type i32
comes with a validity invariant: it must be initialized.
And conversely, refinement types can refine just the safety invariant, or they can refine the safety invariant and the validity invariant. E.g. the refinement types from Flux only refine the safety invariant (I think) because the compiler will not itself perform optimizations based on "this value must always satisfy that predicate/refinement". However types like NonZero
from the standard library, or more general versions of that, also affect the validity invariant.
These two kinds of invariants are both relevant for verification, but in different ways:
Vec
, where field privacy and a module boundary is used to ensure that some properties hold true for all Vec
instances out there -- but these invariants only help verification, they are not compiler-understood, so it is okay for them to be violated internally within a functionIt seems to me like one could express safety invariants via some sort of refinement, like Flux does. This seems to be what @sttaft advised against, but I don't quite understand what the issue is with that approach.
There is an interesting kind of invariant that has not been mentioned yet in this discussion: history invariants. History invariants are generalizations of what @RalfJung calls safety invariants: instead of mentioning one value of a type, an history invariant relates two states of the same value.
History invariants are particularly useful for expressing monotonicity properties (i.e., a counter value only increases) or telling that such and such field of a data structure never changes (i.e., if a data structure contains a shared borrow to some kind of shared context, then we may want to tell that this shared borrow never changes and is equal to its original value).
I know that history invariants are already used in some ways in Prusty, and I have the feeling that they are natural for some primitive Rust types: for example, the length of a slice never changes, the fields of a closure captured by borrow always point to the same address, or a Ref/RefMut always points to the same address.
They also have a natural use in the context of mutable borrows: the value a mutable borrows points to now should always be related to the value it will point to in the future.
Do you think that would be a good notion to have in our specification language?
There is an interesting kind of invariant that has not been mentioned yet in this discussion: history invariants. History invariants are generalizations of what @RalfJung calls validity invariants:
No, if anything they are a generalization of safety invariants. Validity invariants are part of the operational semantics and don't track history.
Also of course if safety invariants are written in a sufficiently powerful separation logic, they they subsume history invariants.
Sure, my mistake... I confused the two words.
Also of course if safety invariants are written in a sufficiently powerful separation logic, they they subsume history invariants.
This is one of the key tensions I hope we can start addressing during the meeting, because while I think a separation logic is the right framework for unsafe code, Prusti and Creusot demonstrate we do not need it for (a large fragment of) safe Rust code, and instead gain strong automation. Enabling a 'smooth glide path' between complex unsafe verification in seperation logic and high-level functional verification in an automated tool like Creusot or Prusti will I believe, be key for success.
I would agree that refinement types are essentially the same as Ada's predicated subtypes. Thanks for making that connection. And it is true that if you enforce the predicates at specific points they can be similar to invariants.
In Ada we at one point tried to explain the semantics of invariants by using predicates, but it turned out only to muddy the waters. The general model of predicates is that they identify a subset of values by a condition, whereas an invariant is about safety and semantics. It is true they both use conditions applied to values of a given type, but generally to serve a rather different purpose, which means examples of using predicates and invariants will typically be rather different, and come up in rather different contexts. It is true that if you squint, they look quite similar, but the way they are intended to be used as part of proofs are almost unrelated.
Take care, -Tuck
On Tue, Nov 29, 2022 at 12:19 AM Nico Lehmann @.***> wrote:
Hi, this issue has recently been brought to my attention. Great initiative! I would also love to attend the meeting if that's possible :)
In the meantime, I want to jump into the discussion of predicated subtypes vs invariants as it's directly crossing into my turf. My hot take is that they are more related than what you might think, as predicated subtypes let you encode some kinds of invariants (understood as predicates that must hold at certain points but can be broken in between)
In Flux, a refinement type can be used to specify a subset of the values denoted by a (rust) type. As such, it specifies something that is always true about the values of that refinement type. I think this matches the definition you've been giving for predicated subtypes....Message ID: @.***>
Hi everyone, thank you so much for starting this discussion.
I'm a bit worried about the limitations that come with a "logic" that is too simple, in particular for unsafe code. I don't think this is adding any new information, I'm just joining @xldenis's opinion (and this is something we've discussed multiple times): there is a very key tension there. I'm just adding emphasis on the requirement that the contract language be expressive enough (i.e. allows me to write separation logic) "with a glide path" (i.e. being able to stay with FOL when possible, e.g. for Creusot/Prusti).
While I do very much like what most of the ideas from the draft, I find that the section about "Pointer Validity Intrinsics" is for example too limited to address a lot of unsafe unbounded properties about unsafe code, for example their safety invariants
as explained by @RalfJung .
I have filled out (somewhat briefly) the "Read/Write Sets" section in the HackMD. I could supply an example of how getting it wrong breaks soundness in SPARK, but wasn't sure if that level of detail is appropriate right now.
I finished a version of the pledges description - sorry it took a while to push the last part. Since the original text tried to presnt the problems with contracts for reborrowing in an introductory way (not Prusti-specific) and there's potentially some common material there I split the text into a part which describes (I think) aspects of the problem independently of the solution and a second part which is about Prusti. @xldenis could you check that you agree that the first part makes sense to you as well (it's not meant to be tied to the particular mechanism), please? It still uses the same example based on the earlier part of the document, but we could also replace it with get_mut
on Vec
there if you prefer (it's a better-motivated example for reborrowing specs).
Of course, any other feedback/comments are also welcome!
Oh, and I changed the TODO note to mention only Kani now :) Has anyone contacted anyone from their team?
Thanks for the very intriguing discussion so far! I agree @jhjourdan that (history) invariants are an important and interesting topic. I think there are other technical questions there as to what notion of identity to use for instances of a type (which is a place where many specification languages disagree in other settings: basically what equality/identity to build-in to contracts, if any). For history invariants especially it is important to know when they can be assumed between two program points. When one is talking about the same location in memory? That seems a bit low-level for Rust and doesn't obviously allow me to move the value and keep its invariant. On the other hand, using the deep value in memory as an identity isn't useful - the point of history invariants is usually that they govern changes to some state.
I also agree with @xldenis that there is a tension between the model one needs to prove soundness of a particular specification methodology and the surface level specifications one wants to use at the code level. These need not be the same, of course, and probably can't be if one ultimately wants a contracts proposal that doesn't fix a single verification methodology. I'm not actually sure that separation logic reasoning of some kind is not useful and suitable at a surface level for some safe Rust specifications (and arguably Rust is a good setting for programmers to think in terms of resources and controlled side-effects when specifying their code), but I'm sure that one does not want specifications as rich as one needs to prove soundness, and indeed in both Prusti and Creusot it's clear that for a lot of code the ownership aspects can be simply lifted from the type information in the language.
It will be a very interesting aspect of the discussion to try to figure out how much "non-Rust" one can add to a contract language while supporting what various tools and methodologies would like to do individually.
@xldenis could you check that you agree that the first part makes sense to you as well (it's not meant to be tied to the particular mechanism), please?
It looks good to me and I think it makes clear the idea that some form of future operator seems essential to Rust specifications. I like the index_mut
example since it's more concrete but we don't have to change it.
I think there are other technical questions there as to what notion of identity to use for instances of a type (which is a place where many specification languages disagree in other settings: basically what equality/identity to build-in to contracts, if any).
Fully agree here, this is a point @jhjourdan and I have had many discussions about just in the context of Creusot.
On the other hand, using the deep value in memory as an identity isn't useful - the point of history invariants is usually that they govern changes to some state.
I don't understand the contradiction here? How would using a 'deep value identity' weaken history invariants? The point is we would only be allowed to make changes which respect the invariant, no?
I also agree with @xldenis that there is a tension between the model one needs to prove soundness of a particular specification methodology and the surface level specifications one wants to use at the code level.
I agree here, but that wasn't the point I meant to make. My original point was the surface level specifications needed for unsafe and safe code have different requirements. To specify unsafe operations on raw pointers, it seems natural that we would want some form of separation logic. Simultaneously, when working with safe code (on vectors for example), both Creusot and Prusti have shown that we don't need or want that as surface syntax. As you noted, the ownership typing of Rust simplifies the task dramatically for us.
That said, I also agree with you that the potential of ownership typing for proof and specification is not exhausted: consider the work done in Verus with their linear 'proof-mode' and ghost token system.
To summarize, I believe that just how unsafe
Rust is a language which contains safe Rust, a Rust specification language will probably be structured the same way: a more expressive 'unsafe' (sep. logic) spec language which contains the 'safe' (fol) sub-language.
I have an additional concern about this meeting, what is the actual agenda?
I am happy to talk about Rust specifications for as long as you would like, but if we don't have a clear and firm goal I worry we will just have a great discussion and nothing will move forward.
To my mind, the agenda is to get feedback from the rust language team to understand:
If adding specifications to rust is something the community and the language team wins interested in and will support.
If this is something the community supports, what's the best way to drive this effort?
Get a general sense of what the desired goals for specifications are, and their relative importance to the rust community.
If time allows, discuss some of the key questions about what the spec language should look like, and how we might start implementing it.
Those seem like reasonable objectives and a reasonable order to address them in. I would perhaps add a point 5.
or 4.1
:
What changes to the compiler / language will be needed to integrate a specification language?
I'd be very interested in joining this meeting; is there already a precise time and location?
Given the volume of debate above relating to things like "type invariant", "subtype", "subset type", and so on, I added "0. Terminology" to the draft roadmap.
I'd be very interested in joining this meeting; is there already a precise time and location?
Normally December 7th (tomorrow) at 19:00 Paris time. Confirmations would be welcome but that's what I originally noted and booked. @rod-chapman @danielsn.
Is there an invite or link for the meeting? I believe @nilehmann wanted to join as well. @rod-chapman should have our emails.
Here's the lang-team meeting calendar - see the entry for tomorrow: https://calendar.google.com/calendar/u/0/embed?src=recud4b9o8cmc0m5rmr033p5nk@group.calendar.google.com&pli=1
Great, thank you!
Zulip thread for discussing follow-up meetings:
Hi all, I'm very sorry that I couldn't join this meeting (unfortunately for family reasons), but I'd be very keen to keep up with the discussion and join a future meeting. Thanks for all the fascinating discussion online so far!
Summary
A key step in the process of allowing rust users to verify their code (whether through testing, fuzzing, model-checking, or deductive verification), is to provide them with an ergonomic mechanism to specify the expected behaviour of their code. The purpose of this meeting is present and get feedback on the issues surrounding the design of a contract language for Rust.
Several groups are already working on this topic - notably the developers of tools like Kani, Prusti and Creusot - and have made significant progress. Currently, each of these tools defines its own contract language, so there is a risk of divergence and incompatibility both now and in the future. In addition, the full utility of a contract language requires language-level support. For example, the safety of a contract would benefit from language-level support for pure functions. Furthermore, the value of object invariants and trait laws increases the more tightly they are integrated into the Rust type-system and compiler.
This is an initial meeting to open conversation around the issues. We do not expect to have a fully-fledged and implemented proposal.
Document link: https://hackmd.io/w8AS2N09R76aXDFK9ogHBQ?view
Background reading
Prior art in the Rust community:
Prior art in other languages: