rust-lang / lang-team

Home of the Rust lang team
http://lang-team.rust-lang.org/
Apache License 2.0
200 stars 48 forks source link

Initiative: Ghost types and blocks #161

Closed utaal closed 1 year ago

utaal commented 2 years ago

Proposal: Ghost types and blocks

Summary and problem statement

A mechanism to have Rust code that is type-checked and (optionally) borrow-checked, retrievable via HIR/THIR/MIR, but "erased" from the final binary.

Many of the tools for verification of Rust software require the user to provide specifications, proofs, and other annotations in the Rust surface syntax. These are (an extension of) Rust syntax, and need to be type-checked and optionally borrow-checked. However, this code must not appear in the final binary: it is "ghost".

Motivation, use-cases, and solution sketches

Software verification tools allow for properties of code to be formally verified or exhaustively checked. These tools need a way to

A user of one of these tools may state a property they want to verify as an "annotation" to a function:

#[ensures(
    return == 
        if n == 0 { 0 } else if n == 1 { 1 }
        else { fibo(n - 2) + fibo(n - 1) }
)]
fn fibo(n: u64) -> u64 {
    if n == 0 { return 0; }
    let mut prev = 0;
    let mut cur = 1;
    for _ in 1..n {
        let new_cur = cur + prev;
        prev = cur;
        cur = new_cur;
    }
    cur
}

Tools need to resolve and type-check (and sometimes, borrow-check) such annotations; however, such code should not be emitted in compiled code.

These tools (often) run as compiler extensions. If it was possible to write such properties and annotations with an unmodified rust compiler, these tools could access them from HIR/THIR/MIR (after type- and, optionally, borrow-checking)[^1].

A possible solution discussed by

after the Rust Verification workshop^2 is the addition of "ghost" types (or a Ghost<T> type wrapper as a lang item) and "ghost" expressions which are type-checked and (optionally) borrow-checked: they appear in HIR/THIR/MIR, but are then erased before machine code is emitted.

We plan to add two features:

Ghost types, and the raise operator

Either a new ghost terminator as a type constructor or a lang-item Ghost<T> wrapper type.

struct BinaryTree<V> {
    left: Option<Box<BinaryTree<V>>>,
    right: Option<Box<BinaryTree<V>>>,
    value: V,
    contents: ghost &[V], // An abstract representation of the contents of this subtree
}

ghost V is a zero-sized type so that, in the example, the contents field for BinaryTree is erased (similarly to PhantomData<T>). The same applies to variables of type ghost T.

A ghost T can be raised to a T, but only within a ghost block (it's an error otherwise): if expr is of type ghost T, raise expr is of type T. We may want raise to behave, for type- and borrow-checking purposes, like a function raise<T>(g: ghost T) -> T (consuming its argument, or copying it if it's Copy). It could just be a lang-item function, too.

Ghost expressions

Similar to unsafe blocks,

ghost(tag, mode) { exprs.. }

The type of a ghost expression is always ghost T: if the inner block type is not already ghost T for some T, it is promoted to ghost T.

The ensures clause from above could be expanded to something like:

fn fibo(n: u64) -> u64 {
    ghost(ensures, no_init) {
        let result: u64; // it would be necessary to name the return value
        result == 
            if n == 0 { 0 } else if n == 1 { 1 }
            else { fibo(n - 2) + fibo(n - 1) } } 

    // body of fibo
}

Similar blocks would represent other verifier annotations, such as function preconditions, loop invariants, and proofs.

Ghost variables could be then defined with something like:

let contents = ghost(proof, default) { &[3u64, 5] };

contents here would have type ghost &[u64].

Adding a ghost expression to any compiling program should either (a) not modify the behavior of the resulting binary or (b) raise a compilation error.

[^1]: Tools currently need workarounds such as macros that have two or three definitions: one for the verifier pass, one for emitting compiled code, and (for Verus) one for an additional verifier pass to borrow-check some ghost variables. These are currently necessary because specifications and annotations should not appear in the compiled code (sometimes they are not even compilable), and (for Verus) some ghost variables should be borrow-checked. This also mean that tools need to run rustc two or (for Verus) three times, sometimes carrying state between the runs.

We may be able to lower a ghost expression expr as if false { expr } in default mode and to loop { break; expr } in no_init mode.

Links and related work

This is, in part, based on the needs of various verification tools for Rust:

Other tools (Aeneas, Gillian-Rust, ...) may also benefit from these features.

https://gist.github.com/utaal/aba64ad723c65068247af00c63756e10 contains multiple examples of use cases for ghost blocks; https://gist.github.com/utaal/aba64ad723c65068247af00c63756e10#file-b2-fibo-desugar-rs is the most up-to-date mock-up following this proposal.

Definitions for ghost types and expressions in F*, https://github.com/FStarLang/FStar/blob/master/ulib/FStar.Ghost.fsti#L50-L59

Initial people involved

What happens now?

This issue is part of the lang-team initiative process. Once this issue is filed, a Zulip topic will be opened for discussion, and the lang-team will review open proposals in its weekly triage meetings. You should receive feedback within a week or two.

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

rustbot commented 2 years ago

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

nikomatsakis commented 2 years ago

@rustbot second

Pzixel commented 2 years ago

I've tried to read this twice already but still don't get the idea. Could you please elaborate a bit? From your first example I deduce that you want some kind of attributes that compiler should check against the actual method body right? Because AFAIK you need funext to actually prove that these two implementations are correct and this is something that is axiom in most provers I've used so far. Thus I don't see how compiler in theory could prove this is correct...

RobJellinghaus commented 2 years ago

@Pzixel your question is more appropriate for the Zulip than for this github thread, per @rustbot's comment above. It looks like this is the current stream: https://rust-lang.zulipchat.com/#narrow/stream/324345-t-lang.2Fghost-code

nikomatsakis commented 1 year ago

I'm going to close this. As far as I know, we haven't done any active hacking or created a tracking issue, and I think the appropriate next step (per our new process) would be to start a lang-team experiment. I am no longer able to serve as champion, though, simply for reasons of time.