rust-lang / polonius

Defines the Rust borrow checker.
Apache License 2.0
1.33k stars 74 forks source link

friendlier front end #26

Open nikomatsakis opened 6 years ago

nikomatsakis commented 6 years ago

Right now, the "input" to this program is a bunch of facts you could never generate by hand. We should add some sort of front-end more like chalk, that takes a human-writable format -- perhaps similar to MIR? And generates facts from it. This would allow us to do more unit testing.

It'd probably be good then to factor out the "core algorithm" into a separate crate, so that it can eventually be shared by rustc and this crate.

I'd be open to suggestions for what the input format should look like =)

nikomatsakis commented 6 years ago

I've been debating about this. I think my ideal would be that we take in MIR and generate facts, but that would require reproducing a lot of the compiler. Obviously the simplest format is the one we already have: just give me the raw facts. So the question is, which facts should we "derive" from a nicer input and which can the user manually specify?

It seems to me that by far the most tedious to derive are cfg_edge and region_live_at. So pehaps we should focus on those, and have the user manually specify the borrow_region, outlives, and invalidates themselves. We can then (over time) transition to the rest.

I envision something like this. First, a program consists of a universal_regions declaration and a set of basic blocks:

universal_regions { 'a, 'b, 'c }
block B0 {
    statement0; // series of `;`-separated statements, defined below
    statement1;
    goto B1, B2, B3; // create edges between basic blocks
}

This is kind of a grammar like this (using a LALRPOP-like format):

Input = UniversalRegions BlockDefn*;
UniversalRegions = "universal_regions" "{" Comma<Region> "}";
BlockDefn = "block" Block "{" Statement* Goto? "}"
Goto = "goto" Comma<Block> ";"

Each statement would be a comma-separate list of facts that occur on that line, or possibly a use. So something like:

use('a), outlives('a: 'b), borrow_region_at('b, 'c)

which might fit this grammar:

Statement = { Fact, Use };
Fact = {
  "outlives" "(" Region ":" Region ")",
  "borrow_region_at" "(" Region "," Loan ")",
  "invalidates" "(" Loan ")",
};
Use = "use" "(" Region ")" 

And of course we need the base strings:

Block = r"B\w+"; // or whatever
Region = r"'\w+"; // ...
Loan = r"L\w+"; // ...

We would then convert these into tuples of AllFacts and run the computation.