Open W95Psp opened 10 months ago
We want to interpret Rust iterators as functional maps. Rust iterators are not easy to translate to our backends: e.g. in F*, we cannot represent iterators in their full generality (cough coinduction cough), lazy iterators are not productive in proof settings.
Currently, our main target is for
loops. For instance:
for item in source {..}
, where source
is already concrete (e.g. a vector, a slice, an array, a slice.chunks(N)
);for index in range {..}
, where range
is a range (e.g. 0..10
);for (i, item) in it.enumerate() {..}
, where it
is an iterator;for item in it.step_by(n) {..}
, where it
is an iterator.Behind the scenes, Rust desugars that to a regular loop
and wraps
the collection we're iterating over in a into_iter
call. for
loops
have thus nothing special.
Between a "source" (something which is not an iterator but has an
IntoIter
instance or methods to produce iterators) and a "sink"
(e.g. into_iter
, collect
or basically any function consuming an
iterator without producing a new iterator), there can be various
combinators (e.g. enumerate
, step_by
, zip
...).
We need to recognize those sources, combinators and sinks.
The idea would be to translate iterators to functional maps: functions
from indexes (e.g. nat
) to values.
Functions are a very thin layer to whatever the source is. For
instance, if the source is an F* list, creating a functional map is
trivial, with FStar.List.Tot.get
. Similarly, a step_by
combinator
or a skip
combinator is trivial to implement, and very nice to work
with.
Loops can be translated to a fold
on maps.
Example of what those map could look like in F*:
module ItTest
open FStar.Mul
let monotonic #t (f: nat -> option t) =
forall i j. (j > i /\ None? (f i)) ==> None? (f j)
type map (t: Type) = f: (nat -> option t) {monotonic f}
let terminates (m: map 't) = exists n. None? (m n)
let termination_index (m: map 't {terminates m}): GTot (i:nat {
None? (m i)
/\ (forall j. j < i ==> Some? (m i))
}) = admit ()
let enumerate (f: map 'a)
: map (tuple: (nat & 'a) {Some tuple._2 == f tuple._1})
= fun i -> match f i with
| Some x -> Some (i, x)
| None -> None
let step_by (f: map 'a) (step: nat): map 'a
= fun i -> f (step * i)
let take (f: map 'a) (n: nat): m: map 'a {terminates m}
= fun i -> if i < n then f i else None
let skip (f: map 'a) (n: nat): map 'a
= fun i -> f (i + n)
let rec find_index'
(m: map 'a {terminates m})
(f: 'a -> bool)
(i: nat {i <= termination_index m})
: Tot (option nat)
= match m i with
| Some v -> if f v
then Some i
else find_index' m f (i + 1)
| None -> None
let find_index (m: map 'a {terminates m}) (f: 'a -> bool) =
find_index' m f 0
let rec find_nth (m: map 'a {terminates m}) (f: 'a -> bool) (n: nat): option nat =
if n = 0
then find_index m f
else
match find_nth m f (n - 1) with
| Some i -> find_index (skip m (i + 1)) f
| None -> None
let filter (m: map 'a {terminates m}) (f: 'a -> bool): map 'a =
fun i -> match find_nth m f i with
| Some i -> m i
| None -> None
let to_list (m: map 'a {terminates m}) =
let rec aux (i: nat {i <= termination_index m}) =
match m i with
| Some x -> x::aux (i+1)
| None -> []
in aux 0
let of_list (l: list 'a): map 'a =
fun i ->
if i < FStar.List.Tot.length l
then Some (FStar.List.Tot.index l i)
else None
let rec list_slice
(l: list 'a)
(start: nat)
(len: nat {len + start <= List.Tot.length l})
: Tot (l': list 'a {List.Tot.length l' == len}) (decreases len)
= if len = 0
then []
else List.Tot.index l start::list_slice l (start + 1) (len - 1)
let chunks_exact (l: list 'a) (n: pos)
: map (l: list 'a {List.Tot.length l == n})
= fun (i: nat) -> if i * n + n <= List.Tot.length l
then Some (list_slice l (i * n) n)
else None
let rec fold' #acc (m: map 'a {terminates m}) (f: acc -> 'a -> acc) (init: acc) (i: nat {i <= termination_index m})
: acc
= match m i with
| Some v -> f (fold' m f init (i + 1)) v
| None -> init
let fold #acc (m: map 'a {terminates m}) (f: acc -> 'a -> acc) (init: acc): acc
= fold' m f init 0
let it1: map int = take (fun i -> Some (i + 100)) 100
let it2: map int = filter it1 (fun n -> n % 3 = 0)
let property = assert_norm (forall i. Some? (it2 i) ==> Some?.v (it2 i) % 3 == 0)
let folded = fold (take (enumerate it2) 20)
(fun acc ((i, item): (nat & int)) ->
assert (Some?.v (it2 i) == item);
acc + 1
) 0
At first, I wanted to eliminate iterators altogether: never extract
any Iterator
instances, kill all impl Iterator
argument to any
function, remove iterator-related generics and bounds whenver
possible.
The idea was to replace any type that is iterable by one unique concerete type: functional maps.
This is not possible because sometimes, we work with a type T
that
is not an opaque iterator type, but rather either a concrete type or
a opaque type with more constraint on it. For instance, in fn f<T: Iterator + Foo>(x: T)
, on can apply any method of Foo
on x
.
Then, I started desining a system in which every function that accepts a completly opaque iterator type should be rewritted as a function that accepts a functional map instead.
Then, when a function f
with an opaque implicit argument is called,
i.e. f(..., it, ...)
where f
expect it
to be of type impl Iterator
,
we rewrite it into f(..., it.to_seq(), ...)
, and continue this
transformation on the expression it
.
But then, complications arise:
Iterator
trait itself? Iterator
super trait bound?The current design is a restricted approach to design B.
The filter isn't about opaque iterators any longer: it is just an allowlist of functions (or method) and which arguments should be mapped to sequences.
My hope is to implement design C and at some point swap the "filter" from this allowlist to opaque iterators.
One important point is that for loops implicitely add into_iter
calls, which is a "opaque iterator", which would trigger the
transformation into sequences.
This issue has been marked as stale due to a lack of activity for 60 days. If you believe this issue is still relevant, please provide an update or comment to keep it open. Otherwise, it will be closed in 7 days.
Still somehow relevant, I made some basic support in:
Currently, we translate Rust iterators as such in all backends.
Rust iterators are lazy and represented by concrete data types: each combinator (e.g.
map
,zip
,enumerate
...) builds a concrete structure describing the semantics of the iterator. Giveni
an iterator,i.enumerate()
is basically a tuple containingi
and a counter.While this is great for expressivity and performance, that's not often ideal in proof settings.
Characterizing a (non-lazy) iterator DSL for which we offer specific support in the backend would be great. This DSL would help us getting rid of the concrete data type Rust approach, in favor of overloaded combinators (eg
enumerate
,map
) that work directly on collections, without indirections.