c-cube / iter

Simple iterator abstract datatype, intended to iterate efficiently on collections while performing some transformations.
https://c-cube.github.io/iter
BSD 2-Clause "Simplified" License
119 stars 11 forks source link
higher-order-functions iter iterators lazy monad ocaml sequence stream

Iter build

Clean and efficient loop fusion for all your iterating needs!

# #require "iter";;
# let p x = x mod 5 = 0 in
  Iter.(1 -- 5_000 |> filter p |> map (fun x -> x * x) |> fold (+) 0);;
- : int = 8345837500

Iter is a simple abstraction over iter functions intended to iterate efficiently on collections while performing some transformations. Common operations supported by Iter include filter, map, take, drop, append, flat_map, etc. Iter is not designed to be as general-purpose or flexible as Seq. Rather, it aims at providing a very simple and efficient way of iterating on a finite number of values, only allocating (most of the time) one intermediate closure to do so. For instance, iterating on keys, or values, of a Hashtbl.t, without creating a list. Similarly, the code above is turned into a single optimized for loop with flambda.

Documentation

There is only one important type, 'a Iter.t, and lots of functions built around this type. See the online API for more details on the set of available functions. Some examples can be found below.

The library used to be called Sequence. Some historical perspective is provided in this talk given by @c-cube at some OCaml meeting.

Short Tutorial

Transferring Data

Conversion between n container types would take n² functions. In practice, for a given collection we can at best hope for to_list and of_list. With iter, if the source structure provides a iter function (or a to_iter wrapper), it becomes:

# let q : int Queue.t = Queue.create();;
val q : int Queue.t = <abstr>
# Iter.( 1 -- 10 |> to_queue q);;
- : unit = ()
# Iter.of_queue q |> Iter.to_list ;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

# let s : int Stack.t = Stack.create();;
val s : int Stack.t = <abstr>
# Iter.(of_queue q |> to_stack s);;
- : unit = ()
# Iter.of_stack s |> Iter.to_list ;;
- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]

Note how the list of elements is reversed when we transfer them from the queue to the stack.

Another example is extracting the list of values of a hashtable (in an undefined order that depends on the underlying hash function):

# let h: (int, string) Hashtbl.t = Hashtbl.create 16;;
val h : (int, string) Hashtbl.t = <abstr>
# for i = 0 to 10 do
     Hashtbl.add h i (string_of_int i)
  done;;
- : unit = ()

# Hashtbl.length h;;
- : int = 11

# (* now to get the values *)
  Iter.of_hashtbl h |> Iter.map snd |> Iter.to_list;;
- : string list = ["6"; "2"; "8"; "7"; "3"; "5"; "4"; "9"; "0"; "10"; "1"]

Replacing for loops

The for loop is a bit limited, and lacks compositionality. Instead, it can be more convenient and readable to use Iter.(--) : int -> int -> int Iter.t.

# Iter.(1 -- 10_000_000 |> fold (+) 0);;
- : int = 50000005000000

# let p x = x mod 5 = 0 in
  Iter.(1 -- 5_000
    |> filter p
    |> map (fun x -> x * x)
    |> fold (+) 0
  );;
- : int = 8345837500

NOTE: with flambda under sufficiently strong optimization flags, such compositions of operators should be compiled to an actual loop with no overhead!

Iterating on sub-trees

A small λ-calculus AST, and some operations on it.

# type term =
  | Var of string
  | App of term * term
  | Lambda of term ;;
type term = Var of string | App of term * term | Lambda of term

# let rec subterms : term -> term Iter.t =
  fun t ->
    let open Iter.Infix in
    Iter.cons t
      (match t with
      | Var _ -> Iter.empty
      | Lambda u -> subterms u
      | App (a,b) ->
        Iter.append (subterms a) (subterms b))
  ;;
val subterms : term -> term Iter.t = <fun>

# (* Now we can define many other functions easily! *)
  let vars t =
    Iter.filter_map
      (function Var s -> Some s | _ -> None)
      (subterms t) ;;
val vars : term -> string Iter.t = <fun>

# let size t = Iter.length (subterms t) ;;
val size : term -> int = <fun>

# let vars_list l = Iter.(of_list l |> flat_map vars);;
val vars_list : term list -> string Iter.t = <fun>

Permutations

Makes it easy to write backtracking code (a non-deterministic function returning several 'a will just return a 'a Iter.t). Here, we generate all permutations of a list by enumerating the ways we can insert an element in a list.

# open Iter.Infix;;
# let rec insert x l = match l with
  | [] -> Iter.return [x]
  | y :: tl ->
    Iter.append
      (insert x tl >|= fun tl' -> y :: tl')
      (Iter.return (x :: l)) ;;
val insert : 'a -> 'a list -> 'a list Iter.t = <fun>

# let rec permute l = match l with
  | [] -> Iter.return []
  | x :: tl -> permute tl >>= insert x ;;
val permute : 'a list -> 'a list Iter.t = <fun>

# permute [1;2;3;4] |> Iter.take 2 |> Iter.to_list ;;
- : int list list = [[4; 3; 2; 1]; [4; 3; 1; 2]]

Advanced example

The module examples/sexpr.mli exposes the interface of the S-expression example library. It requires OCaml>=4.0 to compile, because of the GADT structure used in the monadic parser combinators part of examples/sexpr.ml. Be careful that this is quite obscure.

Comparison with Seq from the standard library, and with Gen

In short, 'a Seq.t is more expressive than 'a Iter.t, but it also requires more knowledge of the underlying source of items. For some operations such as map or flat_map, Iter is also extremely efficient and will, if flambda permits, be totally removed at compile time (e.g. Iter.(--) becomes a for loop, and Iter.filter becomes a if test).

For more details, you can read http://gallium.inria.fr/blog/generators-iterators-control-and-continuations/ or see the slides about Iter by me (c-cube) when Iter was still called Sequence.

Build

  1. via opam opam install iter
  2. manually (need OCaml >= 4.02.0): make all install

If you have qtest installed, you can build and run tests with

$ make test

If you have benchmarks installed, you can build and run benchmarks with

$ make benchs

To see how to use the library, check the following tutorial. The tests and examples directories also have some examples, but they're a bit arcane.

License

Iter is available under the BSD license.