Proof of Insertion Sort implementation times out

FStarLang / FStar

A Proof-oriented Programming Language

https://fstar-lang.org

Apache License 2.0

2.7k stars 233 forks source link

Proof of Insertion Sort implementation times out #1436

Closed Caroline-xinyue closed 6 years ago

Caroline-xinyue commented 6 years ago

Hi F* developers (@nikswamy , @tahina-pro ),

I am working on my undergraduate thesis on a comparison between Dependent Haskell and F*. During the comparison of proof complexity, I implemented the InsertionSort as follow: https://gist.github.com/Caroline-xinyue/51eb00edca0427199cd7d1d2642205f1. However compiling the code gives me the following error.

./InsertionSort.fst(25,0-27,64): (Error) Unknown assertion failed (SMT solver says: unknown because canceled (fuel=8; ifuel=2; ); SMT solver says: unknown because max. resource limit exceeded (fuel=4; ifuel=2; ); SMT solver says: unknown because canceled (fuel=2; ifuel=2; ); SMT solver says: unknown because canceled (fuel=2; ifuel=1; )) Verified module: InsertionSort (29441 milliseconds) 1 error was reported (see above)

The original InsertionSort implementation provided only ensures permutation without duplication, i.e. {2,2,3} and {2,3,3} will pass the original insertionSort verification (https://github.com/FStarLang/FStar/blob/fe09a9dbe2dce098c35be650287b61298376ecb2/examples/algorithms/InsertionSort.fst).

I've also tried explicitly calling the SMT lemma smaller_hd_lemma as in https://gist.github.com/Caroline-xinyue/80e271772b13b2bcf5e3e311c04fb3dc but SMT still runs out of time, as follows:

./InsertionSort.fst(25,0-33,38): (Error) Unknown assertion failed (SMT solver says: unknown because canceled (fuel=8; ifuel=2; ); SMT solver says: unknown because canceled (fuel=4; ifuel=2; ); SMT solver says: unknown because canceled (fuel=2; ifuel=2; ); SMT solver says: unknown because canceled (fuel=2; ifuel=1; )) Verified module: InsertionSort (22286 milliseconds) 1 error was reported (see above)

My thesis is due next Wednesday. I want to conclude that F has an intuitive user interface with elegant syntax and semi-automation and an engaging proof experience, but the comparison wouldn't hold if my F version doesn't verify the complete permutation of insertionSort. I will be deeply and sincerely appreciated for any help!

Thanks!!! Xinyue(Elise) -- An F* programmer

Caroline-xinyue commented 6 years ago

I continue to try playing with the fuel, but I don't know enough about how fuel works. It seems like SMT solver failed every time because verification is cancelled no matter how large I increase the fuel to.

Here is what I've tried with fuel: #set-options "--initial_fuel 10 --max_fuel 10 --initial_ifuel 10 --max_ifuel 10"

A-Manning commented 6 years ago

You can try increasing your rlimit. #set-options "--z3rlimit 10"

A good pattern when working with several propositions is to write your function with minimal propositions, prove each property individually, and then construct the function with all the propositions that you need.

For your example, that would look like

val insert' : nat -> l:list nat{sorted l} -> list nat

val insert'_sorted: x:nat -> l:list nat{sorted l} -> Lemma (sorted (insert' n l))

val insert'_count_elem: x:nat -> l:list nat{sorted l} 
  -> Lemma (forall k. count_elem k (insert' x l) = eq k x + count_elem k l)

val insert: x:nat -> l:list nat{sorted l} -> Tot (result:list nat{sorted result /\ (forall k. count_elem k result = eq k x + count_elem k l)})
let insert x l = insert'_sorted x l;
                 insert'_count_elem x l;
                 insert' x l

Caroline-xinyue commented 6 years ago

Hi Manning @A-Manning ,

Thank you so much for the detailed response! I tried your approach and it seems to work once I improve the rlimit to 25: https://gist.github.com/Caroline-xinyue/8b35eae6087779d07eabaacd39cd9dc9. This proof, though in the extrinsic style using additional lemmas, is very elegant and intuitive!

I am considering rewriting my Results section to discuss this new implementation, but I am not very familiar with how fuel or SMT rlimits works? In my understanding, SMT seems to have a limit (maybe time in sec) that controls how long it will try discharging the proof.

Would you mind explaining in a little detail what this parameter 10 or 25 means in #set-options "--z3rlimit 25"? Also, what is F*'s design decision on this rlimit when integrating with the Z3 SMT solver to achieve semi-automation.

Thanks again for your kind help!

Sincerely, Xinyue(Elise)

nikswamy commented 6 years ago

Hi there.

Increasing the rlimit is one way to go about it and suffices for small examples. But, it's a slippery slope in larger contexts. This page provides some information on rlimits: https://github.com/FStarLang/FStar/wiki/rlimits:-Machine-Independent-Resource-Limits-for-Deterministic-Execution

Another way to go about it is below. I've decorated the quantified formula in insert_permutation and insert_pf with "pattern" (aka e-matching triggers) to guide the SMT solver towards only instantiating the quantified formula as I wish. Quantifiers and patterns are an advanced topic and require some experience to use well, but see here for one place to start: https://github.com/FStarLang/FStar/wiki/Quantifiers-and-patterns See also section 6.1.3 of the tutorial.

module InsertionSortFull
val sorted: list nat -> Tot bool
let rec sorted l = match l with
    | [] -> true
    | [x] -> true
    | x :: y :: xs -> (x <= y) && (sorted (y :: xs))

val eq: nat -> nat -> Tot nat
let eq x y = if x = y then 1 else 0

val count_elem: nat -> list nat -> Tot nat
let rec count_elem x l = match l with
  | [] -> 0
  | hd :: tl -> eq hd x + count_elem x tl

val insert : x:nat -> l1:list nat{sorted l1} -> list nat
let rec insert x l1 = match l1 with
  | [] -> [x]
  | hd :: tl -> if x <= hd then x :: l1 else hd :: (insert x tl)

val insert_sorted: x:nat -> l1:list nat{sorted l1} -> Lemma (sorted (insert x l1))
let rec insert_sorted x l1 = match l1 with
  | [] -> ()
  | hd :: tl -> insert_sorted x tl

val insert_permutation: x:nat -> l1:list nat{sorted l1}
  -> Lemma (forall k.{:pattern (count_elem k (insert x l1))} count_elem k (insert x l1) = eq k x + count_elem k l1)
let rec insert_permutation x l1 = match l1 with
  | [] -> ()
  | hd :: tl -> insert_permutation x tl

val insert_pf : x:nat -> l1:list nat{sorted l1} -> Tot (l2:list nat{sorted l2 /\ (forall k.{:pattern (count_elem k l2)} count_elem k l2 = eq k x + count_elem k l1)})
let rec insert_pf x l1 = insert_sorted x l1;
                         insert_permutation x l1;
                         insert x l1

val insertion_sort : l1:list nat -> Tot (l2:list nat{sorted l2 /\ (forall k. count_elem k l2 == count_elem k l1)})
let rec insertion_sort l1 = match l1 with
    | [] -> []
    | hd :: tl -> insert_pf hd (insertion_sort tl)

Caroline-xinyue commented 6 years ago

Thank you all for the timely advice! @A-Manning @nikswamy

Caroline-xinyue commented 6 years ago

@nikswamy Sorry to bother again!

As @A-Manning mentioned, "A good pattern when working with several propositions is to write your function with minimal propositions, prove each property individually, and then construct the function with all the propositions that you need." However, if we implement our function by calling all these individual propositions, as in the insert_pf below, are we slowing down the run time performance of our verified insertion sort in F*?

val insert_pf : x:nat -> l1:list nat{sorted l1} -> Tot (l2:list nat{sorted l2 /\ (forall k.{:pattern (count_elem k l2)} count_elem k l2 = eq k x + count_elem k l1)})
let rec insert_pf x l1 = insert_sorted x l1;
                         insert_permutation x l1;
                         insert x l1

Since F*, by design, ensures that every proposition is correct as long as it passes type check, there's no need to run neither insert_sorted nor insert_permutation at run time (when transforming to run in F#, OCaml or C). What we really want is only the insert function as defined.

I am thus wondering what F* do with these additional proposition lemmas after compilation/type check. Do we perform any sort of erasure(like modifying insert_pf to remove calling those lemmas) before extracting to run in F# etc?

Otherwise, is there a way to provide patterns to Z3 without explicitly calling the additional lemmas. In other words, as all we need are the refinement types from insert_sorted and insert_permutation, is there a way to construct our insert_pf function with type-level propositions (sorted (insert x l1) and forall k. count_elem k (insert x l1) = eq k x + count_elem k l1) only, instead of calling the actual term-level recursive implementations(insert_sorted and insert_permutation) as right now.

Thanks! Xinyue

nikswamy commented 6 years ago

$ fstar --codegen OCaml --extract InsertionSortFull InsertionSortFull.fst
Extracted module InsertionSortFull
Verified module: InsertionSortFull (1228 milliseconds)
All verification conditions discharged successfully

$ cat InsertionSortFull.ml
open Prims
let rec (sorted : Prims.nat Prims.list -> Prims.bool) =
  fun l  ->
    match l with
    | [] -> true
    | x::[] -> true
    | x::y::xs -> (x <= y) && (sorted (y :: xs))

let (eq : Prims.nat -> Prims.nat -> Prims.nat) =
  fun x  ->
    fun y  -> if x = y then (Prims.parse_int "1") else (Prims.parse_int "0")

let rec (count_elem : Prims.nat -> Prims.nat Prims.list -> Prims.nat) =
  fun x  ->
    fun l  ->
      match l with
      | [] -> (Prims.parse_int "0")
      | hd::tl -> (eq hd x) + (count_elem x tl)

let rec (insert : Prims.nat -> Prims.nat Prims.list -> Prims.nat Prims.list)
  =
  fun x  ->
    fun l1  ->
      match l1 with
      | [] -> [x]
      | hd::tl -> if x <= hd then x :: l1 else hd :: (insert x tl)

let rec (insert_pf :
  Prims.nat -> Prims.nat Prims.list -> Prims.nat Prims.list) =
  fun x  -> fun l1  -> insert x l1
let rec (insertion_sort : Prims.nat Prims.list -> Prims.nat Prims.list) =
  fun l1  ->
    match l1 with | [] -> [] | hd::tl -> insert_pf hd (insertion_sort tl)

Lemmas are erased.

More generally, all pure or ghost computations with "must_erase" types are erased, where must_erase is

must_erase ::= unit
            | Type
            | FStar.Ghost.erased t
            | x:must_erase{t'}            //any refinement of a must_erase type
            | t1..tn -> PURE must_erase _ //any pure function returning a must_erase type
            | t1..tn -> GHOST t' _        //any ghost function

Caroline-xinyue commented 6 years ago

@nikswamy Thank you sincerely for the quick response!

Here is what I understand: the explicit style of proving, e.g. insert_sorted_pf and insert_permutation_pf above, is only called at compile type to assist with Z3 and to speed up the verification. However, during runtime extraction to OCaml, all of these supporting lemmas are erased (since Lemma is a syntactic sugar for a ghost total effect with type must_erase) and are not actually executed at run time. Therefore, run time performance is actually not sacrificed here.

Thanks again for your help!

Best regards, Xinyue

A-Manning commented 6 years ago

@Caroline-xinyue Exactly. Can this issue be closed?