epfl-lara / stainless

Verification framework and tool for higher-order Scala programs
https://epfl-lara.github.io/stainless/
Apache License 2.0
356 stars 51 forks source link

Forall and HOF predicates #94

Closed mantognini closed 6 years ago

mantognini commented 6 years ago

I've been trying to prove a few things with higher order functions and forall. The main trouble I have is to understand how to express things such that stainless can (dis-)prove my lemmas.

Take this toy example.

I basically want to prove stuff like this:

  @induct
  def lemmaV1[T](xs: List[T], p1: T => Boolean, p2: T => Boolean): Boolean = {
    require(forall{ (z: T) => p1(z) ==> p2(z) })
    xs.forall(p1) ==> xs.forall(p2)
  }.holds

for specific predicates:

    val p1 = { (xs: List[T]) => xs.size > 0 }
    val p2 = { (xs: List[T]) => xs.size > 0 && (xs.tail.size == xs.size - 1) }

(The lemma are instantiated with T = List[T'].)

As a matter of fact, I can manually prove the precondition of lemmaV1 holds, but stainless still complains that the VC for the precondition times out.

    assert(forall{ (xs: List[T]) => p1(xs) ==> p2(xs) }) // fine!
    assert(lemmaV1(xss, p1, p2)) // FIXME precond. call lemmaV1 times out

However, if I additionally express Ɐxs: p1(xs) ==> p2(x) using a named function, here property, and add the assertion before calling the lemma, then stainless is happy and confirms that the precondition is valid. (Thanks @jad-hamza for the tip!)

So the question is, what conclusion should we take from this?

romac commented 6 years ago

It is indeed not clear to me as well why assert(lemmaV1[List[T]](xss, { _.size == length && length > 0 }, { _.size > 0 })) would verify but assert(lemmaV1(xss, p1, p2)) would not.

Looking at the generated VCs, we see the following for the first one:

//  - Now solving 'precond. (call lemmaV1[List[T]](xss, p1, p2))' VC for foo @30:12...
 (length > 0 && forall[List[T]](xss, (x$1: List[T]) => {
   require(true)
   size[T](x$1) == length && length > 0
 }) && lemmaV1[List[T]](xss, (x$2: List[T]) => {
   require(true)
   size[T](x$2) == length && length > 0
 }, (x$3: List[T]) => {
   require(true)
   size[T](x$3) > 0
 })) ==> {
   val p1: (List[T]) => Boolean = (xs: List[T]) => {
     require(true)
     size[T](xs) > 0
   }
   val p2: (List[T]) => Boolean = (xs: List[T]) => {
     require((size[T](xs) > 0) ==> (xs ≠ Nil[T]()))
     size[T](xs) > 0 && size[T](tail[T](xs)) == size[T](xs) - 1
   }
   ∀x$58: List[T]. p1.pre(x$58) && ∀x$59: List[T]. p2.pre(x$59) && ∀z: List[T]. (p1(z) ==> p2(z))
 }
// Solving with: nativez3
// - Result for 'precond. (call lemmaV1[List[T]](xss, p1, p2))' VC for foo @30:12:
//  => TIMEOUT

But this for the second one:

//  - Now solving 'body assertion' VC for foo @20:5...
 (length > 0 && forall[List[T]](xss, (x$1: List[T]) => {
   require(true)
   size[T](x$1) == length && length > 0
 })) ==> lemmaV1[List[T]](xss, (x$2: List[T]) => {
   require(true)
   size[T](x$2) == length && length > 0
 }, (x$3: List[T]) => {
   require(true)
   size[T](x$3) > 0
 })
// Solving with: nativez3
//  - Result for 'body assertion' VC for foo @20:5:
//  => VALID

In the second case, Stainless correctly infers the precondition require(true) for both of the lambdas.

But in the first case, Stainless infers the following precondition for p2:

require((size[T](xs) > 0) ==> (xs ≠ Nil[T]()))

And then tries to prove that:

∀x$58: List[T]. p1.pre(x$58) && ∀x$59: List[T]. p2.pre(x$59) && ∀z: List[T]. (p1(z) ==> p2(z))

which it does not seem to be able to do, most likely because p2 has an actual precondition (which we know is always true, but Stainless does not seem to be able to infer it in that case).

Defining p2 as

val p2 = { (xs: List[T]) =>
  require(true)
  xs.size > 0 && (xs.tail.size == xs.size - 1)
}

makes everything verifies.

I'm not exactly sure what is happening here, or why the generated VCs look so different, but this problem seems to be related to the discussion in https://github.com/epfl-lara/stainless/issues/92#issuecomment-339925700 and below.

romac commented 6 years ago

Ping @samarion @jad-hamza

jad-hamza commented 6 years ago

but this problem seems to be related to the discussion in #92 (comment) and below.

Good catch, it does seem to be related to the precondition inference. Adding an assertion assert(forall((xs: List[T]) => p2.pre(xs))) makes the assertion assert(lemmaV1(xss, p1, p2)) go through fine.

Just like this example shows, Inox can sometimes verify formulas A, B, C separately, but not the conjunction A && B && C. Could we make it so that everytime we want to verify a conjunction, multiple simpler VCs are generated, one for each conjunct? I guess when we verify A, we should add it as a premise to the verification of B, and then we should add A and B as a premise for the verification of C.

This has two benefits: 1) the verification will go through more often 2) when the verification fails, we can determine more easily which conjunct fails to verify.

This would greatly help when we are doing a precondition check for a function that has many requirements.

why the generated VCs look so different

I think it's because you're comparing the precondition check of lemmaV1 in one case, to the assertion in itself in the other case. The assertion ("body assertion") should always verify easily, because we are basically asserting true, since we know that lemmaV1 holds. But the precondition check times out in one case.

romac commented 6 years ago

I guess when we verify A, we should add it as a premise to the verification of B, and then we should add A and B as a premise for the verification of C.

As conjunction is commutative, shouldn't we add A && C as a premise for B, A && B for C, and B && C for A? Otherwise we might get different results when verifying A && B && C vs eg. B && A && C, no?

I think it's because you're comparing the precondition check of lemmaV1 in one case, to the assertion in itself in the other case. The assertion ("body assertion") should always verify easily, because we are basically asserting true, since we know that lemmaV1 holds. But the precondition check times out in one case.

Makes sense, thanks

jad-hamza commented 6 years ago

As conjunction is commutative, shouldn't we add A && C as a premise for B, A && B for C, and B && C for A?

I don't think so, otherwise we could verify "false && false && false".

The asymmetry could be explained by the fact that && is not symmetric in the language (e.g. l.length > 5 && l.get(4) would never lead a runtime error, but l.get(4) && l.length > 5 could).

romac commented 6 years ago

Right, makes sense. Forgot that ‘&&’ evaluates left to right 😅

On 3 Nov 2017, at 13:42, Jad notifications@github.com wrote:

As conjunction is commutative, shouldn't we add A && C as a premise for B, A && B for C, and B && C for A?

I don't think so, otherwise we could verify "false && false && false".

The asymmetry could be explained by the fact that && is not symmetric in the language (e.g. l.length > 5 && l.get(4) would never lead a runtime error, but l.get(4) && l.length > 5 could).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

mantognini commented 6 years ago

Closing this one as it's related to the discussion in #92.