epfl-lara / leon

The Leon system for verification, synthesis, repair
Other
161 stars 49 forks source link

Bug in proofs #279

Open BLepers opened 7 years ago

BLepers commented 7 years ago

Pretty serious bug, or did I miss something?

import leon.collection._
import leon.lang._
import scala.Any

object Scheduler {
   def a(i: BigInt):BigInt = {
      i - 1;
   } ensuring(res => b(i) && res == 2)

   def b(i: BigInt): Boolean = {
      val c = a(i);
      (i == 1) ==> (c == 0)
   }.holds
}

(This code verifies on b3ad5fc57a7469bd8f3b3afbe18a56b54f9d8f3b, but shoudn't because a doesn't always return 2...)

samarion commented 7 years ago

Leon verification is only sound modulo termination.

If you run your example through the termination checker (--termination argument), it will give you an input to a or b that introduces non-termination.

Cheers, Nicolas

On 09. 12. 16 17:13, Baptiste Lepers wrote:

Pretty serious bug, or did I miss something?

import leon.collection. import leon.lang. import scala.Any

object Scheduler { def a(i:BigInt):BigInt = { i- 1; } ensuring(res=> b(i)&& res== 2)

def  b(i:BigInt):  Boolean  =  {
   val  c  =  a(i);
   (i==  1)==>  (c==  0)
}.holds

}

(This code verifies on b3ad5fc https://github.com/epfl-lara/leon/commit/b3ad5fc57a7469bd8f3b3afbe18a56b54f9d8f3b, but shoudn't because a doesn't always return 2...)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/epfl-lara/leon/issues/279, or mute the thread https://github.com/notifications/unsubscribe-auth/ADKzhG5yPfrcvdoETeKQjET44vCebWEpks5rGX4-gaJpZM4LJGKZ.

larsrh commented 7 years ago

Strangely enough the Isabelle solver thinks a and b are terminating. Leon claims that b(0) doesn't terminate but Isabelle evaluates that to false. (I'm guessing because i == 1 evaluates to false, so the conclusion is never actually executed.)

BLepers commented 7 years ago

Ok... Would it be possible to give an error in that case? It is a bit strange to see that this code passes verification :)

samarion commented 7 years ago

Well, detecting non-termination and/or proving termination can be quite expensive (typically much more so than verification). The extra wait time would probably be a significant pain in verification efforts.

Maybe it would be useful to run a dumbed-down and faster version of the termination checker to try to detect potential non-termination, but in practice you would probably get lots of spurious false positives or miss real cases of non-termination.

We would need a set of "concrete" benchmarks where verification was hurt because of non-termination to determine what checks we should be running.

@lars: this is pretty weird. What kind of operational semantics does Isabelle use for this case? Does the example pass verification too?

Cheers, Nicolas

On 09. 12. 16 17:29, Baptiste Lepers wrote:

Ok... Would it be possible to give an error in that case? It is a bit strange to see that this code passes verification :)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/epfl-lara/leon/issues/279#issuecomment-266057103, or mute the thread https://github.com/notifications/unsubscribe-auth/ADKzhEHMpRCkTXppFS571BIqYKpwhUr0ks5rGYHtgaJpZM4LJGKZ.

larsrh commented 7 years ago

It turns out my assessment was wrong. Here's what Isabelle sees:

fun a :: "int ⇒ int" and b :: "int ⇒ bool" where
"a i = i - 1" |
"b i = (i = 1 ⟶ a i = 0)"

Hence, the function definitions themselves are definitely terminating, because there is no call to b in a (the Isabelle backend will – by default – not include the pre- and postconditions in the function definition). If the functions are annotated with @isabelle.fullBody, then Isabelle sees the call to b and correctly refuses to define the function. I'm unsure what to make of all this.

larsrh commented 7 years ago

In essence, the question is: How do you deal with pre- and postconditions introducing complications in the call graph?

larsrh commented 7 years ago

By the way, the Isabelle backend is supposed to always give sound results even without --termination switched on. So there is definitely an issue here somewhere.

vkuncak commented 7 years ago

I think that @BLepers is not specifically talking about the Isabelle back end. Termination checker should be on by default and turning it off explicitly should be allowed but known to cause soundness issues if the functions are not terminating. @BLepers , let us know what termination checker reports and if it is too weak in some cases.

BLepers commented 7 years ago

The termination checker says that a does not finish:

[  Info  ] ║ apply       ✓ Terminates (Non-recursive)                                           
[  Info  ] ║ apply       ✓ Terminates (Non-recursive)                                          
[  Info  ] ║ a           ✗ Non-terminating for call: a(0)                                     
[  Info  ] ║ b           ✗ Calls non-terminating functions a 

So I guess it makes sense that the verification phase is not sound for that example (even though it should probably fail by default as Viktor said because errors like that can easily be made).

What I find a bit confusing in that example is that a() never "really" calls b(). I.e., if you run the applicative code without the "ensuring", there is no infinite loop. (Basically I just tried to separate a bit the proof logic - b() - and the applicative code - a()).

samarion commented 7 years ago

The guarantee Leon is giving you when it says "valid" is that if you run the code, every time you encounter a contract, it won't evaluate to false. In this case, Leon is telling you the contract of a() (namely b()) will never evaluate to false (which is true since it won't terminate). However, you didn't guarantee that the contract would evaluate to true, which is typically what you would want when you say something is verified.

Note that with termination (and a few other checks that guarantee the program won't crash), never evaluating to false is equivalent to always evaluating to true.