`prop` and `theorem` forget their quantifier?

robdockins commented 3 years ago

Looking at the datatypes involved in the high-level and low-level Spec datatypes (e.g. SpecItem), it seems to me that properties and theorems are both represented as a bare stream of booleans. It strikes me as odd that we don't retain information about which quantifier was used to state the property, as it makes it impossible later to determine what semantics where intended; in particular, the quantifier for the property is not passed to the askProver method of Copilot.Theorem.Prove.prove, which strikes me as very strange. Am I missing something here?

ivanperez-keera commented 3 years ago

I'm with you. I was also confused.

But I don't see how Spec could have a quantifier to give theorems a stream-based interpretation that also has operational meaning in online RV.

What do you suggest?

robdockins commented 3 years ago

I think the current stream-based interpretation is fine, as long as we also remember the associated quantifier. I don't know that I would necessarily expect a run-time interpretation of these properties, but they are certainly an important distinction when proving properties about a spec. It's important not to assume an existential property as though it is universal (which is unsound); and the proof techniques one can use for universal properties are different than for existential (induction vs model checking).

In principle, one could interpret a universal property as a safety property and monitor it, but the trigger mechanism already basically does that.

RyanGlScott commented 10 months ago

Here is a test case that demonstrates the issue:

{-# LANGUAGE NoImplicitPrelude #-}
module Main (main) where

import Data.Foldable
import Data.Functor

import Copilot.Theorem.What4
import Language.Copilot

trueThenFalses :: Stream Bool
trueThenFalses = [True] ++ constant False

spec :: Spec
spec = do
  void $ prop "forAll trueThenFalses (should be invalid)" (forAll trueThenFalses)
  void $ prop "exists trueThenFalses (should be valid)" (exists trueThenFalses)

main :: IO ()
main = do
  spec' <- reify spec
  results <- prove Z3 spec'
  forM_ results $ \(nm, res) -> do
    putStr $ nm <> ": "
    case res of
      Valid   -> putStrLn "valid"
      Invalid -> putStrLn "invalid"
      Unknown -> putStrLn "unknown"

As its name suggests, exists trueThenFalses (should be valid) should yield valid, but it instead yields invalid. I suspect this is happening because Copilot is dropping the existential quantifier and incorrectly treating the underlying stream as though it were universally quantified.

RyanGlScott commented 3 days ago

Here is the Kind2 equivalent of the program above:

{-# LANGUAGE NoImplicitPrelude #-}
module Main (main) where

import Data.Functor

import Copilot.Theorem.Kind2
import Copilot.Theorem.Prove
import Language.Copilot

trueThenFalses :: Stream Bool
trueThenFalses = [True] ++ constant False

spec :: Spec
spec = do
  void $ theorem "t1" (forAll trueThenFalses) (check (kind2Prover def))
  void $ theorem "t2" (exists trueThenFalses) (check (kind2Prover def))

main :: IO ()
main = void $ reify spec

$ PATH=~/Software/kind2-0.7.2/bin:$PATH runghc Bug.hs
(define-pred top
  ((prop-t1.out Bool)
   (prop-t2.out Bool)
   (s0.out Bool)
   (s0.out.1 Bool))
  (init
    (and
      (= prop-t1.out s0.out)
      (= prop-t2.out s0.out)
      (= s0.out true)
      (= s0.out.1 false)))
  (trans
    (and
      (= (prime prop-t1.out) (prime s0.out))
      (= (prime prop-t2.out) (prime s0.out))
      (= (prime s0.out) s0.out.1)
      (= (prime s0.out.1) false))))

(check-prop
  ((t2 prop-t2.out)))
t2: invalid ()
Finished: t2: proof failed
(define-pred top
  ((prop-t1.out Bool)
   (prop-t2.out Bool)
   (s0.out Bool)
   (s0.out.1 Bool))
  (init
    (and
      (= prop-t1.out s0.out)
      (= prop-t2.out s0.out)
      (= s0.out true)
      (= s0.out.1 false)))
  (trans
    (and
      (= (prime prop-t1.out) (prime s0.out))
      (= (prime prop-t2.out) (prime s0.out))
      (= (prime s0.out) s0.out.1)
      (= (prime s0.out.1) false))))

(check-prop
  ((t1 prop-t1.out)))
t1: invalid ()
Finished: t1: proof failed
Warning: failed to check some proofs.

Both the t1 and t2 proofs fail, but t2 ought to succeed.

Copilot-Language / copilot

`prop` and `theorem` forget their quantifier? #254