ucsd-progsys / liquidhaskell

Liquid Types For Haskell
BSD 3-Clause "New" or "Revised" License
1.18k stars 134 forks source link

Don't require datatype specifications which are copies of the Haskell declarations #1904

Open facundominguez opened 2 years ago

facundominguez commented 2 years ago

Verifying the following program fails

data T = T { getT :: Int }

{-@ f :: {v:T | getT v > 0} -> Int @-}
f :: T -> Int
f (T t) = t

main :: IO ()
main = print $ f (T 1)

with error

Test.hs:8:10: error:
    • Illegal type specification for `Main.f`
    Main.f :: v:{v : Main.T | Main.getT v > 0} -> GHC.Types.Int
    Sort Error in Refinement: {v : Main.T | Main.getT v > 0}
    Unbound symbol Main.getT --- perhaps you meant: Main.T ?
    Just
    • 
  |
8 | {-@ f :: {v:T | getT v > 0} -> Int @-}
  |          ^^^^^^^^^^^^^^^^^^^^^^^^^^

Adding a specification to the datatype fixes the error.

{-@ data T = T { getT :: Int } @-}

The fix is easy enough, but the problem is that it is tiresome repetition in an existing codebase with many datatypes.

Is there any chance we can have liquidhaskell pickup data-types which don't have an explicit specification?

ranjitjhala commented 2 years ago

There is some option that does this (I think) @nikivazou do you recall?

On Fri, Nov 12, 2021 at 2:00 PM Facundo Domínguez @.***> wrote:

Verifying the following program fails

{-@ LIQUID "--exact-data-cons" @-}

data T = T { getT :: Int }

{-@ f :: {v:T | getT v > 0} -> Int @-} f :: T -> Int f (T t) = t

main :: IO () main = print $ f (T 1)

with error

Test.hs:8:10: error: • Illegal type specification for Main.f Main.f :: v:{v : Main.T Main.getT v > 0} -> GHC.Types.Int Sort Error in Refinement: {v : Main.T Main.getT v > 0} Unbound symbol Main.getT --- perhaps you meant: Main.T ? Just •
8 {-@ f :: {v:T getT v > 0} -> Int @-}
^^^^^^^^^^^^^^^^^^^^^^^^^^

Adding a specification to the datatype fixes the error.

{-@ data T = T { getT :: Int } @-}

The fix is easy enough, but the problem is that it is tiresome repetition in an existing codebase with many datatypes.

Is there any chance we can have --exact-data-cons pickup data-types which don't have an explicit specification?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ucsd-2Dprogsys_liquidhaskell_issues_1904&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r3JfTqNkpwIJ1InE9-ChC2ld7xwATxgUx5XHAdA0UnA&m=yEB97bR_pt3jOuLe1i1x45rnUya53f0SjaglN5IG2iC8Bpfqx77I6SHG64VKUwz9&s=u0wlffdQhqiFsDR0LqSqFDcpBnd5vK5TL6yUm7qzFKk&e=, or unsubscribe https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AAMS4OE54C6ZCK2PI2Q3AWTULWE55ANCNFSM5H5YQKPA&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r3JfTqNkpwIJ1InE9-ChC2ld7xwATxgUx5XHAdA0UnA&m=yEB97bR_pt3jOuLe1i1x45rnUya53f0SjaglN5IG2iC8Bpfqx77I6SHG64VKUwz9&s=FlMMPGQFN7HceIi_DbwDFF0A_Zl6C7Z0PvnHWgkfbAo&e= . Triage notifications on the go with GitHub Mobile for iOS https://urldefense.proofpoint.com/v2/url?u=https-3A__apps.apple.com_app_apple-2Dstore_id1477376905-3Fct-3Dnotification-2Demail-26mt-3D8-26pt-3D524675&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r3JfTqNkpwIJ1InE9-ChC2ld7xwATxgUx5XHAdA0UnA&m=yEB97bR_pt3jOuLe1i1x45rnUya53f0SjaglN5IG2iC8Bpfqx77I6SHG64VKUwz9&s=TRs1b1dPcyYj3m4_qqvhTtMjJG5QReveNngAnRMnG0A&e= or Android https://urldefense.proofpoint.com/v2/url?u=https-3A__play.google.com_store_apps_details-3Fid-3Dcom.github.android-26referrer-3Dutm-5Fcampaign-253Dnotification-2Demail-2526utm-5Fmedium-253Demail-2526utm-5Fsource-253Dgithub&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r3JfTqNkpwIJ1InE9-ChC2ld7xwATxgUx5XHAdA0UnA&m=yEB97bR_pt3jOuLe1i1x45rnUya53f0SjaglN5IG2iC8Bpfqx77I6SHG64VKUwz9&s=ppkU7r1eZKOOVy0LKR-4u8ym9WS1-6q-PGIZ_6e90E0&e=.

facundominguez commented 2 years ago

Actually --exact-data-cons isn't necessary either in this example. I'm editing the original question to eliminate the noise.

nikivazou commented 2 years ago

I do not think there is already a flag to do that, but we should add it! (exact-data-con generates its own selectors, but we should/can change it to generate the user defined ones).

Fizzixnerd commented 2 years ago

Hi there!

Intro

I have two proposed solutions for this issue, but believe the second to be the way to go.

I have highlighted lines between code blocks that changed by surrounding them by -----.

Solution A

A solution would be to simply promote data constructors with Haskell types to the logic in the most naive way (i.e., without referencing in scope refinements), so that:

{-@ type Nat = { v : Int | v >= 0 } @-}
type Nat = Int

data Nat' = Nat' { getNat' :: Nat }

{-@ f :: { v:Nat' | getNat' v > 0 } -> Int @-}
f :: Nat' -> Int
f (Nat' n) = n

main :: IO ()
main = print $ f (Nat' 1)

would be equivalent to

{-@ type Nat = { v : Int | v >= 0 } @-}
type Nat = Int

-------------------------------------------
{-@ data Nat' = Nat' { getNat' :: Int } @-}
-------------------------------------------
data Nat' = Nat' { getNat' :: Nat }

{-@ f :: { v:Nat' | getNat' v > 0 } -> Int @-}
f :: Nat' -> Int
f (Nat' n) = n

main :: IO ()
main = print $ f (Nat' 1)

where I have inlined the definition of the Haskell type alias Nat to make it clear that it does not refer to the LH Nat.

Issues

Nat' contains an unrefined Int

It is my understanding that there is an implicit connection between the Refinement Type of Nat and the Haskell type of Nat, with the Refinement Type taking precedence automatically in LH signatures. This is witnessed by the following compiling program:

{-@ type T1 = { v : Int | v >= 0 } @-}
type T1 = Int

type T2 = Int

{-@ g :: T1 -> Int @-}
g :: T1 -> Int
g x = x

{-@ h :: T2 -> Int @-}
h :: T2 -> Int
h x = x

The point is that even though T2 does not have an LH signature, it can still be referenced from LH signatures. Likewise, when T1 is defined with an LH signature, the LH signature shadows the Haskell type of T1 in other LH signatures.

I think the solution outlined here would therefore cause much confusion to users, as someone writing data Nat' = Nat' { getNat' :: Nat } would likely assume it was a refined Nat that would be used in the automatically generated definition. This would lead to programs which were assumed to be safe, but which actually aren't and would be especially dangerous to those just getting into LH.

Therefore, a think the better solution is:

Solution B

Whereby

{-@ type Nat = { v : Int | v >= 0 } @-}
type Nat = Int

data Nat' = Nat' { getNat' :: Nat }

{-@ f :: { v:Nat' | getNat' v > 0 } -> Int @-}
f :: Nat' -> Int
f (Nat' n) = n

main :: IO ()
main = print $ f (Nat' 1)

is interpreted as

{-@ type Nat = { v : Int | v >= 0 } @-}
type Nat = Int

-------------------------------------------
{-@ data Nat' = Nat' { getNat' :: Nat } @-}
-------------------------------------------
data Nat' = Nat' { getNat' :: Nat }

{-@ f :: { v:Nat' | getNat' v > 0 } -> Int @-}
f :: Nat' -> Int
f (Nat' n) = n

main :: IO ()
main = print $ f (Nat' 1)

where the Nat referenced in the generated LH signature is the Refinement Type Nat that shares a name with the Haskell type alias Nat. If the LH signature for Nat were removed, the code would still compile, but the Nat in the LH data decl would then refer to an unrefined Nat from Haskell.

Conclusion

I think solution B is the better of the two as it more closely aligns with how LH already works vis a vis Refinement Types with shared names with Haskell types.

Fizzixnerd commented 2 years ago

Hello again!

I have been hard at work on this issue, but I think it may be best to take a different approach than the one originally agreed upon, and I wanted to run a new plan by everyone.

The two parts of the plan are:

For some reason, I was initially under the impression that Part 2 was basically already done by Liquid Haskell (and I also had some misconceptions about how LH actually worked). I have since learned this is not the case, and I think it's best to proceed in two parts to stop the PR from becoming enormous.

Details of Part 2

The idea of Option B was that any refined type alias that shared a name with a regular Haskell alias would use the refined type alias when automatically "lifting" the data declaration to the logic, even without a specification. I would like to extend this to all definitions within a file (notably, all functions). In this way, a refined type alias acts as a sort of invariant for your program. This would change the behavior of existing LH programs, however. Indeed, Solution B alone had this side-effect, but I wanted to ensure this was okay in the expanded case too. I personally think that adding a new flag --no-implicit-association (or something to that effect) as an "opt out" of this new behavior would be the way to go, since I would imagine most users would find this behavior helpful.

Rationale for Part 2

There are two reasons I think we should expand Solution B into Part 2.

The first is that it would be strange for a user to see

{-@ type Nat = {v : Int | v >= 0 } -@}
type Nat = Int

data T = T { getT :: Nat }

and be able to assume that the Nat in the data decl is a refined Nat, but couldn't do the same for the following function in the same file:

f :: T -> Nat
f = getT

With only Solution B and not the rest of Part 2, this would end up being the case, and I think this would be highly confusing to users.

The second (lesser) reason is that it enables users to gradually adopt LH, while at the same time quickly reap its benefits by just adding a couple of specifications to their already existing type aliases. That is, with a similar file as before:

-- I just added this to check out LH and see if it works!
{-@ type Nat = {v : Int | v >= 0 } -@}
type Nat = Int

data T = T { getT :: Nat }

f :: T -> Nat
f = getT

-- LH will reject this!
g :: Int -> Nat
g = getT . T

The user gets lots of benefits from LH quickly and without having to go through their entire source base and annotate all their functions.

Hope I haven't misunderstood something important! Let me know what you think.

Fizzixnerd commented 2 years ago

It seems I misunderstood something important! Liquid does indeed have support for Part 2 outside of data decls. The test I was using was wrong, so sorry!

facundominguez commented 2 years ago

That's good to know. For those of us less involved with the implementation, it could be helpful if you link the functions and/or explain a bit more the features that you encountered which are necessary here. In other words what does "have support for Part 2" mean?