idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.53k stars 376 forks source link

Type information is lost/changed when inlining a function (and it's not clear why) #3271

Open JavierGelatti opened 6 months ago

JavierGelatti commented 6 months ago

Note: I'm not sure whether this is a bug or I don't fully understand how some language feature is supposed to work (I'm currently learning Idris). If you find it's the latter, I'd greatly appreciate it if you could direct me to resources that can fill the gaps in my understanding :pray:

Steps to Reproduce

In this example, which —correctly— typechecks:

data IsEmpty : List Char -> Type where
    Empty : IsEmpty []

Uninhabited (IsEmpty (x::xs)) where
    uninhabited Empty impossible

decideIsEmpty : (cs : List Char) -> Dec (IsEmpty cs)
decideIsEmpty cs = case cs of
    [] => Yes Empty
    (x :: xs) => No uninhabited

decideIsEmptyString : (s : String) -> Dec (IsEmpty (unpack s))
decideIsEmptyString s = decideIsEmpty (unpack s)

replace the application of decideIsEmpty by its definition (i.e. inline the decideIsEmpty function in decideIsEmptyString). You're left with:

-- ...

decideIsEmptyString : (s : String) -> Dec (IsEmpty (unpack s))
decideIsEmptyString s = case unpack s of
    [] => Yes Empty
    (x :: xs) => No uninhabited

Note that the type of unpack s is List Char, and the expected type for the expression is still Dec (IsEmpty (unpack s)).

Expected Behavior

The resulting code is equivalent to the original program, so it should typecheck.

Observed Behavior

Both Yes Empty and No uninhabited stop being valid, and the program no longer typechecks. The errors are:

Where <<(unpack s)-implementation>> is:

if intToBool (primlt_Int (prim__sub_Int (primcast_IntegerInt (natToInteger (integerToNat (primcast_IntInteger (prim__strLength s))))) 1) 0) then [] else unpack' s (assert_smaller (primsub_Int (primcast_IntegerInt (natToInteger (integerToNat (prim__cast_IntInteger (primstrLength s))))) 1) (primsub_Int (prim__cast_IntegerInt (natToInteger (integerToNat (primcast_IntInteger (primstrLength s))))) 1 - 1)) s [assert_total (primstrIndex s (primsub_Int (primcast_IntegerInt (natToInteger (integerToNat (prim__cast_IntInteger (prim__strLength s))))) 1))]

Gist with complete program

...including the original decideIsEmptyString function and the results of inlining decideIsEmpty: https://gist.github.com/JavierGelatti/3025938e81471b625c8276c49ab3614d

dunhamsteve commented 6 months ago

This one also works:

decideIsEmptyString' : (s : String) -> Dec (IsEmpty (the (List Char) (unpack s)))
decideIsEmptyString' s with (unpack s)
    _ | [] = Yes Empty
    _ | (x :: xs) = No uninhabited

The issue feels subtle to me, but I think it boils down to Idris doesn't know to replace all of the unpack s occurrences with [] in the target type in the first branch (and x :: xs for the second branch). with blocks do make that substitution. If you put ?goal instead of Yes Empty and do :t goal in the repl, you get:

Main> :t goal
   s : String
------------------------------
goal : Dec (IsEmpty [])

More details

When you write a case statement, it is extracted to a helper function. The core language can only have case trees at the top of a top level function. (Dependent pattern matching is subtle.) In addition to the stuff in scope, the scrutinee unpack s is evaluated and passed as an argument to that function. But the return type of the helper function still says unpack s in it, so Idris doesn't know it's related to the argument.

The with blocks get similar treatment, but it is more aggressive about rewriting out the expression in the parens. It has replaced unpack s in the return type:

LOG declare.def.clause.with:3: With function type: (s : String) -> ({warg:0} : (Prelude.Basics.List Char)) -> (Prelude.Types.Dec (Main.IsEmpty {warg:0}[0]))

I don't know if it's intentional that case doesn't do this, I'm still learning how this stuff works and dependent pattern matching is subtle.

Also I don't know a good way to get a look at the types of those synthesized functions from the REPL, so I ran with --log 99 to get that last type.

JavierGelatti commented 6 months ago

@dunhamsteve thanks for your comment!

I didn't know that you could run the repl with --log 99 to get more information! It looks like a handy tool, as it's often difficult for me to debug these kinds of things "from the outside" (i.e. without cloning this repo and going through the execution of the typechecker itself).

I find it interesting that the code typechecks by using with instead of case, as you mentioned:

decideIsEmptyString' : (s : String) -> Dec (IsEmpty (unpack s))
decideIsEmptyString' s with (unpack s)
    _ | [] = Yes Empty
    _ | (x :: xs) = No uninhabited

It'd be nice to have some documentation to better understand what's the expected behavior of case vs. with, and why they should/shouldn't be replaceable (and in which contexts). In the docs, I could only find this page describing with, with no information or comparisons with case: https://docs.idris-lang.org/en/latest/tutorial/views.html

dunhamsteve commented 6 months ago

Yeah, it would be nice to have a discussion of case vs with in the docs. I'm not sure I know with well enough yet to make improvements. I did find the Agda documentation for with to be useful (note that Idris will have _ where Agda has ...). And the natToBin example on the views.html page shows how with can be used to refine the arguments of a function.

I don't know if this is a desired change, but just for fun I tried teaching case to replace the occurrences of the scrutinee in the return type of the generated function with the corresponding parameter. It seems to work with your example and passes tests:

diff --git a/src/TTImp/Elab/Case.idr b/src/TTImp/Elab/Case.idr
index 949df8274..290aa70f2 100644
--- a/src/TTImp/Elab/Case.idr
+++ b/src/TTImp/Elab/Case.idr
@@ -212,8 +212,13 @@ caseBlock {vars} rigc elabinfo fc nest env opts scr scrtm scrty caseRig alts exp
                                          fullImps caseretty_in (TType fc u)
          let casefnty
                = abstractFullEnvType fc (allow splitOn (explicitPi env))
-                            (maybe (Bind fc scrn (Pi fc caseRig Explicit scrty)
-                                       (weaken caseretty))
+                            -- replace scrutinee with scrn in return type
+                            (maybe !(do let binder = (Pi fc caseRig Explicit scrty)
+                                        let env' = binder :: env
+                                        nfscr <- (nf defs env' (weaken scrtm))
+                                        nfcaseretty <- (nf defs env' (weaken caseretty))
+                                        caseretty' <- Normalise.replace defs env' nfscr (Local fc Nothing _ First) nfcaseretty
+                                        pure $ Bind fc scrn binder caseretty')
                                    (const caseretty) splitOn)
          -- If we can normalise the type without the result being excessively
          -- big do it. It's the depth of stuck applications - 10 is already
gallais commented 6 months ago

Note that this leads to a lot of normalisation (and may be missing some occurrences in the context thus leading to ill-typed auxiliary definitions).

dunhamsteve commented 6 months ago

Thanks for taking a look. I suspected there were issues with this change, and I was curious what they were.

It looks like :doc with does mention the rewriting of the target type, but the web documentation does not:

When this intermediate computation additionally appears in the type of the
function being defined, the `with` construct allows us to capture these
occurences so that the observations made in the patterns will be reflected
in the type.

I think this paragraph would be a useful addition to the html docs and will PR that (along with documentation for multi-with).