idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.46k stars 369 forks source link

RFC: Revamp the syntax & semantics of applications (again) #1181

Open Russoul opened 3 years ago

Russoul commented 3 years ago

The initial goal of this proposal was to fix #965 which originates from Idris 1 (as far as I can recall) and was further exacerbated by #820. It also (potentially) unblocks #907. Until #820 the problem could have been solved merely by tightening up the resolution of implicit arguments: make their position in application matter. As of #820 this solution alone is not viable, because we can't differentiate between positional implicit and flavourless named arguments.

That alone suggests that we need extra high-level syntax. At the same time, there have been many requests to supplement the surface language with implicit lambdas and such.

Thus, this proposal also tries to solve the problem of incompleteness of the surface language in full.

Two naive solutions that come to mind are:

Both solutions seem to be unsatisfactory (at least for me). There is a feeling of inconsistency that has its roots in the way haskellers and idrisians (hopefully I spelled that right) are used to writing applications. The notation carried over probably by category theorists to Haskell and subsequently to Idris is: f x as opposed to f(x) - the one we see in many popular languages.

According to this notation, f {x, y, z} or f <x, y, z> - general application with multiple arguments doesn't make sense: it follows the other notation. Apart from that, extra braces can be noisy at times.

We can do better.

First, let's see what kind of options we have, regarding the semantics of the pattern matching definitions. We are interested in the LHS expressions.

Any expression in the argument position of a pattern can be either of these:

At the same time we have three flavours of Pi:

Ideally we want to differentiate between positional and named arguments of each flavour. In addition to that, we want the syntax to preserve the intuition behind f x. Last but not least, we want it to be concise, preferably giving up on {}, <>.

The proposed solution (vaguely) is: x:y@(App) x:y$(App) x:y#(App)

where

x, y are lowercase identifiers, App is a possibly nullary application.

Mnemonically, we pick : to signify named arguments, @ - explicit arguments, $ - implicit arguments, # - auto arguments. x stands for the original name of the given Pi. y is a let-bound variable. App is a pattern application, e.g. Just var or ItIsJust.

The syntax is inseparable, in a sense that one can't insert whitespace between its parts (else it becomes ambiguous). It covers almost every combination there is, see the tables below.

PROPOSED

positional bind infer case mixed
explicit x or x@ _ or _@ or @_ X or (App) x@X or x@(App)
implicit x$ or $x _$ or $_ $X or $(App) x$X or x$(App)
auto x# or #x _# or #_ #X or #(App) x#X or x#(App)

CURRENT

positional bind infer case mixed
explicit x _ X or (App) x@X or x@(App)
implicit NONE NONE NONE NONE
auto @{x} @{_} @{App} @{x@(App)}

PROPOSED

named bind infer case mixed
any x: x:_ x:X or x:(App) NONE
explicit x:x@ x:_@ or x:@_ x:@X or x:@(App) x:y@X or x:y@(App)
implicit x:x$ x:_$ or x:$_ x:$X or x:$(App) x:y$X or x:y$(App)
auto x:x# x:_# or x:#_ x:#X or x:#(App) x:y#X or x:y#(App)

CURRENT

named bind infer case mixed
any {x} {x = _} {x = App} {x = y@App}
explicit NONE NONE NONE NONE
implicit NONE NONE NONE NONE
auto NONE NONE NONE NONE

As you can see, in the edge cases of the original syntax, e.g. in the positional auto mixed case, the syntax is pretty crippled and unintuitive. Conversely, in every instance of the proposed replacement we base on the same versatile core rule and strip off irrelevant parts of it. Almost every variation is realisable.

To befriend the new rule with the one for construction of Pis, we can write: $(x : a) -> b to stand for {x : a} -> b #(x : a) -> b to stand for {auto x : a} -> b $(x : a := t) -> b to stand for {default t x : a} -> b (I remember @ohad didn't like a similar one, but can't recall why ... 😅 ) Same logic applies to non-dependent function types.

Similar looking rule can be made for the updates: x:=y, x:=(Expr), x$=y, x$=(Expr)

where

x is any Pi compatible identifier, y is any identifier, Expr is any complex subexpression.

Updates are composed via regular function composition: x:=x' . y$=(const smth)

Let's not forget about lambdas: we interpret them as anonymous pattern matching clauses. Syntax is fully transferred over, mnemonically \ stands for an anonymous name of the definition.

\arg$ (Just x) num:22 => ?
 arg$ (Just _) num:_  => ?
 arg$ Nothing  num:_  => ?

Now to cope with the rest of #965, we tweak the semantics of named apps in the following way: Position of a named argument still matters. But that argument is not considered until all positional arguments to the left of it are resolved. As soon as that happens, the priority of the argument resolution is given to the named argument, regardless of the inspected Pi flavour (unlike now, where explicit arguments are always tried before any remaining named ones). That resolves the second part of #965, pertaining to the flaw in the named apps:

The first clause is no longer valid under the new semantics.

foo : (a : Void) -> {0 a : Type} -> a
-- The two following implementations of the previous function have the same meaning:
foo {a = b} v       = absurd v
foo {a = v} {a = b} = absurd v

Actually that priority rule was intended to be a part of #820. But implementing it turned out to mess up with the internals in a few places, so I decided to defer that.

I understand that the syntax related part of this PR is quite revolutionary (rebellious). When I first came to write up a proposal for named apps (for the first time), I hinged on the idea that we shouldn't break backward compatibility. But eventually I found out that sometimes we do that. I don't know if the proposed change is fruitful enough to justify that though, so I'd better leave that question to the reviewers 😅 . To mitigate the transition a little bit, we could iterate on the old syntax for some time, until we reach the point where majority of the users are familiar enough with the new one. Besides, the impacted syntax for the most part hasn't been used much yet (?) .

I'll attach a few example cases showing off the new style. If that feels too invasive and conclusively not worth it, we can still fix #965 by picking one of the two options listed at the top. Most of the incompleteness will be dealt with too. On the other hand, we will be left with @{x@(Foo bar)} and others.

The last thing: I volunteer to work on this in any case, as soon as the proposal is reviewed and (if) given the green light. So please share your thoughts, so we can wrap it up for good.

Examples:

import Data.Maybe
import Data.Vect

extract : (mb : Maybe a) -> {auto 0 prf : IsJust mb} -> a
extract (Just x) @{ItIsJust} = x

extract' : (mb : Maybe a) -> #(0 prf : IsJust mb) -> a
extract' (Just x) #ItIsJust = x

record Vec5 a where
  constructor MkVec5
  x : a
  #(y : a)
  $(z : a)
  @(w : a)
  v : a

-- See the `f x y z w v` vs `f{x, y, z, w, v}` intuition:

example2 : {auto _ : Maybe (Vec5 a)} -> Maybe a
example2 @{Just (MkVec5 {y, _})} = Just y
example2 @{Nothing} = Nothing

example2' : #(Maybe (Vec5 a)) -> Maybe a
example2' #(Just (MkVec5 y: _)) = Just y
--  note that this    -----^ is not the same as `y:_`,
-- `y: _` are two arguments, `y:_` is one.
example2' #Nothing = Nothing

fromVect5' : Vect 5 a -> Vec5 a
fromVect5' [x, y, z, w, v] =
  MkVec5 x: y: z: w: v: -- same as MkVec5 x:x y:y z:z w:w v:v
vs
  MkVec5 {x, y, z, w, v} -- same as MkVec5 {x=x, y=y, z=z, w=w, v=v}

fromVect5'' : Vect 5 a -> Vec5 a
fromVect5'' [a, b, c, d, e] =
  MkVec5 x:a y:b
         z:c w:d
         v:e
vs
  MkVec5 { x=a, y=b,
           z=c, w=d,
           v=e
         }

fromVect5''' : (a -> b) -> Vect 5 a -> Vec5 b
fromVect5''' [a, b, c, d, e] =
  MkVec5 x:(f a)
         y:(f b)
         z:(f c)
         w:(f d)
         v:(f e)
vs
  MkVec5 { x=f a
         , y=f b
         , z=f c
         , w=f d
         , v=f e
         }

fromVect5'''' : (a -> b) -> Vect 5 a -> Vec5 b
fromVect5'''' [a, b, c, d, e] =
  MkVec5 x:@(f a) y:#(f b)
         z:$(f c) w:@(f d)
         v:(f e)

match : Vect 5 a -> Vect 5 a
match $ MkVec5 x: y:
               z: w:
               v:     = MkVec5 v: w: z: y: x:

impLambdaId' : $(x : a) -> a
impLambdaId' = \x$ => x
--          or  $x

autoLambdaId' : #(x : a) -> a
autoLambdaId' = \x# => x
--           or  #x

impCaseLambda' : $(x : Nat) -> Bool
impCaseLambda' = \ #22 => True
                  _#   => False
 --           or   #_

flip : (a -> b -> c) -> (b -> a -> c)
flip = \f, y, x => f x y

flip' : (a -> b -> c) -> (b -> a -> c)
flip' = \f y x => f x y

-- contrived, but you get the idea :)
xor : Bool -> Bool -> Bool
xor = \case
        False =>
          \case False => False
                True => True
        True =>
          \case True => False
                False => True

xor' : Bool -> Bool -> Bool
xor' = \False False => False
        False True  => True
        True  False => True
        True  True  => False

updateXY : (x, y : Nat) -> Vec5 Nat -> Vec5 Nat
updateXY x y = { x := x
               , y := y
               , z $= (+3)}

-- will require to support delayed update elaboration
-- to handle things like update composition.
updateXY' : (x, y : Nat) -> Vec5 Nat -> Vec5 Nat
updateXY' x y = x:=x
              . y:=y
              . z$=(+3)
ShinKage commented 3 years ago

Adding to the proposal, I think the compiler must have a deterministic process to give a matchable name to unnamed implicits/explicit bindings. Most of the issues are really a problem of matching unnamed arguments, be it implicits or auto-implicits; with matchables names we can simply use the more sensible and intuitive syntax, without having to introduce lots of unused bindings due to positional arguments. The compiler must also return the given names when asking for type or documentation and it should have an interactive editing command to add implicits on pattern matching clauses or applications.

andylokandy commented 1 year ago

Sound like the root problem is that the named explicit arguments are ambiguous with the implicit arguments. Backward compatibility aside, any problem arising from forbidding name shadowing in the application list (each name is unique regardless of implicit/explicit)?

Russoul commented 1 year ago

Sound like the root problem is that the named explicit arguments are ambiguous with the implicit arguments. Backward compatibility aside, any problem arising from forbidding name shadowing in the application list (each name is unique regardless of implicit/explicit)?

I don't think this is consistent with the way (Pi) binders work in Idris2. Binders in Idris2 allow name shadowing. I don't see why application shouldn't cover that case as well.

andylokandy commented 1 year ago

Name shadowing in the type declaration is unencouraged AFAICS from the warning.

data IsBool: Bool -> Type where 
    IsTrue: IsBool True
    IsFalse: IsBool False

test : (a : Bool) -> (a : IsBool a) -> Bool
Warning: You may be unintentionally shadowing the following local bindings:
  a

src.Main:7:1--7:44
 3 | data IsBool: Bool -> Type where 
 4 |     IsTrue: IsBool True
 5 |     IsFalse: IsBool False
 6 | 
 7 | test : (a : Bool) -> (a : IsBool a) -> Bool
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Just very initial thoughts:

  1. We can either turn the warning into a hard error
  2. or make named application available only when there is no name shadowing.