hackworthltd / primer

A pedagogical functional programming language.
GNU Affero General Public License v3.0
14 stars 1 forks source link

FR: Saturated constructors #870

Closed brprice closed 1 year ago

brprice commented 1 year ago

Description

This is a feature request.

Change how we handle constructors to require them to always be fully saturated, unlike functions. As part of this change, they will become checkable, rather than synthesisable, and will not carry the indices of their datatype -- we will have Maybe Bool ∋ Just True rather than Just @Bool True ∈ Maybe Bool. This will affect both the internals and the API.

Motivation

Currently there is not much symmetry between introduction and eliminations of ADTs. Patterns must be fully saturated, but constructors might not be: case (map Just [True, False]) of [] -> ... ; (x:xs) -> ...; has Just being undersaturated, but both [] and x:xs are fully saturated (here I am ignoring polymorphism/type applications -- just pretend these types are monomorphic). When we represent terms as trees in the API, it is weird to have application nodes in patterns, as we are not really matching on "application" -- indeed, one cannot write a pattern like case (not $ True) of f $ x -> x (which one may imagine was expressible and would evaluate to True). Similarly, when constructing a polymorphic type, one must give the index, but when (deep) matching one cannot match on it, and it is not clear what the syntax/semantics should be here (other than eliding it from the match): case (C @a t) of C @_ True -> e1; C @_ x -> e2 is fine (when Bool ∋ t), but having @_ is odd (one can't give it a fresh name (without introducing a type equality); primer could automatically write the correct, concrete, index there, but that is odd as one cannot manually specify a pattern there contrary to True and x). There are also no "pattern actions" (and will not be, even when/if we had a rich notion of "pattern actions") applicable to these nodes.

Note that this fully-saturated-ness means that some terms will become much more concise. For example, writing the list [1,2,3] out explicitly today is Cons @Int 1 (Cons @Int 2 (Cons @Int 3 (Nil @Int))), but would become Cons 1 (Cons 2 (Cons 3 Nil)). Similarly the nested pairs (('a',1),(True, Λa.λx.x)) would today be MkPair @(Pair Char Int) @(Pair Bool (Λa. a -> a)) (MkPair @Char @Int 'a' 1) (MkPair @Bool @(Λa.a->a) True (Λa.λx.x)), but would become MkPair (MkPair 'a' 1) (MkPair True (Λa.λx.x))

Dependencies

Spec

Constructors will now be always fully-saturated, and only checkable, not synthesisable (although, see "Spec questions"). This will affect both the core language, typechecker and evaluator, as well as the API and actions. Instead of having (roughly)

data Expr =
  | Con ValConName
  | App Expr Expr
  | APP Expr Type
  | ...

where Just True is represented as (Con "Just" `APP` TCon "Bool") `App` Con "True" we would have

data Expr =
  | Con ValConName [Expr]
  | App Expr Expr
  | APP Expr Type
  | ...

where Just True is represented as Con "Just" [Con "True" []]. The typing rule would change from

data T a = C A[a] | ... is defined
----------------------------------
Con C ∈ ∀a. A[a] -> T a

to

data T a = C A[a] | ... is defined
A[S] ∋ t
----------------------------------
T S ∋ C [t]

(Note that there is only one rule per constructor, applying when that constructor is fully saturated) We can also relax the stipulation that constructor names are globally unique, if desired. (However, see "Spec questions")

Evaluation does not need to change, other than to adapt to the changed Expr definition (since all eliminations of ADTs are already fully-saturated, and the typechecker will rule out any Apps with a constructor in function position -- this won't be guaranteed by Haskell's type system, so to handle this "impossible" path we could just throw an error; I don't suggest appending to the list, since that wouldn't work for APP nodes and may delay detection of bugs.)

The API will change to reflect this core change. The "raw"/"rich" api will have the simple reflection. The openapi will change how it outputs trees -- the Tree type will not change, but we will now emit different trees:

Tree {flavor=App,body=NoBody,childTrees=[ Tree {flavor=APP
                                               ,body=NoBody
                                               ,childTrees=[
                                                    Tree {flavor=Con,body="Just",childTrees=[]}
                                                  , Tree {flavor=TCon,body="Bool",childTrees=[]}
                                                  ]}
                                        , Tree {flavor=Con,body="True",childTrees=[]}]}`

will now become

Tree {flavor=Con,body="Just",childTrees=[Tree {flavor=Con,body="True",childTrees=[]}]}

Actions will change since some actions no longer make sense. The "insert a constructor" and "insert a refined constructor" actions will be removed, but the "insert a saturated constructor" one will stay (perhaps being renamed). Possibly we could not offer the constructor action in a hole with function type, or maybe have a fancy "insert an eta-expanded constructor" (see Spec questions). We should possibly not offer the x ~> x $ ? action on a constructor (since it would only ever do C s t ~> {? (C s t) $ ? ?}).

Spec questions

This FR calls for constructors being checkable (only), and fully-saturated. There is an argument that this would be bad UX (note that some of the implications are discussed below), in which case we have three options

  1. do not implement this FR
  2. implement this FR anyway, if this poor UX is overwhelmed by better UX and pedagogy due to the enhanced symmetry between constructions and eliminations/patterns.
  3. modify this FR to improve matters

To attempt (3), there are two things to try. (Though, to be clear, I am not convinced whether we are in world (3), or whether these suggestions improve matters.)

Potentially (it seems plausible, but I haven't thought hard about it) that we could make fully-applied constructors synthesisable, assuming that they are uniquely named. However, we would not be able to (without type inference, and even then not in all cases) infer the indices/parameters, but we could synthesise a hole-y type: Just ? ∈ Maybe ? and Just True ∈ Maybe ? are easy (note that the second could be improved with inference, but the first could not be).

To lessen the burden of not having partially-applied constructors, we could generate/sugar (Coq-style) a function for each constructor which does the "eta-expansion": Cons x xs being the pedantic constructor which must be saturated and is only checkable, and cons : ∀a. a → List a → List a being a normal top-level definition, defined as cons = Λa. λx. λxs. Cons x xs which can be partially applied and is synthesisable. Notice that cons is polymorphic (and can/must be applied to a type argument). However, it does seem plausible that it would lead to more confusion.

Implementation details

Since this directly touches the core Expr type, it is difficult to see how to break the work into small pieces. I suspect that it is possible to mostly decouple the OpenAPI changes from the core changes. We could either modify the API first (have a special case that detects constructor-headed spines -- this would not ensure fully-saturated-ness, but this is only a temporary stopping point), or last (render Cons x xs as a tree headed by APP -- we should have all the type information available from the TC, and can always put holes in the types if needed.) Whilst this decoupling is fairly pointless on its own (the OpenAPI changes are small), it does enable us to split the core changes into finer pieces without much externally-visible API churn. In the core, we could split into steps:

Not in spec

This spec does not cover any "fancy" features that are not a part of current primer. At first glance, I don't foresee any problematic interactions with future extensions. However, I wanted to point out one feature of this current spec is that constructors and (symmetrically) patterns will never contain any type( variable)s. This would not be the case if we had existentials/GADTs (i.e. constructors which contain type arguments which are not parameters of the constructor's datatype). For example, Haskell's

data WrappedAny where
   Wrap :: forall a . a -> WrappedAny

could be translated by

TYPE ∋ A 
A ∋ a
---------------------
WrappedAny ∋ Wrap A a

e ∈ WrappedAny
α:*, a:α ⊢ T ∋ t
-------------------
T ∋ match e with
      Wrap α a -> t

Discussion

This change will have implications for what programs are accepted.

The fact that constructors are only checkable means that one can put them in fewer positions:

The fact that constructors are only fully-applied means that they act less like functions. In particular, one has to eta-expand more often. The clearest example is when using map: one has to write map (λx.Just x) [True,False].

Future work

dhess commented 1 year ago

Thanks! This is very helpful. Let's discuss in the next developer meeting. One thing I'd like to understand better is how the "worse UX" of this proposal manifests. (I think you have pointed out at least some examples of this in the Discussion section.)

georgefst commented 1 year ago

This is great. The motivation section explains the upsides better than any of us have managed before now.

The "insert a constructor" and "insert a refined constructor" actions will be removed, but the "insert a saturated constructor" one will stay (perhaps being renamed).

Naming the remaining action simply "insert a constructor" seems the obvious choice.

Possibly we could not offer the constructor action in a hole with function type

Wouldn't this be inconsistent with how we treat other actions? For example, we currently allow inserting a variable with non-function type here. Maybe it would be okay in beginner mode.

We should possibly not offer the x ~> x $ ? action on a constructor (since it would only ever do C s t ~> {? (C s t) $ ? ?}).

Agreed. Though again, I worry about this being inconsistent with other non-function expressions.

To lessen the burden of not having partially-applied constructors, we could generate/sugar (Coq-style) a function for each constructor which does the "eta-expansion"

My preference here would be, if anything, to have an action for generating these functions on the fly (this may be what you meant elsewhere by "insert an eta-expanded constructor"). This is essentially equivalent to your proposal but with the functions (which would be small) always inlined, avoiding pollution of the namespace.

This FR calls for constructors being checkable (only), and fully-saturated. There is an argument that this would be bad UX (note that some of the implications are discussed below)

The fact that constructors are only checkable means that one can put them in fewer positions

This is all a bit concerning. Like @dhess, I'll wait until our dev meeting for some clarifications here, as I don't currently fully understand all the trade-offs, including exactly how inference differs from synthesis.

Spitballing a little, perhaps we should require that constructors are also "saturated" with type arguments, if this is enough to recover synthesisability? This might also help with your point about extending to GADTs, though I haven't thought this through. It would, however, negate the section in the motivation about type applications in patterns being odd, and wanting to get rid of them. EDIT: I've gone off this idea. We'd also lose the fact that expressions like Cons 1 (Cons 2 (Cons 3 Nil)) get much simpler (which should maybe be in the motivation section). Besides, after today's meeting, I realise that the current situation with the occasional annotation needing to be inserted, at worst, isn't as bad as I feared.

brprice commented 1 year ago

(These are mostly notes to myself)

The current view on this FR is that it is worth implementing so we can have an actual artifact to inspect and decide if the UX tradeoffs are worth it. We expect that they will be, so this implementation should be with a view to merging.

When implementing I should

Some extensions/metatheory to consider

georgefst commented 1 year ago

Something that just occurred to me: could we ensure that the new API assigns IDs to pattern constructors, as we already do for pattern variables?

This would make things easier for the frontend, since then every node could be selectable (except for pattern boxes, but those don't really look like normal nodes anyway), and it would open the door for actions to be performed on them, such as re-ordering. This would have been less achievable if we kept pattern-application nodes.

dhess commented 1 year ago

I am 100% on board for that proposal. It's never been right that we can't select pattern constructors, even if we can't do anything with them.

georgefst commented 1 year ago

I'd kind of like to find some sort of action that we could offer on them, just so that the action panel is never empty. But I can't think of anything useful.

dhess commented 1 year ago

This is getting a bit off-topic for this FR, but in lieu of a full-blown pattern language, which we've agreed should probably be a Primer 2.0 feature, a nice middle ground might be to support a single wildcard pattern. I can imagine it working something like this:

  1. Select one or more constructors that you want to replace in a given match with expression (shift-click to add them one at a time, or maybe drag a selection box around them, if they're contiguous).
  2. As soon as one or more are selected, offer a "Replace constructor with wildcard" action.
brprice commented 1 year ago

Something that just occurred to me: could we ensure that the new API assigns IDs to pattern constructors, as we already do for pattern variables?

I think this is orthogonal to this FR: I am not proposing to touch patterns at all. These changes mean actual constructions will visually look like patterns, but this achieved by modifying constructors-in-terms, and not touching pattern-constructors.

That said, I would like a pattern language -- at the very least we need the ability to select a finite number of alternatives and match the rest with a wild card to be able to match on primitive numbers. I merely think it should be done in its own FR. (I would be happy to work on that after this one is implemented, and have some thoughts on how it may work, or could leave this for someone else.)

georgefst commented 1 year ago

I think this is orthogonal to this FR

Fair enough, I'll take your word for it. I had assumed that this would be a good time to do it, since we're already needing to modify the trees we output for patterns (to remove application nodes, which currently we also have to generate mock IDs for on the frontend).

brprice commented 1 year ago

I think this is orthogonal to this FR

Fair enough, I'll take your word for it. I had assumed that this would be a good time to do it, since we're already needing to modify the trees we output for patterns (to remove application nodes, which currently we also have to generate mock IDs for on the frontend).

Ah, I see what you are getting at now. To clarify a few points:

georgefst commented 1 year ago

Got you. Yes, I was talking about adding IDs to patterns in the core, and now that I think about it I understand that patterns aren't actually changing here.

dhess commented 1 year ago

For posterity, in this comment, I'm adding our notes from our 2023-02-15 developer meeting, during which we discussed this FR. (Some of these notes may already be incorporated into the FR proposal and/or other comments in the thread, but I don't want to miss anything, at the risk of being redundant.)

Potential UX issues:

brprice commented 1 year ago

This FR is implemented in #958, #959, #960, #961 and #962

brprice commented 1 year ago

Possibly we could not offer the constructor action in a hole with function type, or maybe have a fancy "insert an eta-expanded constructor" (see Spec questions).

This is not implemented. I have opened #975 for this idea