diprism / perpl

The PERPL Compiler
MIT License
10 stars 5 forks source link

Separate (Haskell) type for tags, split Var into several types #103

Closed davidweichiang closed 1 year ago

davidweichiang commented 1 year ago

The [Type]s tg1 ... in TmVarG and TmCase are always TpVars, right? I'm not suggesting a change here.

_Originally posted by @ccshan in https://github.com/diprism/perpl/pull/102#discussion_r938392316_

davidweichiang commented 1 year ago

If we don't want unification to become more complicated, though, it should be easy to convert a tag to some kind of Type.

davidweichiang commented 1 year ago

So the minimal change would be something like

type Tag = String
data SProg = ...
    SProgFun Var [Tag] [Var] Term Type  -- or leave as Var?
data Term = ...
  | TmVarG GlobalVar Var [Tag] [Type] [Arg] Type
  | TmCase Term (Var, [Tag], [Type]) [Case] Type

and when it comes time to unify tag variables, we still wrap them inside TpVar?

ccshan commented 1 year ago

I mean like

newtype Tag = TgVar Var
data SProg = ...
    SProgFun Var [Var] [Var] Term Type
data Term = ...
  | TmVarG GlobalVar Var [Tag] [Type] [Arg] Type
  | TmCase Term (Var, [Tag], [Type]) [Case] Type

Using newtype makes the type checker enforce the distinction.

And possibly as a separate later change, replace the type Var by 4 newtypes: term variable, data type name, tag variable, and type variable. Then it seems a Subst should be made of 4 Maps.

Sorry I haven't tried it yet. If it's too verbose, it may help to define a custom Show instance and/or use {-# LANGUAGE OverloadedStrings #-}. But it might be clearer to avoid the latter.

davidweichiang commented 1 year ago

What will be the relationship between Tag = TgVar Var and the replacement for Var for tag variables?

ccshan commented 1 year ago

I would perform the replacement on the Var in newtype Tag = TgVar Var as well. It would make it explicit when we really want to coerce between sorts of variables. So verbose, I know....

davidweichiang commented 1 year ago

So eventually this would be something like

newtype TermVar = String -- lambda
newtype TermName = String -- define
newtype TypeVar = String
newtype TypeName = String -- data
newtype TagVar = String

data SProg =
    SProgFun TermName [TagVar] [TypeVar] Term Type
  | SProgData TypeName [TagVar] [TypeVar] [Ctor]

data Term = 
    TmVarL TermVar 
  | TmVarG TermName [Tag] [Type] [Arg] Type
  | ...
data Type =
    TpVar TypeVar
  | TpData TypeName [Tag] [Type]
  | ..
newtype Tag = 
    TgVar TagVar

?

It does seem verbose, and the naming could be confusing...

edited: tried to choose more regular names, added more examples

davidweichiang commented 1 year ago

Regarding subst, I suspect that each individual subst only operates on one kind of variable (though this needs to be checked) so instead of Subst having four (or five) maps, maybe there can be a single map?

Indeed, there seems to be something funny going on (https://github.com/diprism/perpl/issues/79#issuecomment-1190508602) that would be resolved by making subst and freeVars only operate on certain kinds of variables.

davidweichiang commented 1 year ago

I believe the usages of subst are limited to:

It is never used on tag vars or global defines.

But internally subst does a lot of alpha conversions (of term, type, and tag variables), and the Subst has to keep track of those too.

davidweichiang commented 1 year ago

Similarly, freeVars only looks for one kind of free variable at a time:

davidweichiang commented 1 year ago

I'm giving this a try (just the Type/Tag distinction for now). It is more involved than I thought...

ccshan commented 1 year ago

Thanks! Are you as hopeful as I am about splitting up Var?

davidweichiang commented 1 year ago

I think it will work. I started to at least make Var into a newtype in a separate branch which I haven't pushed yet, but hit a wall at the FGG code. So I switched over to cleaning up the FGG code in #105.

I can also push the branch where Var is being made into a newtype.

davidweichiang commented 1 year ago

I don't know how to get started with this, so am hoping @ccshan will. But #111 is meant to be helpful.

Subst.hs will be the most complicated part of splitting Var, and will have to settle the question of what can be substituted (UsVar and TmVarL can; TmVarG can't).