diprism / perpl

The PERPL Compiler
MIT License
10 stars 5 forks source link

Merge Ctxt and Env #91

Closed davidweichiang closed 1 year ago

davidweichiang commented 1 year ago

This closes #90, merging the cases of Env into the cases of Ctxt, but it's not ready yet because Ctxt is now kind of messy. Here's a summary of all the cases (tp = type, ps = type variables, tgs = tag variables, cs = constructors).

User-level Scheme-ified Elaborated
local var DefTerm ScopeLocal tp DefTerm ScopeLocal tp DefTerm ScopeLocal tp
define DefTerm ScopeGlobal tp DefSTerm ScopeGlobal (Forall tgs ps tp) DefTerm ScopeGlobal tp
extern DefTerm ScopeGlobal tp DefSTerm ScopeGlobal (Forall tgs* [] tp) DefTerm ScopeGlobal tp
constructor DefTerm ScopeCtor tp DefSTerm ScopeCtor (Forall tgs ps tp) DefTerm ScopeCtor tp
datatype DefData ps cs DefSData tgs ps cs DefData [] cs
type var ** ** **

*An extern can only have tags if it has a type parameter that is ignored. In practice I think we just store [] for the tags. **We currently just don't store type variables in Ctxt.

Is there a way to simplify this?

ccshan commented 1 year ago

Is it time to merge at least merge SProg and Prog into some type constructor that takes an argument that can be () or [Var]?

Maybe also get rid of NoTp...

davidweichiang commented 1 year ago

Do you mean [] or [Var]? Yes, I’m in favor of merging as much as possible, but I usually program in Python…

On Wed, Jul 27, 2022 at 05:24 Chung-chieh Shan @.***> wrote:

Is it time to merge at least merge SProg and Prog into some type constructor that takes an argument that can be () or [Var]?

— Reply to this email directly, view it on GitHub https://github.com/diprism/compiler/pull/91#issuecomment-1196484164, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKVY2ICN6EDX7ZPZSOSO23VWD56JANCNFSM54V6ZOFQ . You are receiving this because you authored the thread.Message ID: @.***>

davidweichiang commented 1 year ago

On the subject of Prog, here's the table for Prog (x = identifier, tp = type, tm = term, p = parameter name, ptp = parameter type, cs = constructors, rtp = return type):

User-level Scheme-ified Elaborated
define UsProgFun x tp tm SProgFun x (Forall tgs ps tp) tm ProgFun x [(p, ptp)] tm rtp
extern UsProgExtern x tp SProgExtern x [ptp] rtp ProgExtern x [ptp] rtp
data UsProgData x ps cs SProgData x tgs ps cs ProgData x cs
ccshan commented 1 year ago

On the subject of Prog, here's the table for Prog

Thanks, the tables are helpful. I'm currently figuring out the ps in ctxtDeclType g y ps ctors...

davidweichiang commented 1 year ago

They are the type variable names -- was there something strange about how they are used?

davidweichiang commented 1 year ago

I think the ps never gets used. The one place where a Ctxt is created on an UsProgs is when alpha-renaming all local variables and type variables, just to ensure that the alpha-renaming doesn't collide with any global names.

davidweichiang commented 1 year ago

Here is my suggestion:

davidweichiang commented 1 year ago

I ended up making more changes than proposed above -- I'm reasonably happy with the way CtxtDef is now.

ccshan commented 1 year ago
* Merge DefData into DefSData, leaving tgs and ps empty when they are not needed.

Hi! I separately worked on using types to enforce "leaving tgs and ps empty when they are not needed." You can see it on the type-params branch now (5eb84c6047553bec8875bdbfe69b6e3e2d93bc77). I don't know that a textual merge would be helpful, but I wanted to discover and communicate how it would look to merge the representation types at different compiler stages, while still enforcing invariants. Overall I'd say it looks promising. We independently came upon splitting DefTerm into DefLocalTerm, DefGlobalTerm, DefCtorTerm. The enforced distinction between Type [] and Type None was occasionally interesting, for instance motivating this change:

diff --git a/src/Transform/Monomorphize.hs b/src/Transform/Monomorphize.hs
index 0200db9..f23af8a 100644
--- a/src/Transform/Monomorphize.hs
+++ b/src/Transform/Monomorphize.hs
@@ -180,7 +182,7 @@ makeInstantiations xis (SProgData y [] [] cs) =
     []
   else
     -- a datatype with no arguments can have only one instantiation, so we don't rename it (see renameCallsTp)
-    [ProgData y cs]
+    [ProgData y [Ctor x (map (renameCallsTp xis) tps) | Ctor x tps <- cs]]
 makeInstantiations xis (SProgData y tgs ps cs) =
     let tiss = Map.toList (xis Map.! y) in
     map (\ (tis, i) ->
davidweichiang commented 1 year ago

Oh, sorry if my changes conflicted with yours.

I had no idea that you could parameterize a type constructor -- is that what is happening in your branch?

davidweichiang commented 1 year ago

Your change at Monomorphize.hs:182 should be fixed in master regardless, right?

davidweichiang commented 1 year ago

Maybe DefCtor should store not a Type but something more specific... (or it should go away and lookupCtorType and lookupTermVar should look inside DefData)

What more specific information does DefCtor need? I think I like keeping it because one of the things Ctxt is used for is avoiding collisions of global names.

Maybe CtxtDef should have just two cases, one for term (which has 3 sub-cases) and one for type...

That is exactly how @HerbertMcSnout had it before, and I could certainly switch it back, basically by reverting https://github.com/diprism/compiler/pull/91/commits/9fd238c3a9ff084f3c02d8182be5924294834d68.

ccshan commented 1 year ago

To clarify:

Maybe DefCtor should store not a Type but something more specific... (or it should go away and lookupCtorType and lookupTermVar should look inside DefData)

What more specific information does DefCtor need? I think I like keeping it because one of the things Ctxt is used for is avoiding collisions of global names.

I meant the fact that splitArrows ctp always has the form (_, TpData y _).

Maybe CtxtDef should have just two cases, one for term (which has 3 sub-cases) and one for type...

That is exactly how @HerbertMcSnout had it before, and I could certainly switch it back, basically by reverting 9fd238c.

I'd still want the DefTerm case to have 3 sub-cases though:

data CtxtDef = DefTerm DefTerm | DefType DefType
data DefTerm =
    DefLocal Type
  | DefGlobal [Var] [Var] Type -- tags, params, type
  | DefCtor [Var] [Var] Type   -- tags, params, type
data DefType =
    DefData [Var] [Var] [Ctor] -- tags, params, ctors
  | DefSynonym [Var] [Var] Type -- tags, params, type
davidweichiang commented 1 year ago

I see, so you suggest DefTerm (DefLocal tp) instead of DefTerm ScopeLocal tp?

ccshan commented 1 year ago

I see, so you suggest DefTerm (DefLocal tp) instead of DefTerm ScopeLocal tp?

Yeah because then DefLocal and DefGlobal can come with different fields.

As usual, it's more verbose for lookupTermVar to be able to return something that could be DefLocal or DefGlobal but not DefData.

davidweichiang commented 1 year ago

OK, done. I think I like keeping DefCtor as it is, but I guess maybe I could be convinced to change it to DefCtor y i where y is the datatype's name and i is an integer?

ccshan commented 1 year ago

OK, done. I think I like keeping DefCtor as it is, but I guess maybe I could be convinced to change it to DefCtor y i where y is the datatype's name and i is an integer?

I'd rather not trade one error "This shouldn't happen" for another. All I meant to suggest was to defunctionalize the only place where DefCtor is constructed (in ctxtDefCtor), which amounts to replacing DefCtor tgs ps tp by DefCtor tgs ps y tps. Just a suggestion...