diprism / perpl

The PERPL Compiler
MIT License
10 stars 5 forks source link

Suggested cleanups #79

Open davidweichiang opened 1 year ago

davidweichiang commented 1 year ago

None of these suggestions changes the behavior of the compiler, but might simplify the code:

davidweichiang commented 1 year ago

However, if either of the latter two is changed, something breaks.

davidweichiang commented 1 year ago
davidweichiang commented 1 year ago

Would it make sense to do something like data Var = TermVar String | TypeVar String | TagVar Int so that (1) in place of tgs ps (tag variables and type variables) getting passed around, it can just be a single list and (2) IsTag = Bool and SolveVars = Map Var IsTag could be eliminated?

ccshan commented 1 year ago

Would it make sense to do something like data Var = TermVar String | TypeVar String | TagVar Int so that (1) in place of tgs ps (tag variables and type variables) getting passed around, it can just be a single list and (2) IsTag = Bool and SolveVars = Map Var IsTag could be eliminated?

I like such separation. But I also want to not be able to represent tgs ps where the TagVars and TypeVars are in shuffled order. Maybe it's too verbose but I could imagine distinguishing TermVar and TypeVar and TagVar as separate types (newtypes, at least initially) and so there would still be two lists, one of TagVars and one of TypeVars, but we would still get rid of IsTag.

In general instead of using Strings as names, we can put more structure into names. For example we can store an Int in each name and rename them apart just by incrementing the Int; rendering those Ints as digits or primes or whatever can wait until Show time. Similarly for the output of monomorphization: we can just puts the type arguments as part of the name, without converting them into inst numbers; rendering those big names can again wait until Show time.

davidweichiang commented 1 year ago

In theory, shuffling could already happen (but never does) because in TpData y ps, ps concats the tag and type variables, and subst allows y to be substituted with another TpData y' ps', giving TpData y' (ps' ++ ps).

I like the idea of more structured names. That sounds like a big undertaking though.

ccshan commented 1 year ago

Come to think of it, if DefCtor and DefData have tags and params as two separate [Var]s (which makes sense), then TpData should have tags and params as two separate lists as well ([Var] and [Type] I guess).

davidweichiang commented 1 year ago

Copying tables from #91 here:

User-level Scheme-ified Elaborated
define UsProgFun x tp tm SProgFun x (Forall tgs ps tp) tm ProgFun x [(p, ptp)] tm rtp
extern UsProgExtern x tp SProgExtern x [ptp] rtp ProgExtern x [ptp] rtp
data UsProgData x ps cs SProgData x tgs ps cs ProgData x cs
User-level Scheme-ified Elaborated
local var DefTerm (DefLocal tp) DefTerm (DefLocal tp) DefTerm (DefLocal tp)
define DefTerm (DefGlobal [] [] tp) DefTerm (DefGlobal tgs ps tp) DefTerm DefGlobal [] [] tp
extern DefTerm (DefGlobal [] [] tp) DefTerm (DefGlobal tgs* [] tp) DefTerm DefGlobal [] [] tp
constructor DefTerm (DefCtor [] [] tp) DefTerm (DefCtor tgs ps tp) DefTerm (DefCtor [] [] tp)
datatype DefType (DefData [] ps cs) DefType (DefData tgs ps cs) DefType (DefData [] [] cs)
type var ** ** **

*An extern can only have tags if it has a type parameter that is ignored. In practice I think we just store [] for the tags. **We currently just don't store type variables in Ctxt.

davidweichiang commented 1 year ago

@ccshan Are you still interested in pursuing https://github.com/diprism/perpl/tree/type-params? Do you want to make it into a PR so it can be discussed?

ccshan commented 1 year ago

@ccshan Are you still interested in pursuing https://github.com/diprism/perpl/tree/type-params? Do you want to make it into a PR so it can be discussed?

Yes, and I can probably do it faster and easier the second time, but it seems like we should avoid having multiple PRs that each make many changes textually speaking?

davidweichiang commented 1 year ago

Here's what Prog and Ctxt currently look like:

x = identifier, tp = type, tm = term, p = parameter name, ptp = parameter type, a = argument, atp = argument type, cs = constructors, rtp = return type, tpps = type parameters, tis = type instances (?), tgs = tags

User-level Scheme-ified Monomorphized Argified
define UsProgFun x tm tp SProgFun x tgs tpps tm tp ProgFun x [] tm rtp ProgFun x [(p, ptp)] tm rtp
extern UsProgExtern x tp SProgExtern x rtp ProgExtern x [] rtp ProgExtern x [ptp] rtp
data UsProgData x tpps cs SProgData x tgs tpps cs ProgData x cs ProgData x cs
User-level Scheme-ified Monomorphized
local var CtTerm (CtLocal tp) CtTerm (CtLocal tp) CtTerm (CtLocal tp)
define CtTerm (CtDefine [] [] tp) CtTerm (CtDefine tgs tpps tp) CtTerm (CtDefine [] [] tp)
extern CtTerm (CtExtern tp) CtTerm (CtExtern tp) CtTerm (CtExtern tp)
constructor CtTerm (CtCtor tpps [] tp) CtTerm (CtCtor tgs tpps tp) CtTerm (CtCtor [] [] tp)
datatype CtType (CtData tpps [] cs) CtType (CtData tgs tpps cs) CtType (CtData [] cs)
type var ** ** **

**We currently just don't store type variables in Ctxt.

davidweichiang commented 1 year ago

Variable usages:

User-Level Scheme-ified Monomorphized Argified
local var *UsVar x *TmVarL x tp *TmVarL x tp *TmVarL x tp
define *UsVar x TmVarG GlDefine x tgs tis [] tp TmVarG GlDefine x [] [] [] tp TmVarG GlDefine x [] [] [(a, atp)] tp
extern *UsVar x TmVarG GlExtern x [] [] [] tp TmVarG GlExtern x [] [] [] tp TmVarG GlExtern x [] [] [(a, atp)] tp
constructor *UsVar x TmVarG GlCtor x tgs tis [] tp TmVarG GlCtor x [] [] [] tp TmVarG GlCtor x [] [] [(a, atp)] tp
datatype TpData x [] tis TpData x tgs tis TpData x [] [] TpData x [] []
type var *TpVar x *TpVar x -- --
davidweichiang commented 1 year ago

I'm editing the above charts to keep them up to date, instead of making new ones.

davidweichiang commented 1 year ago
ccshan commented 1 year ago

Would it be possible/advisable to merge ProgFun/ProgExtern with DefTerm, ProgData with DefType? That is, a Progs would basically be a Ctxt together with an end Term?

That would be nice but I think the merger should happen in the opposite direction: Instead of worrying about the consistency among the DefCtors and DefDatas of a Ctxt when transformed, we would have type Ctxt = [Prog], and ctxtLookupTerm would look through all 3 kinds of Progs because the given Var might be a constructor name. But I understand that ctxtDefLocal is not representable as a Prog, and looking up a Var in the Map that is Ctxt is a little faster than in the list that is [Prog], so maybe a merger is not so simple.

davidweichiang commented 1 year ago