Suggested cleanups - Githubissues

davidweichiang commented 1 year ago

None of these suggestions changes the behavior of the compiler, but might simplify the code:

[x] #84
[x] Currently a datatype's type parameters are stored in the Env or Ctxt, while a define's type parameters are stored in Scheme/Forall. Can we treat them both the same way? (Not sure which is better.)
[x] #90
[ ] There's a lot of duplicate code between UsProgs, SProgs, and Progs. Would it be possible to merge them?

davidweichiang commented 1 year ago

[x] Sort of related to the first item, subst and freeVars seem to have a slightly inconsistent definition of variable:
- In UsVar x, x is a variable even if global
- In TmVarG x, x is not a variable
- In TpVar x, x is a (type) variable even if global (a datatype name)

However, if either of the latter two is changed, something breaks.

davidweichiang commented 1 year ago

[x] Both newVar and freshVar seem to have similar purposes, but I'm not sure if they can be merged.

davidweichiang commented 1 year ago

Would it make sense to do something like data Var = TermVar String | TypeVar String | TagVar Int so that (1) in place of tgs ps (tag variables and type variables) getting passed around, it can just be a single list and (2) IsTag = Bool and SolveVars = Map Var IsTag could be eliminated?

ccshan commented 1 year ago

Would it make sense to do something like data Var = TermVar String | TypeVar String | TagVar Int so that (1) in place of tgs ps (tag variables and type variables) getting passed around, it can just be a single list and (2) IsTag = Bool and SolveVars = Map Var IsTag could be eliminated?

I like such separation. But I also want to not be able to represent tgs ps where the TagVars and TypeVars are in shuffled order. Maybe it's too verbose but I could imagine distinguishing TermVar and TypeVar and TagVar as separate types (newtypes, at least initially) and so there would still be two lists, one of TagVars and one of TypeVars, but we would still get rid of IsTag.

In general instead of using Strings as names, we can put more structure into names. For example we can store an Int in each name and rename them apart just by incrementing the Int; rendering those Ints as digits or primes or whatever can wait until Show time. Similarly for the output of monomorphization: we can just puts the type arguments as part of the name, without converting them into inst numbers; rendering those big names can again wait until Show time.

davidweichiang commented 1 year ago

In theory, shuffling could already happen (but never does) because in TpData y ps, ps concats the tag and type variables, and subst allows y to be substituted with another TpData y' ps', giving TpData y' (ps' ++ ps).

I like the idea of more structured names. That sounds like a big undertaking though.

ccshan commented 1 year ago

Come to think of it, if DefCtor and DefData have tags and params as two separate [Var]s (which makes sense), then TpData should have tags and params as two separate lists as well ([Var] and [Type] I guess).

davidweichiang commented 1 year ago

Copying tables from #91 here:

	User-level	Scheme-ified	Elaborated
define	UsProgFun x tp tm	SProgFun x (Forall tgs ps tp) tm	ProgFun x [(p, ptp)] tm rtp
extern	UsProgExtern x tp	SProgExtern x [ptp] rtp	ProgExtern x [ptp] rtp
data	UsProgData x ps cs	SProgData x tgs ps cs	ProgData x cs

	User-level	Scheme-ified	Elaborated
local var	DefTerm (DefLocal tp)	DefTerm (DefLocal tp)	DefTerm (DefLocal tp)
define	DefTerm (DefGlobal [] [] tp)	DefTerm (DefGlobal tgs ps tp)	DefTerm DefGlobal [] [] tp
extern	DefTerm (DefGlobal [] [] tp)	DefTerm (DefGlobal tgs* [] tp)	DefTerm DefGlobal [] [] tp
constructor	DefTerm (DefCtor [] [] tp)	DefTerm (DefCtor tgs ps tp)	DefTerm (DefCtor [] [] tp)
datatype	DefType (DefData [] ps cs)	DefType (DefData tgs ps cs)	DefType (DefData [] [] cs)
type var	**	**	**

*An extern can only have tags if it has a type parameter that is ignored. In practice I think we just store [] for the tags. **We currently just don't store type variables in Ctxt.

davidweichiang commented 1 year ago

@ccshan Are you still interested in pursuing https://github.com/diprism/perpl/tree/type-params? Do you want to make it into a PR so it can be discussed?

ccshan commented 1 year ago

@ccshan Are you still interested in pursuing https://github.com/diprism/perpl/tree/type-params? Do you want to make it into a PR so it can be discussed?

Yes, and I can probably do it faster and easier the second time, but it seems like we should avoid having multiple PRs that each make many changes textually speaking?

davidweichiang commented 1 year ago

Here's what Prog and Ctxt currently look like:

x = identifier, tp = type, tm = term, p = parameter name, ptp = parameter type, a = argument, atp = argument type, cs = constructors, rtp = return type, tpps = type parameters, tis = type instances (?), tgs = tags

	User-level	Scheme-ified	Monomorphized	Argified
define	UsProgFun x tm tp	SProgFun x tgs tpps tm tp	ProgFun x [] tm rtp	ProgFun x [(p, ptp)] tm rtp
extern	UsProgExtern x tp	SProgExtern x rtp	ProgExtern x [] rtp	ProgExtern x [ptp] rtp
data	UsProgData x tpps cs	SProgData x tgs tpps cs	ProgData x cs	ProgData x cs

	User-level	Scheme-ified	Monomorphized
local var	CtTerm (CtLocal tp)	CtTerm (CtLocal tp)	CtTerm (CtLocal tp)
define	CtTerm (CtDefine [] [] tp)	CtTerm (CtDefine tgs tpps tp)	CtTerm (CtDefine [] [] tp)
extern	CtTerm (CtExtern tp)	CtTerm (CtExtern tp)	CtTerm (CtExtern tp)
constructor	CtTerm (CtCtor tpps [] tp)	CtTerm (CtCtor tgs tpps tp)	CtTerm (CtCtor [] [] tp)
datatype	CtType (CtData tpps [] cs)	CtType (CtData tgs tpps cs)	CtType (CtData [] cs)
type var	**	**	**

**We currently just don't store type variables in Ctxt.

davidweichiang commented 1 year ago

Variable usages:

	User-Level	Scheme-ified	Monomorphized	Argified
local var	*UsVar x	*TmVarL x tp	*TmVarL x tp	*TmVarL x tp
define	*UsVar x	TmVarG GlDefine x tgs tis [] tp	TmVarG GlDefine x [] [] [] tp	TmVarG GlDefine x [] [] [(a, atp)] tp
extern	*UsVar x	TmVarG GlExtern x [] [] [] tp	TmVarG GlExtern x [] [] [] tp	TmVarG GlExtern x [] [] [(a, atp)] tp
constructor	*UsVar x	TmVarG GlCtor x tgs tis [] tp	TmVarG GlCtor x [] [] [] tp	TmVarG GlCtor x [] [] [(a, atp)] tp
datatype	TpData x [] tis	TpData x tgs tis	TpData x [] []	TpData x [] []
type var	*TpVar x	*TpVar x	--	--

substitutable by subst and considered free by freeVars

davidweichiang commented 1 year ago

I'm editing the above charts to keep them up to date, instead of making new ones.

davidweichiang commented 1 year ago

[ ] Would it be possible/advisable to merge ProgFun/ProgExtern with DefTerm, ProgData with DefType? That is, a Progs would basically be a Ctxt together with an end Term?

ccshan commented 1 year ago

Would it be possible/advisable to merge ProgFun/ProgExtern with DefTerm, ProgData with DefType? That is, a Progs would basically be a Ctxt together with an end Term?

That would be nice but I think the merger should happen in the opposite direction: Instead of worrying about the consistency among the DefCtors and DefDatas of a Ctxt when transformed, we would have type Ctxt = [Prog], and ctxtLookupTerm would look through all 3 kinds of Progs because the given Var might be a constructor name. But I understand that ctxtDefLocal is not representable as a Prog, and looking up a Var in the Map that is Ctxt is a little faster than in the list that is [Prog], so maybe a merger is not so simple.

davidweichiang commented 1 year ago

[ ] It would be nice to eliminate NoTp.

diprism / perpl

Suggested cleanups #79