miking-lang / miking

Miking - the meta viking: a meta-language system for creating embedded languages
Other
51 stars 31 forks source link

Refactoring `inexpr` #826

Open elegios opened 6 months ago

elegios commented 6 months ago

We have a decent amount of code needing to traverse Exprs with inexpr fields differently (often to traverse the "spine" of a program), as well as a good amount of overlap between Expr and Decl of the mlang ast. I think we could address both of these in one go.

Core idea: introduce a new term TmIn representing anything with an inexpr:

TmIn {decl : Decl, inexpr : Expr, ...}

At this point everything that currently has an inexpr can be replaced by usage of this term and a Decl. TmIn is essentially a kind of sequencing construct, for putting something with a side-effect before an expression, for a very broad definition of side-effect:

Adding new terms of this kind would not make spine traversal incorrect, because TmIn would already be covered by such code. I also strongly suspect that there are points in the compiler (or one of the satellite projects) that miss at least one case here, because it would likely produce fairly subtle errors most of the time.

This should mean that we can share more code between mexpr and mlang, because we can presumably use exactly the same code to handle a Decl and then only handle the sequencing in TmIn.


Drawbacks:

johnwikman commented 6 months ago

I think that this looks really nice! The name is fine by me, but an alternative could also be TmDecl to more reflect that we are declaring something in an expression.

W.r.t. smap, it feels more prone to human error since one may miss to include a check for contained declarations, unlike now when it is expressions all the way down. I guess you might want something like an smap_Expr_DeclExpr which applies a function on both expressions and declarations.

Since the goal of this is to be able to share more code between mlang and mexpr, an alternative could be to instead embed Expr's in Decl's, such as DeclBind {bindexpr : Expr, ...} where you are binding an expressing on a top-level instead of into the next expression. This probably works better with the smap case, at the cost of having "dead nodes" in the AST which may cause confusion and bugs.

elegios commented 6 months ago

It's not necessarily strictly declaring something, it would include utest as well, though I suppose that's already included in Decl, so TmDecl is probably a better name.

marten-voorberg commented 6 months ago

We should also include TmUse since TmUse in mlang/ast.mc also follows the inexpr pattern.

marten-voorberg commented 5 months ago

This change would likely also make compilation from MLang to MExpr more elegant. In the current situation where we have contructors such as TmLet and DeclLet in MExpr and MLang respectively, the compilation involves a manual, error-prone translation from one into the other as seen below:

  sem compileDecl : CompilationContext -> Decl -> CompilationResult
  sem compileDecl ctx = 
  | DeclLet d -> _ok (
    withExpr ctx (TmLet {ident = d.ident,
                         tyAnnot = d.tyAnnot,
                         tyBody = d.tyBody,
                         body = d.body,
                         info = d.info,
                         ty = tyunknown_,
                         inexpr = uunit_}))
  | DeclRecLets d -> _ok (
    withExpr ctx (TmRecLets {bindings = d.bindings,
                             inexpr = uunit_,
                             ty = tyunknown_,
                             info = d.info}))
-- And more cases to follow...