joews / peach

A functional programming language
MIT License
3 stars 0 forks source link

Add a second IR for typed AST Nodes #35

Closed joews closed 7 years ago

joews commented 7 years ago

In #34 I came across part of "the AST typing problem".

I want each compiler stage to map an input AST into a transformed AST. The only transforming stage at the moment is the type checker, which maps Raw AST Nodes into Typed AST Nodes. There could be others in the future.

That's difficult with static types, because we want to change the shape of each Node type. For example:

interface AstDefNode {
  type: string,
  name: string,
  value: AstNode
}

To add type information to this node, there are two changes: Add an exprType: Type - the actual type checker output for this node; and change the value to be a typed node as well:

interface TypedDefNode {
  type: string,
  name: string,
  exprType: Type,
  value: TypedNode
}

The question is: what is the right way to model Node types of different shapes in a statically typed language?

There seem to be three general solutions:

  1. Extend the above examples so that we have N IRs - a full set of Node types for the raw AST and typed AST. If we add another transforming stage, we'll add another set of Node types. This is simple and works well with the current codebase, but adds a lot of duplication and may not scale as the compiler grows.

  2. Add optional fields to the base AstNode interface so that all of the possible transformations may be performed on a single type (I think GWT does this). This works well with the current codebase but loses type safety: a stage before the type checker could try to access node.exprType, which is a null reference.

  3. Store transformation data in external maps (OCaml does this). The type checker would return something like a WeakMap<AstNode, Type>, which later stages can use to get rich data. All stages share the same AST. This is neat, but would involve quite a lot of refactoring.

This PR implements 1. It's pragmatic - there are only 2 IRs, and I don't have plans to add any other stages yet, and it doesn't need much refactoring. If it causes maintainability issues i'll look again at 3.