dimitriv / Ziria

A domain-specific-language and compiler for low-level bitstream processing.
92 stars 18 forks source link

Remove (most) type annotations from the AST, create unification variables only in TC #66

Open edsko opened 9 years ago

edsko commented 9 years ago

Currently, the AST carries type annotations everywhere. This makes it very inconvenient to write AST transformations. Instead, the AST should carry just enough type annotations so that the type of any expression can easily be computed, but no more.

The current design has some ramifications. Since it's so painful to have types everywhere, what happens is the following:

  1. We type check the user's program
  2. We apply transformations, and return an untyped program
  3. We run the type checker again and panic if it reports an error.

This is somewhat unsatisfactory, and has a knock on effect on the rest of the compiler. In particular, the translation from SrcTy to Ty currently happens in the renamer. However, it should really be done in the type checker because if we have something like

let x = [1,2,3] 
var y : arr[length x] int = ...

then the renamer must bail out with an error because when it finds the type of x it doesn't know that it's an array, whereas the type checker could apply unification at this point. However, if the type checker does the translation from SrcTy to Ty then we cannot call the type checker multiple times, as we currently do.

So once we fix the AST (this will however be a major change which will affect all parts of the compiler) we can then move the translation from SrcTy to Ty to the type checker and avoid calling the type checker multiple types.

dimitriv commented 9 years ago

This is not true of every transformation but its certainly true of vectorization, which currently produces untyped terms. We should instead do what Edsko suggests indeed.

edsko commented 9 years ago

Actually, taking a look at this (for a different reason, actually), there are only two explicit ty occurrences in the AST currently; one in the expression language

  ELetRef :: Name -> Ty -> Maybe (Exp a) -> Exp a -> Exp0 a

and one in the the compututation language

  LetERef :: Name -> Ty -> Maybe (Exp b) -> Comp a b -> Comp0 a b

both of these are indeed redundant. This redundancy comes from declParser

declParser :: BlinkParser (GName SrcTy, SrcTy, Maybe SrcExp)
declParser =
    withPos mkDecl <* reserved "var" <*> identifier
                   <* colon <*> parseBaseType
                   <*> optionMaybe (symbol ":=" *> parseExpr)
  where
    mkDecl p () x ty mbinit = (toName x p (Just ty), ty, mbinit)

Note that mkDecl in fact returns the type twice: as part of the name, and as a separate component of the pair.

So the main place where we carry types is in Names, which we probably want to continue to do. @dimitriv would removing the above two type annotations really make all the difference? Or am I missing something?

edsko commented 9 years ago

Actually, the same redundancy occurs in function arguments

paramsParser :: BlinkParser [(GName SrcTy, SrcTy)]
paramsParser = parens $ sepBy paramParser (symbol ",")
  where
    paramParser = withPos mkParam <*> identifier <* colon <*> parseBaseType
    mkParam p () x ty = (toName x p (Just ty), ty)

where we record the type again on the name and again explicitly. Strangely we don't do this for parameters to comp functions:

compParamsParser :: BlinkParser [(GName SrcTy, CallArg SrcTy (GCTy0 SrcTy))]
compParamsParser = parens $ sepBy paramParser (symbol ",")
  where
    paramParser = withPos mkParam <*> identifier <* colon <*> parseType
    parseType   = choice [ CAExp  <$> parseBaseType
                         , CAComp <$> parseCompBaseType
                         ] <?> "computation parameter type"

    mkParam p () x mty = (toName x p Nothing, mty)

This seems suspicious; @dimitriv is this a bug or is there a reason why we don't record the type on the parameter name here?

edsko commented 9 years ago

(And it occurs again in the list of locals for Prog and functions, but they will soon be gone.)

edsko commented 9 years ago

So here's the reason I'm looking at this now; I've just added the following comment to the renamer:

NOTE [Redundant types]

The AST contains redundant types in a number of places:

In all these cases the source language contains a single type annotation, say

var x : int

but the AST contains the type twice: once as part of the name (x) as once as an explicit type in the AST. This is a problem for the renamer in the case of implicit variables; for instance, when the user writes

var x : arr int

then this "arr int" (with unspecified length) expression occurs, again, twice in the AST. Then when the renamer passes over the tree and starts to introduce type variables, the result will be something like

LetERef (Name "x_1" <arr[_r3] int>) <arr[_r4] int> ...

because the renamer sees two array types with an unspecified size and does not know that they are related.

There are two possible solutions to this. One is to modify the type checker to unify the two types in the type checking phase. This might make sense because it means that if we later introduce incompatible types in these places in the AST we would find it when we type check again. However, since we are planning to remove this redundancy from the AST anyway, for now we've simply ignored the redundant types and instead we do something like for function parameters:

aux (parNm, _parTy) = do
  parNm' <- renameBoundName parNm
  -- parTy' <- renameType parTy -- NOTE [Redundant types]
  let parTy' = fromJust (mbtype parNm')
  return ((parNm, parNm'), (parNm', parTy'))

and likewise for the other cases.

NOTE: For some strange reason we do NOT record the type on the name of comp function parameters, so we don't do this trick in renameCParams.

edsko commented 9 years ago

The real problem is not the explicit occurrences of types in the AST, but rather the use of the labels for types, because this introduces types at every node in the AST. See #69.