Redundant typing during desugaring

jstolarek commented 5 years ago

The main fragment of our frontend desugaring pipeline looks like this:

((( ExperimentalExtensions.check#program
 ->- before_typing_ext session_exceptions DesugarSessionExceptions.wrap_linear_handlers
 ->- DesugarHandlers.desugar_handlers_early#program
 ->- DesugarLAttributes.desugar_lattributes#program
 ->- RefineBindings.refine_bindings#program
 ->- DesugarDatatypes.program tyenv.Types.tycon_env
 ->- TypeSugar.Check.program tyenv
 ->- after_typing ((DesugarCP.desugar_cp tyenv)#program ->- snd3)
 ->- after_typing ((DesugarInners.desugar_inners tyenv)#program ->- snd3)
 ->- after_typing_ext session_exceptions ((DesugarSessionExceptions.insert_toplevel_handlers tyenv)#program ->- snd3)
 ->- after_typing_ext session_exceptions ((DesugarSessionExceptions.desugar_session_exceptions tyenv)#program ->- snd3)
 ->- after_typing ((DesugarProcesses.desugar_processes tyenv)#program ->- snd3)
 ->- after_typing ((DesugarDbs.desugar_dbs tyenv)#program ->- snd3)
 ->- after_typing ((DesugarFors.desugar_fors tyenv)#program ->- snd3)
 ->- after_typing ((DesugarRegexes.desugar_regexes tyenv)#program ->- snd3)
 ->- after_typing ((DesugarFormlets.desugar_formlets tyenv)#program ->- snd3)
 ->- after_typing ((DesugarPages.desugar_pages tyenv)#program ->- snd3)
 ->- after_typing ((DesugarFuns.desugar_funs tyenv)#program ->- snd3))
  program), ffi_files)

Here @dhil described this fragment of the pipeline as:

(...)

Early source transformations on the AST [Sugartypes -> Sugartypes]

Typechecking the AST [Sugartypes type_env -> Sugartypes type * type_env]

Typability preserving late source transformations [Sugartypes type type_env -> Sugartypes type type_env]

(...)

Furthermore the transformations performed in the second phase are untyped, whilst the transformations performed in the fourth phase are (intended) to preserve typability, in particular, such transformations are obliged to return the type of the transformed expression which may.

In the code above TypeSugar.Check.program tyenv is phase 3, everything after it is phase 4. Now, if I am reading the combinators correctly then transformations in the fourth stage are not exactly type-preserving transformations. They would be if they took an AST with types and produce a transformed AST with transformed types. But this is not what seems to be happening here. What we do instead is discard type information produced by every step (snd3 combinator) so that each subsequent step starts with an untyped tree.

Am I reading this correctly? If yes, what is the point of doing so? It seems like we are doing unnecessary AST traversals and then discarding the results. Wouldn't it be better not to produce all the information that gets discarded anyway and then just have one typing pass at the end?

dhil commented 5 years ago

I think the transforms appearing at after_typing are instances of TransformSugar. Those transforms perform type reconstruction during the traversal of the AST -- this is similar to the IR transforms.

jstolarek commented 5 years ago

I think the transforms appearing at after_typing are instances of TransformSugar.

Yes, that is exactly the case.

jstolarek commented 5 years ago

Ok, I see what's going on and why we can't simply implement these typed desugarings as SugarTraversals.map objects instead of TransformSugar.transform. Firstly, the Sugartypes AST contains some type annotations that get filled in during desugaring. These annotations don't get discarded when we throw away the type generated by transform. A good example is the cp_phrasenode.Offer constructor:

| Offer of (binder * (string * cp_phrase) list)

that gets desugared into phrasenode.Offer:

| `Offer of phrase * (pattern * phrase) list * Types.datatype option

that contains additional Types.datatype option field that is filled in by DesugarCP. Secondly, we generate some typed primitives and in order to do that we need access to type information stored inside transform objects, that is unavailable in map objects. Example, again from DesugarCP:

| Link ({node=c, Some ct; _}, {node=d, Some _; _}) ->
   o, fn_appl_node link_sync_str [`Type ct; `Row o#lookup_effects]
                   [var c; var d],
   Types.make_endbang_type

Here o#lookup_effects is a type information that is only available in TransformSugar.transform, but not SugarTraversals.map.

I still think that repeatedly generating things that get discarded is not a good design (at least not in a strict language), but now I see that changing this requires some deeper redesign and not just a simple refactoring. Perhaps we could have only untyped desugaring passes and then at the end just one pass that fills in all the missing type information.

slindley commented 5 years ago

Building type reconstruction into traversals in order that the types of intermediate terms can be used as part of those traversals is a standard technique (arguably the canonical technique for Church-style languages). If, for instance, only binders are annotated with types, then something like this is inevitable.

Other approaches are possible. For instance one can attempt to annotate every single subterm with its type.

Nowadays, my preferred approach would probably be to adopt a bidirectional type system. This still requires some type reconstruction but fewer redundant annotations.

As you rightly acknowledge, changing to a different approach would require a non-trivial redesign.

jamescheney commented 4 years ago

I think this issue captures discussion relevant to any future work on type-preserving desugaring, as discussed in other more recent issues #865, #882, #732, #636 but I don't think this issue identifies a specific defect that is yet to be resolved, so I'm closing. If we think it should be a goal to redesign the desugaring pipeline to be type-preserving rather than rerunning type inference after each pass then let's create a separate issue/proposal for that.

links-lang / links

Redundant typing during desugaring #489