Currently we have a few places where we do much more typechecking than is strictly necessary. For example when we edit a type signature, we recheck the entire program (since any definition may refer to the edited one, and thus need rechecking). We could avoid much of this work if we only re-checked those which actually depended on it. A similar thing happens when editing the body of a definition: for example, changing f (λx. g x) t to f (λx. g x) ? we recheck the whole definition and do not take into consideration that most of the branches have not changed.
We should avoid redoing work.
Some ideas
Look at pre-existing systems:
Most compilers do this at a coarse-grained level. GHC has some commentary at https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/recompilation-avoidance. This is often a source-file or module level, and sometimes goes by the name of "separate compilation" (i.e. can compile each source file to a separate object file, and link afterwards; when code changes, only the corresponding objects need to be re-compiled (and maybe some reverse dependencies)).
This is the raison d'être of build systems (eg make, shake) and language servers tend to do the same thing at a finer-grained level (e.g. the work on hls/ghcide using reflex/frp instead of shake)
The two main approaches I can see for this are
caching: just memoize the typechecker. This would benefit from being keyed by some stable key which does not depend on names, so we don't get cache misses just from renaming a module etc. It would also benefit from being shared across different sessions, and perhaps persisted so we do not have to recheck everything whenever the backend gets restarted (e.g. deploying a new version). There are obviously all the normal problems of keeping caches consistent and a reasonable size.
dependency analysis: we could may attention to what an action has modified, and do some initial pass on the ASTs to tell what depends on the changed portions (i.e. what prior typechecking decisions have to be revisited). We could use this to only TC the relevant definitions. This analysis would probably benefit from itself being cached, which again has problems of consistency. This approach is roughly "maintain a build graph / dependency graph to traverse"
Currently we have a few places where we do much more typechecking than is strictly necessary. For example when we edit a type signature, we recheck the entire program (since any definition may refer to the edited one, and thus need rechecking). We could avoid much of this work if we only re-checked those which actually depended on it. A similar thing happens when editing the body of a definition: for example, changing
f (λx. g x) t
tof (λx. g x) ?
we recheck the whole definition and do not take into consideration that most of the branches have not changed.We should avoid redoing work.
Some ideas
make
,shake
) and language servers tend to do the same thing at a finer-grained level (e.g. the work onhls
/ghcide
usingreflex
/frp instead ofshake
)