jfmengels / elm-review

Analyzes Elm projects, to help find mistakes before your users find them.
https://package.elm-lang.org/packages/jfmengels/elm-review/latest/
Other
252 stars 13 forks source link

Change the visit order: Visit one file with multiple rules #153

Closed jfmengels closed 1 year ago

jfmengels commented 1 year ago

This trying a rewrite of elm-review's internals. The idea is to change how rules and modules are visited.

Before, we (very approximatively) did a errors = List.concatMap (runReview project.modules) allRules, meaning we ran one rule on the whole project, then the next one, and so on.

With this change (which I'm hoping would be more efficient, but requires polishing and benchmarking), is to instead do errors = List.concatMap (\module_ -> runReview module_ allRules) modules), meaning we analyze one module with all the rules, then we analyze the next module, and so on.

I'm thinking that that would reduce the overall number of operations. For instance, we wouldn't need to compute how to traverse the AST (e.g. "what are the children of this expression?") once for each rule, but instead only once. This was inspired by the performance benchmarks of the new Python linter named Ruff, which chose this visit order and that seems to work very well.

I also think it could help simplify the computations that the framework does, like the ModuleNameLookupTable (and one day type inference, and maybe even more), as these could be computed once as long as one rule needs it, instead of the current approach which is more like: checking whether the rule needs it (if not skip it), checking if it's already in some cross-rule cache (if so use it) , computing it and storing it in the cross-rule cache. I'm still unsure as to whether this will change anything.

There are also of course tradeoffs, for instance rules like NoUnused.Variables will trigger a lot of fixes, and for each fix we will run all rules, not only NoUnused.Variables, which might be more wasteful.

It does get rid of the need to sort rules by likelyhood of fixing things, and doesn't require a rule to say it will fix things. I'm not using that information though, so that's for later.

Also, the implementation was daunting because I needed to store multiple rules in a List, and they all have different types for their context. I had solved the problem already before (using the ruleImplementation that recreated a Rule without type variable), but for this change I needed to do the same at a lot lower level. I used the technique now known as "Jeremy's interfaces" (which is in essence the same approach I was using before, but more ). It's a bit hard to wrap your head around it, but I think I'm mostly there. There are more hurdles to overcome, but so far it looks somewhat promising. Thanks to Jeremy, and to @pithub for his very helpful thread on the subject.