overlookmotel commented 2 weeks ago

Various linter rules are implemented as a Rule::run_once method which iterates over nodes or symbols.

Usually this is because they work in 2 passes. The first pass collects info from the AST/symbols and then the 2nd processes that info.

This results in nodes or symbols being unnecessarily iterated over multiple times, when run and run_on_symbol methods are already provided for doing this.

I imagine it'd be faster to:

Add a Rule::run_once_at_end method which executes after run and run_on_symbol have executed on all nodes/symbols.
Make Rule::run and Rule::run_on_symbol take a &mut self instead of &self.
Reimplement these rules:
- To store their intermediate data in self.
- Do the main loop over nodes or symbols as a run / run_on_symbol method.
- Do the 2nd pass in run_once_at_end.

It's iterating over nodes repeatedly which would likely have the largest perf impact. But there are so many rules which iterate over symbols in run_once that it might also be having a noticeable impact in aggregate.

Which rules?

Rules which use run_once in this way are:

Iterating over `nodes`

eslint/func_names
eslint/no_this_before_super
import/no_named_as_default_member
jest/no_large_snapshots
jest/prefer_hooks_in_order
jsdoc/require_returns
tree_shaking/no_side_effects_in_initialization (which does a partial tree traversal)

Iterating over `symbols`

eslint/no_shadow_restricted_name
jest/consistent_test_it
jest/expect_expect
jest/max_expects
jest/max_nested_describes
jest/no_alias_methods
jest/no_conditional_expect
jest/no_confusing_set_timeout
jest/no_disabled_tests
jest/no_done_callback
Loads more jest rules + vitest rules via collect_possible_jest_call_node

@DonIsaac What do you think?

DonIsaac commented 2 weeks ago

I like where this is going. I need to think on this for a bit before I give an answer.

Boshen commented 2 weeks ago

This is not a priority, none of the rules are enabled by default.

overlookmotel commented 2 weeks ago

5201 was an example of fixing a rule which was unnecessarily iterating over symbols.

Boshen commented 1 week ago

cc @mysteryven if you're intrested.

shulaoda commented 2 days ago

Is there any progress? I think this is a good idea. 🤔

Boshen commented 2 days ago

Is there any progress? I think this is a good idea. 🤔

I marked this as good first issue since no one is working on this 😅

shulaoda commented 2 days ago

Let me implement it. 👀

overlookmotel commented 2 days ago

Thank you @shulaoda! That'd be great.

The ones which iterate over nodes should be first priority, as they'll be the most expensive.

Please keep PRs small so we can review them more easily. 1 PR for the changes to Rule (run_once_at_end etc). And then please update each rule in a separate PR.

Updating Rule methods to use a mutable &mut self may be more complicated than it sounds, because need a nice API for setting up internal state. Feel free to ask for advice if you run into any snags.

Boshen commented 2 days ago

Make Rule::run and Rule::run_on_symbol take a &mut self instead of &self.

This is not going to work. The rules are singletons, immutable by design and cannot contain state. They are shared across threads for all files run in parallel.

If we really need state, we need to add the states to context, something like FxHashMap<Rule, Box<dyn RuleState>>

overlookmotel commented 1 day ago

Ah I didn't realize that rules were shared across threads.

As far as I understand, each rule gets its own LinterCtx object. Are they also shared across threads?

oxc-project / oxc

Avoid multiple iterations over `nodes` and `symbols` in linter #5204

Which rules?

Iterating over `nodes`

Iterating over `symbols`

5201 was an example of fixing a rule which was unnecessarily iterating over symbols.

oxc-project / oxc

Avoid multiple iterations over `nodes` and `symbols` in linter #5204

Which rules?

Iterating over nodes

Iterating over symbols

5201 was an example of fixing a rule which was unnecessarily iterating over symbols.

Iterating over `nodes`

Iterating over `symbols`