rubberduck-vba / Rubberduck

Every programmer needs a rubberduck. COM add-in for the VBA & VB6 IDE (VBE).
https://rubberduckvba.com
GNU General Public License v3.0
1.91k stars 299 forks source link

Faster Code Explorer in bigger VBA projects #5029

Open ghost opened 5 years ago

ghost commented 5 years ago

Justification I use Rubberduck in a bigger ms access project and each refresh takes above 1 minute.

Description It would be great if the refresh period could be shorter.

Additional context I am currently working in Office 2010 environment with i7-5600U @ 2.60GHz and 12 GB RAM.

Short Project Overview

retailcoder commented 5 years ago

I'd be curious to see the trace-level logs for the initial parse and a subsequent refresh.

Some code constructs are inherently ambiguous and break the parser's SLL prediction mode; when that happens the entire module is reprocessed using the slower but more accurate LL prediction mode, and such occurrences would be in the trace-level logs.

For example, avoiding foo!bar bang notation can go a long way towards improving parser performance across 350 modules, among other things. Looking at the trace-level logs would shed a bit of light on what's going on. In any case, we don't parse everything every time you refresh; we cache a lot of metadata, and only reprocess what needs to be reprocessed: parse trees for modules that haven't been modified are reused.

Also if a module is modified, other modules that reference it will be re-resolved to ensure the metadata cache remains consistent: refreshing should be less expensive when only a single module was modified since the previous parse, and if few modules reference that modified module.

Feel free to review the parsing process, we're very open to ideas!

ghost commented 5 years ago

You're right...sorry, I forgot to set to trace level. RubberduckLog.txt ...that's a big one. :)

The steps I took for this log:

  1. I opened the file and the VBA Editor without doing any changes.
  2. I initially refreshed the Code Explorer
  3. I refreshed again without any changes (in Code Inspector)
  4. I added '@Folder(Core) marks to some modules
  5. I refreshed again (in Code Inspector)

What I can see when I refresh the Code Inspector also the Code Explorer and Test Explorer will be refreshed. Does it just look like that in the GUI or is it the real case? If so, it probably would be an option to separate the "refresh" procedure for each feature...if there are no important relations.

Thanks for the hint to avoid the ! notation. I'm going to try this. I also will take a deeper look into the parsing process.

retailcoder commented 5 years ago

Most toolwindows will react to a change in the parser state; if a reparse is initiated from anywhere in the IDE, every toolwindow gets a message saying essentially so, meaning "this content is about to be invalidated". They're all using the same data. Inspections can't run off stale parse trees, the entire run would be useless - results that have been addressed since the last parse, results pointing to places that have moved down 3 lines since the last parse, etc. So whenever a fresh new parse completes, all the toolwindows receive a message saying "fresh data inbound!" and proceed to either just repaint themselves or sort out what's new and what's unchanged and what moved where: that way the state of the UI is always consistent with the parser state. The refresh command is the same wherever you click it from, ...except on the inspection results toolwindow - normally, that toolwindow only enters "busy" state (spinning duckies) after parsing successfully completes and the inspector begins firing every inspection (asynchronously) against it; you can configure Rubberduck to not do this if you want though - so that refresh button needed special attention - it sets the "busy" state immediately, so it visually looks like something is going on even though inspections aren't able to run yet; when the parse completes and everything is resolved, the inspector does its thing and eventually issues the bag of IInspectionResult objects that the toolwindow uses to render itself. If there were no code changes anywhere, then nothing got reparsed, but inspections ran - and that alone can indeed take a rather long while, and if there are thousands of results them displaying them can take .NET/WPF quite a while too - that's why every inspection is configurable: turn off the ones you don't want/need, so they don't even get to run. We try our best to make all inspections as efficient as possible, but some of them are inherently expensive, and a lot of the code-path-analysis ideas we have for the next phase of inspections will not be cheap either: most other addins offer inspections that don't (can't) go well beyond the "grammatical" aspect of things: it's one thing to say "Option Explicit is missing", it's another thing to say "this condition will always evaluate to False so this whole block is semantically unreacheable"...

That said, the inspector is inefficient: it's consistently scrapping all inspection results from the previous refresh, every time. This goes back to RD doing just a handful of "grammatical" inspections, and didn't scale very well to the 80+ inspections we have today. The reason it's doing that is because each result is tied to either a Declaration, or an IdentifierReference - the very things that the resolver produces out of the parse trees, that are stale as soon as a single character changes in a code pane: if parser state thinks variable foo is declared in line 12, but now it's one line 14, telling Rubberduck to remove that unused variable at line 12 could have disastrous consequences for the user code (although ctrl+z twice should undo any bad changes, assuming one single affected module). What we need is a way to make inspection results be able to be given an updated context after refreshing - but in order to do this we would need the inspection results closer to the parsing process, part of a module's metadata/cache; having a way to annotate parse tree nodes and say "this DimStmtContext node is declaring 3 variables, that's a problem", so the inspector could just grab the old/unchanged results right there in the parse trees instead of re-issuing them all.

I'll take a look at the log now =)