benjamn / recast

JavaScript syntax tree transformer, nondestructive pretty-printer, and automatic source map generator
MIT License
4.91k stars 347 forks source link

perf: avoid excessive linear scanning for large functions #1399

Open JoostK opened 2 months ago

JoostK commented 2 months ago

This commit restores the token indices for a node after they have been refined to span the node, instead of before the refinement. This avoids the potential for excessive searches for token indices, which is implemented using a linear scan.

It was attempted to convert the linear scans into binary searches (which an accompanying comment on findTokenRange indicates shouldn't be needed) and that also avoids the problem, but using the refined positions as implemented in this commit is even more effective, to the point where using a binary search is less efficient in my testing.

This change can achieve a substantial improvement, where an ~2.6MB file used to be parsed and cloned in ~120s, now reduced to ~1.5s, almost two orders of magnitude faster.

The perf test that tests the backbone.js file shows that performance improves from ~60ms to ~53ms, only a slight improvement.


The offending file I used for testing was ElkJS bundled into a single file: elkjs.js.zip. It used to take ~120s on a MacBook Pro M1 Max to clone the parsed tree, which has now been reduced to 1.2s for a 100x improvement.