Closed rafeca closed 5 years ago
I see that the test suite is passing. That's a great sign. π In addition to having a green test suite, can you describe the process you'll follow to verify that the change has not introduced any regressions? (This is something we ask contributors to provide in performance-related pull requests in atom/atom, but it doesn't look like we've incorporated that into atom/fuzzy-finder's pull request template yet.)
Thanks for the suggestion! I've just added some information on the PR description about the verification steps that I followed.
Description of the Change
Based on some profiling, the most expensive part of crawling the list of files is to check if each of the discovered files/folder are being ignored by
git
(theisIgnored()
method). This is done by calling thegit-utils
package which internally callslibgit2
.For medium to large repos (>30K files),
isIgnored()
takes ~80-90% of the crawling time π.Fortunately, the crawling logic is currently doing unnecessary checks to
isIgnored()
, which results in performing almost the double of checks than needed.By removing the additional call to
isIgnored()
we can cut by ~40% the time that takes to open the fuzzy finder for the first time on medium and large repos.This is a short table with the perf gains based on the repo size:
(In order to measure the time, I've added a
console.time()
call before creating the crawling task and aconsole.timeEnd()
on the callback of the task).The reason why this change is not impactful on small repos is because there seems to be some overhead when creating the
Task
, which dominates the time to crawl when there are not many files to check.This is some impactful low hanging fruit for https://github.com/atom/fuzzy-finder/issues/271, but there are a few other improvements that I think we can do to the fuzzy finder to make it faster.
Alternate Designs
N/A.
Benefits
Time to open the fuzzy finder after the editor launch gets reduced by ~40% on large repositories.
Verification Process
callback
to print all the project files that were crawled, stored them in disk and compared the list that gets generated with this PR against the one that gets generated in master. Verified that both lists are the same for several projects with different.gitignore
configs (Atom, Jest, Gecko-dev).pathLoaded()
method (1 and 2). Both of them pass the same variable as argument (pathToLoad
), which never gets reassigned. Before calling both callsites there's always a call toisIgnored()
with the same variable argument (pathToLoad
). This means thatisIgnored()
is called for the same exact paths than it was before (just less times).Possible Drawbacks
N/A.
Applicable Issues
https://github.com/atom/fuzzy-finder/issues/271