Closed rafeca closed 5 years ago
Small update: I just realized that when using ripgrep
we don't need to initialize the Git
native code (see commit) 😃
This is quite slow (specially for large repos) so it gives us even greater performance improvements (we can reduce the times on large repos by another 50% 🚀). I'm gonna update the results table with the new data
I think it's good to share a couple of screencasts to compare the previous experience on very large repos (gecko-dev
) with the one after this PR:
What happens now in master after opening Atom (or focusing Atom back):
The same flow with this PR:
Currently, if Atom gets opened from a non-git folder which has git repositories in some sub-folders, the .gitignore files from the sub-folders are ignored. With ripgrep they are taken into account. I think that this is an improvement.
I consider this a huge improvement. 😍
I think it's good to share a couple of screencasts to compare the previous experience on very large repos (gecko-dev) with the one after this PR...
Wow! 🏎💨
Next steps...
Thanks for listing these out. This seems promising and very much worth pursuing. :zap:
The testing so far seems to be going well:
ripgrep
, this way we have good coverage for ripgrep
.A couple of drawbacks that I've discovered today:
ripgrep
binary is quite big (~5MB for OSX, a little bit less for Windows). Once it gets gzipped
its size gets reduced to ~1.8MB, which means that the install files of Atom will grow by ~1.25% (based on the current size of ~144MB).vscode-ripgrep
downloads the needed binary from GitHub during installation (code). This causes failures on CI systems due to rate limits. I've fixed that for now by putting setting a GitHub token of mine as an env var on the travis config for fuzzy-finder
. We should either do the same in atom/atom
or stop using vscode-ripgrep
and build (and maintain) our own package.@jasonrudolph any thoughts about these drawbacks?
Nice work, @rafeca!
The
ripgrep
binary is quite big (~5MB for OSX, a little bit less for Windows). Once it getsgzipped
its size gets reduced to ~1.8MB, which means that the install files of Atom will grow by ~1.25% (based on the current size of ~144MB).
This seems like a worthwhile tradeoff to me. We're trading a relatively tiny bit of additional disk space and bandwidth, and it turn we spend less time waiting for fuzzy-finder results. 👍
I've fixed that for now by putting setting a GitHub token of mine as an env var on the travis config for fuzzy-finder. We should either do the same in atom/atom or stop using vscode-ripgrep and build (and maintain) our own package.
We can generate a scopeless API token for our build account and use that token for atom/fuzzy-finder builds and atom/atom builds. Let's coordinate via Slack to make this happen.
Awesome! I agree it's a worthwhile tradeoff 😃
I've been able to verify that Windows builds are packaged correctly with ripgrep
, which was the last item from my list of things to consider... So this PR is ready for review 😄
Oh there's something else, this feature is behind a config param which is off by default. Since it's a fuzzy-finder config param, the setting is quite hidden and really hard to discover (you have to go to settings -> packages -> search fuzzy finder and click on its settings.
This is how this setting looks like:
I have mixed feelings about this:
Couple of things that we could do:
Another time we implemented an experimental setting was when we started using fuzzaldrin-plus
as an alternative to fuzzaldrin
. The setting was off by default - you'll actually notice it right above your new setting in the last screenshot you posted.
Whenever people would open issues noting odd fuzzy finding behavior, we would ask them to try turning on the setting and reporting back. I think we can do the same thing here - keep the setting off by default, but advertise that the setting exists and will become the default in the near future (as we did for tree-sitter).
Looking back, what I think we could have done better is set a timeline for when to enable the new scoring method by default. As you can see, it's been three or four years, and I'm still not sure if it's enabled by default yet on all the packages that have it as an option. That was something we handled much better with tree-sitter.
I would opt towards making it off by default, advertising it when we can, and determining a target version to have it on by default. If by that target version it seems like it's still not ready for prime time, then we can re-evaluate.
@50Wliu that sounds good to me.
If I understand correctly our release cadence, if this PR lands now it'll get included in v1.37.0 which will be released around May. We could aim to enable it by default by v1.38.0 or v1.39.0 depending on the feedback/issues reported by users (this can be decided later though).
Oh there's something else, this feature is behind a config param which is off by default. Since it's a fuzzy-finder config param, the setting is quite hidden and really hard to discover...
Couple of things that we could do:
- ...
- ...
- Leave it as off but implement a popup that prompts users to turn it on if we detect that they open a medium or large repository. This is my favorite one, but if this takes much effort to implement then I don't think it's worth it.
I love this idea! It doesn't need to be part of this PR, but I think it's worth trying in a follow-up PR. I'd be happy to help. :smile:
For what it's worth, I think it will be fairly straightforward to implement using the notification API.
I love this idea! It doesn't need to be part of this PR, but I think it's worth trying in a follow-up PR. I'd be happy to help. 😄
For what it's worth, I think it will be fairly straightforward to implement using the notification API.
Awesome! We can implement it as a followup 😄
Read the release notes, followed the path to this PR, followed your guide where to find the new setting and hit the checkbox. I think there will be more people going down this route :)
Summary
This PR contains some minimal changes to demonstrate the usage of
ripgrep
on the fuzzy finder, so we can discuss the tradeoffs and steps needed to be able to ship this.This work comes as a follow-up from https://github.com/atom/fuzzy-finder/pull/367 after the suggestions of @smashwilson .
Benefits
Using
ripgrep
speeds up drastically the time to crawl medium and large repositories (we're taking about up to 14X faster times):(the measurement has been done the same way as in https://github.com/atom/fuzzy-finder/pull/366).
Possible Drawbacks
The crawling behaviour with
ripgrep
is slightly different than the one currently implemented. I don't think that the changes are important enough (or even bad), but it's important to list them:ripgrep
also returns symlink destination files (e.g if there's a symlink./foo.js
which points to./bar.js
, withripgrep
foo.js
can also be opened. I think this is an improvement.ripgrep
returns all results alphabetically ordered..gitignore
files from the sub-folders are ignored. Withripgrep
they are taken into account. I think that this is an improvement.Regarding 2., this change is quite noticeable, since the fuzzy finder currently displays the first 10 files returned by the crawler when it gets opened. This means that the first shown files will be slightly different than before (IMO we should change the default list of files that are shown when opening the fuzzy finder to show e.g recently opened files or currently opened files).
Alternate Designs
Other potential designs were presented here, and seems like
ripgrep
is the best tool to use at this moment.Regarding the current implementation, I've decided to use
vscode-ripgrep
, which is just a package that handles the download of theripgrep
binaries for several platforms. This has saved me a ton of time, but I'm not familiar enough with the process to install bundled extensions in Atom to know if this package is suitable for the job.Next steps
ripgrep
crawling.ripgrep
works well on different scenarios.ripgrep
binary can be used correctly from a production build (e.g does the asar packaging cause any issue?).Applicable Issues
https://github.com/atom/fuzzy-finder/issues/271