facebook / PathPicker

PathPicker accepts a wide range of input -- output from git commands, grep results, searches -- pretty much anything. After parsing the input, PathPicker presents you with a nice UI to select which files you're interested in. After that you can open them in your favorite editor or execute arbitrary commands.
https://facebook.github.io/PathPicker/
MIT License
5.11k stars 283 forks source link

Path matching with filesystem validation/checking #83

Closed jeremywiebe closed 9 years ago

jeremywiebe commented 9 years ago
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

  modified:   Hello Cordova.xcodeproj/project.pbxproj
  modified:   Hello Cordova.xcodeproj/project.xcworkspace/xcuserdata/jeremy.xcuserdatad/UserInterfaceState.xcuserstate
  modified:   Hello Cordova/config.xml
  modified:   www/cordova_plugins.js
  modified:   ../../plugins/fetch.json
  modified:   ../../plugins/ios.json

Untracked files:
  (use "git add <file>..." to include in what will be committed)
  Hello Cordova/Plugins/com.phonegap.plugins.barcodescanner/
  Hello Cordova/Resources/scannerOverlay.xib
  www/plugins/com.phonegap.plugins.barcodescanner/
  ../../plugins/com.phonegap.plugins.barcodescanner/

no changes add to commit (use "git add" and/or "git commit -a")

See also attached screenshot.

screen shot 2015-05-11 at 11 54 15 am

pcottle commented 9 years ago

Yeah, this is a pretty tough problem to solve since once you start splitting on whitespace, you'll end up matching on everything, especially when looking at code diffs where files and code are mixed together.

however I do think we could do something here behind an experimental flag that would basically be a super-matchy regex that actually checks on the filesystem level if the match is a file (and then only includes that).

so let me retitle this issue, since i think we'll move in that direction regardless

MissCrazyRarity commented 9 years ago

It has issues as do we all and I need some help any sites known that I can go to learn

stephanschubert commented 9 years ago

I moved my comment from #94 to this one to keep it all in one place.

I ran git status in my ~/git/dotfiles repository and i'm using thoughtbot's rcm tool for management which is why the files are not prefixed with a .:

On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   irbrc
    modified:   pryrc
    modified:   rspec

no changes added to commit (use "git add" and/or "git commit -a")

Only and/or in the last line is recognized - incorrectly.

pcottle commented 9 years ago

Single files not prepended by paths that don't have extensions (like irbrc) are super similar to just words. Sure we could do filesystem validation to check these cases, but it's important to note that if we want to hit matches like that, we have to either:

The latter is the better bet here. For the git status input you just mentioned, doing a basic back-of-the-envelope calc yields that we would have to do 50 filesystem checks for word-like files:

>>> input = '''
... On branch master
... Your branch is up-to-date with 'origin/master'.
... Changes not staged for commit:
...   (use "git add <file>..." to update what will be committed)
...   (use "git checkout -- <file>..." to discard changes in working directory)
... 
...     modified:   irbrc
...     modified:   pryrc
...     modified:   rspec
... 
... no changes added to commit (use "git add" and/or "git commit -a")
... '''
>>> len([word for word in input.split(' ') if len(word) > 0])
50

And that's for a 9 line input. now say if you pipe an entire git diff command into fp -- say its ~1k lines, that means we might do ~5k filesystem checks.

We'll move in this direction eventually but I do want to point out how overmatching can be dangerous

lastquestion commented 9 years ago

On the flip side, @pcottle , we can do it the reverse way: consider we only need to do "crazy stuff" if the word is basically a file in the current dir, because otherwise it must have some sort of path (whether relative or absolute).

So, instead of matching words and then doing a filesystem check, let's just grab a list of all files in the current dir FIRST, and then match against the words.

This is basically O(number of files in your local dir only), versus O(size of input).

pcottle commented 9 years ago

Yeah true, although we should have some kind of sanity check to make sure O(files in your local dir) isn't too hard.

Also git outputs a lot of commands relative to the top level working tree, so we would have to check that too.

But I do like the idea of flipping it around!

lastquestion commented 9 years ago

ls could be really slow on a mapped share or if there are tens of thousands of files in the local dir only. IMHO another interesting approach, which would be hard in the current arch: run it as a subprocess during picking, aka let's implement incremental searching. hard work. maybe another major release. prob put the ls under the same experimental flag as for the other file stuff ?

stephanschubert commented 9 years ago

I have no idea about performance patterns of git ls-files but maybe detection of .git and some clever combination of git ls-files's options could provide a middle way between "special case for git output" and "ls the whole current directory"?

pcottle commented 9 years ago

Ok guys almost everything is addressed here! Check out #114 for the implementation. I'm going to make a separate issue for the spaces and extension-less files, since that will require another regex