beyondgrep / ack2

**ack 2 is no longer being maintained. ack 3 is the latest version.**
https://github.com/beyondgrep/ack3/
Other
1.48k stars 138 forks source link

Idea for a new filter: script #513

Closed hoelzro closed 7 years ago

hoelzro commented 9 years ago

I was out for a walk tonight, and I had a thought for a new type of filter for ack that would, at least in the interim, make a lot of users happy.

The new filter, "script", would run an external script and filter the resource if that script returns with a non-zero exit status.

Yes, this would be slow.

Yes, this probably isn't terribly Windows-friendly.

But this would enable a lot of users that have other tickets filed to get what they want without dramatically increasing the size of ack itself, or waiting on a plugin architecture that hasn't gotten off the ground yet. Also, it would enable users to extend ack without writing Perl, and perhaps encourage them to learn enough Perl to contribute if the functionality they add is well liked. Not to mention that it would make prototyping new features and trying them out a breeze.

Thoughts?

n1vux commented 9 years ago

"it would enable users to extend ack without writing Perl"

Be sure to allow an extra arg be set on type=filter definition to pass to a script to control it's operation.

"this probably isn't terribly Windows-friendly." Doesn't
my $rc = system($filter, $filename, $args) work on Windows?

We should supply a sample-stub filter-foo.pl to make it easy to build one in Perl, of course.

Be sure to be VERY explicit as to whether Non-zero selects or rejects a filename for the type.

If efficiency is any concern at all, the Xargs interface would be worth looking at. and see Xargs '0' option to handle whitespace in filenames (with find -print0) sigh

petdance commented 9 years ago

I would say in general it should match xargs as much as possible.

hoelzro commented 9 years ago

@n1vux @petdance

I was also thinking of if ack should pass any additional data to the script; for example, any extra command line arguments that ack doesn't understand.

petdance commented 9 years ago

On the other hand, we've been talking about doing a proper plugin interface forever, which would handle all of this.

hoelzro commented 9 years ago

@petdance Yes, but I think that this would be way easier to do and would be easier to prototype things in. Think of it as something to tide people over.

petdance commented 9 years ago

It might tide people over, but we'd forever be maintaining it.

n1vux commented 9 years ago

If we went down this path, for consistency with other filter types, i think we need :script: filter type name in the --type-set --type-add syntax for these. Something like --type-add=minjs:script:detect-minified,--js,{}

Supporting find -print0 | xargs -0 interface on filters would indeed complicate filter scripts in shell.

[ Note that this proposal would provide a built-in ShellShock exposure. Assuming we don't let shell re-parse args (per -0 and extra args notes), it's limited to shell-implemented filters. That's still another argument for a real plugin instead/only. ]

BUT ... is a type-add :script: even needed?
If someone needs to write a custom filter, they can do it with shell already, externally.

ack -g \\.txt | ack -xl -i Glenlivet
ack -g \\.txt | perl -nE 'say if /\b (?: 20 | 19 ) \d\d \b /x' | ack -xl -i Glenlivet
find . -name '*.txt' -print | ack -xl -i Glenlivet
find . -name '*.txt' -print0 | ack -xl -i Glenlivet  ## OOPS FAILS

(I note that a useless universal regex is required in ack -g --type=perl . and ack -g -k . to drive such filters; perhaps we should allow regexp to be optional with -g if --type or -k given, to list all such files.)

We should probably enhance -x so it accepts find -print0 | xargs -0 null-terminated input (without newlines); do we need -x0 or -x -0, or can we DWIM it ?

hoelzro commented 9 years ago

@n1vux Yeah, my mistake; that should've been colons in my example.

n1vux commented 9 years ago

( i was commenting on lack of keyword 'script', not the colons/commas )

hoelzro commented 9 years ago

Ah, right. I was missing that too. =)

hoelzro commented 9 years ago

There are definitely serious security implications here; as far as using find for custom filters at the moment, that results in some gnarly command lines, which means the user would have to deal with that, or create an alias or script to invoke them. Also, find + -x wouldn't allow a user to do something like ack --perl --myscriptything

n1vux commented 9 years ago

right, one has to re-filter find or ack -g .... hence that's

ack -g --perl . | perl myscriptything.pl | ack -x blah
ack -g --perl . | xargs myscriptything | ack -x blah
find . -type f -print0 | xargs -0 myscriptything | ack -x blah

( hence my suggestion to support -g0 and -x0 so the middle can be xargs -0 myscriptything to work with dastardly space-embedded filenames, and to make REGEX optional if -g is combined with ANY filter)

n1vux commented 9 years ago

summarizing - The Capability is worth having, but security, efficiency, and future support (post-plugins) suggest :script: might not be the way to do it.

Counter proposal -

[ Analogy, we've said pipe|ack -x is the way to match files with REGEXP prior to inspecting them, not on primary ack commandline, so even fancier filtering beyond REGEXP, requiring full scripting, really really belongs in pipe too. ]

[ Edited to add: launching user script with xargs -0 filterything gives it protection against file\ with\ spaces.txt as long as either if Perl or if shell and all "$1" uses quoted. ]