beyondgrep / website

The source code for the beyondgrep.com website
https://beyondgrep.com/

Add examples to the ack cookbook #103

Open petdance opened 7 years ago

petdance commented 7 years ago

ack 3, currently in development, will include a cookbook of examples for folks to try. Here's an example:

Find "goats" in every file that also contains "cows"

ack -l cows | ack -x goats

What examples do you have that can get added to the cookbook? Post them here.

petdance commented 7 years ago

From @n1vux: document the use of lookbehind and other useful Perl extended RE idioms in a separate doc section (we have a few bits now, but they are scattered).

petdance commented 7 years ago

The example I always use in presentations:

Find all the headers used in your C programs and dedupe them:

ack '#include\s+<(.+)>' --cc --output='$1' | sort -u
petdance commented 7 years ago

Pull out examples from the main ::Manual, especially the "TIPS" section.

petdance commented 7 years ago

Searching for a method call:

ack -- '->method'

or

ack '[-]>method'
petdance commented 7 years ago

Search for all the files that mention Sys::Hostname but don't match hostname. Add -I because I have --smart-case on in my .ackrc.

ack -l Sys::Hostname | ack -x -L -I hostname

Related: Find files that are using both use base and use parent.

ack 'use base' -l | ack 'use parent' -x
n1vux commented 7 years ago

via @clmagic:

egrep -- "\t-\t-\t-\t-\t" entries.txt | sort -k3V

Get the entries with 4+ null fields and sort the entries by IPv4 (-V) in the 3rd column. (Technically it's 4+ adjacent null fields.)

petdance commented 7 years ago

Mention dirty and/or git diff --name-only.

hoelzro commented 7 years ago

Open list of matching files in Vim, searching for your search term:

$ ack my_search_term
<results>
$ vim $(!! -l) +/!$

Small caveat: Vim patterns and Perl regexes have some overlap, but they are different, so this doesn't work so great when you have a more complex regex as your search term.

petdance commented 7 years ago

See all the .vim files in your ~/.vim directory, except in the bundle/ directory.

ack -f --vim ~/.vim --ignore-dir=bundle
petdance commented 7 years ago

ack -g tax --ruby

Finds all the Ruby files whose names match /tax/.

vim $(ack -l taxes)

Opens all the files that have taxes in them.

ack -f --perltest | xargs prove

Finds all the Perl test files and feeds them to xargs, which runs them all through the prove command.

petdance commented 7 years ago

We could have an entire section just about ack -f and ack -g.

petdance commented 7 years ago

Find places where two methods with the same name are being called on the same line.

ack -- '->(\w+).*->\1\b'

petdance commented 7 years ago

Find all the places where I'm making a call like sort { lc $a cmp lc $b }. It gets false positives, but it's pretty useful.

ack '\bsort\b.+(\w+).+\bcmp\b.+\b(\1)\b'
petdance commented 7 years ago

Show a range of lines in a file: ack --lines=830-850 filename. Add -H to show what the line numbers are.

petdance commented 7 years ago

Find modules that import Test::Warn but are not actually using it:

ack -L 'warnings?_' $(ack -l Test::Warn)
petdance commented 7 years ago

A corollary to ack -l cows | ack -x goats is ack -L cows | ack -x goats: goats in files that do not contain cows.

teika-kazura commented 7 years ago

Hi. I save all the outputs of ack, together with pwd and the options, using a wrapper script. When "ack" is run without any argument, these logs are shown by the pager, from the latest first.

I put the following code in my .bashrc:

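function ack(){
    # Directory where the output of every ack run is saved.
    local ackLogDir=/tmp/mylogs/Ack
    mkdir -p "$ackLogDir"
    chmod 777 "$ackLogDir" &> /dev/null
    # With no arguments, page through the saved logs, newest first.
    if [[ $# == 0 ]]; then
        find "$ackLogDir" -type f | xargs -r ls -t | xargs -r less
        return
    fi
    local f
    f="$( mktemp --tmpdir="$ackLogDir" )"
    # Record the working directory and the command line ahead of the results.
    echo "# Pwd: $(pwd)" > "$f"
    echo "# ack $*" >> "$f"
    command ack "$@" >> "$f" 2>&1
    less "$f"
}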

(Instead of PAGER, less is hardcoded.) If ack itself supplied this functionality, together with some refinement, it could be much better. (So that might be better than putting this in the cookbook.)

Thanks a lot for developing ack for so long. Regards.

petdance commented 7 years ago

Find all the subroutines in Perl tests and then give a count of how many of each there are:

ack '^sub (\w+)' --perltest --output='$1' -h --nogroup | sort | uniq -c  | sort -n
petdance commented 7 years ago

Summarize the filetypes in your project

$ ack --noenv --show-type -f | perl -MData::Dumper -naE'++$n{$F[-1]}; END {print Dumper \%n}'
$VAR1 = {
          'xml' => 32,
          'sql' => 2,
          'shell' => 4,
          'php,shell' => 8,
          'yaml' => 1809,
          'php' => 7122,
          'css' => 360,
          'markdown' => 7,
          'html' => 7,
          '=>' => 1180,
          'json' => 69,
          'js' => 582
        };
petdance commented 7 years ago

Search all the files that don't have foo and show their usage of bar. Think of a better example.

ack -L foo | ack -x -w bar
petdance commented 7 years ago

Find the most-used modules in your codebase.

ack '^use ([\w+:]+)' --output='$1' -h --nogroup | sort | uniq -c  | sort -n
n1vux commented 7 years ago

We could have an entire section just about ack -f and ack -g.

We could. I've tried organizing a little more towards how the user thinks ....

n1vux commented 7 years ago

All the above have been added to my checkout; I'll proofread this weekend. In the meantime I'll put my WIP on my github.com/n1vux/ack3.git fork, branch 26_cookbook (diff).

The current outline is below
(rendered as an outline via perldoc -o markdown lib/App/Ack/Docs/Cookbook.pm | ack -h '^#'; perhaps that's a cookbook recipe too?).

Feedback on organization welcome. I haven't sorted the middle big 'section' topically yet ...

COOKBOOK

COMPOUND QUERIES

Find "goat(s)" or "cow(s)" or both

Find "goats" in every file that also contains "cows"

find goats in files that do not contain cows

Find "goats" in every farmish file

Find "goats" and "cows" in the same line, either order, as words

Search for all the files that mention Sys::Hostname but don't match hostname.

USING ACK EFFECTIVELY

Use the .ackrc file.

Use -f for working with big codesets

Use -Q when in doubt about metacharacters

Use ack to watch log files

use ack instead of find

Searching for a method call

use -w only for words

See all the .vim files in your hidden ~/.vim directory, except in the bundle/ directory.

Finds all the ruby files that match /tax/.

Find all the Perl test files and test them

Find places where two methods with the same name are being called on the same line.

Find all the places in code there's a call like sort { lc $a cmp lc $b }.

Show a range of lines in a file

Find modules that are importing but not actually using Test::Warn

TBD these are notes to be fleshed out TBD

TBD Mention dirty and/or git diff --name-only.

TBD Search all the files that don't have foo and show their usage of bar. Think of a better example.

EXAMPLES OF --output

Find all the headers used in your C programs and dedupe them

Find the most-used modules in your codebase.

Find all the subroutines in Perl tests and then give a count of how many of each there are

VERY ELEGANT ACK

Open list of matching files in Vim, searching for your search term

Extending ack your way

find log lines with 4 nulls and sort by IP address

Summarize the filetypes in your project

Fowler's Folly

KWIC: KeyWord in Context index

TBD lookahead and lookbehind

petdance commented 7 years ago

Extract part of a line from a logfile

ack '>>ip(\S+).+rq"/help' --output='$1' -h
petdance commented 7 years ago

From https://stackoverflow.com/questions/45538755/bash-text-extracting

I have this very long line of info in 1 file, and i want to extract it, so that the end result look like this output.txt ? Please help !

Input.txt

{"city":"london","first name":"peter","last name":"hansen","age":"40"},
{"city":"new york","first name":"celine","last name":"parker","age":"36"]

Output.txt

peter (40) celine (36)

The sneaky and potentially unsafe way to do it is:

ack '"first name":"([^"]+)".+"age":"(\d+)"' input.txt --output='$1 $2' 
n1vux commented 7 years ago

bill-n1vux #ack how is that "potentially unsafe" ?

andylester Inaccurate. Might not always work. Unsafe is the wrong word. Arguments might not be in that order.

bill-n1vux oh right, JSON like Perl has no guarantee of sane ordering of hash elements

I'm sure it's doable correctly with either-or and context ...

echo '[{"city":"london","first name":"peter","last name":"hansen","age":"40"},   {"city":"new york","last name":"parker","age":"36","first name":"celine"}]' \
| ack --output '$1$4($2$3)' '{.*?"first name":"([^"]*)".*?age":"(\d+)|{.*?"age":"(\d+)".*?first name":"([^"]*?)"'
peter(40)
celine(36)

(This doesn't scale well to 3! or more possible field orders to extract. At that point plain Perl is necessary, either with any real JSON module or by cheating: converting any escaped quotes to Perl's and eval-ing the text into an array ref of hash refs.)

n1vux commented 7 years ago

Add Elegant nearly- and not-ugly-and- exact solutions to https://github.com/beyondgrep/ack2/pull/646 that require neither hypothetical, \n as OR nor --fgrep-f .

Note: glark has greppish -f so is a partial alternative for this usecase (with <(process substitution) but glark doesn't have --passthrough so still not a full solution)

petdance commented 7 years ago

We need lots of -f and -g examples. Also, interesting that ag does not have -f. If you want ack -f with ag you have to do ag -g ""

petdance commented 7 years ago

Inventory all PHP sqldo functions

ack 'sqldo_\w+' --php -o -h | sort -u
petdance commented 7 years ago

Inventory of methods called on users:

ack 'user->(\w+)' --output='$1' -h | sort | uniq -c
n1vux commented 7 years ago

from ack-users

I am trying to get the words in file1 that are not in file2, saving the result in file3. I need an equivalent awk command to the following grep:

grep -F -x -v -f file1 file2 > file3

grep takes time and is being killed because file2 is about 40000 long, and file1 is about 25000.

In comm that's the -23 option. Column 1 is words only in file 1. -23 means minus 2 and 3: omit columns 2 (words only in file 2) and 3 (words in both files).

The other key to comm is that the files must be sorted the same way, in the order sort produces by default. So the shell command or alias needed is comm -23 <(sort $file1) <(sort $file2), using modern bash <() process substitution to present the sorted output as files.
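For shells without <(), the same comparison can be sketched with temporary files. The function name only_in_first and the sample word lists are made up for illustration.

```shell
# Print the lines that appear in the first file but not in the second.
# Plain-sh variant of: comm -23 <(sort "$1") <(sort "$2")
only_in_first() {
    tmpdir=$(mktemp -d) || return 1
    # comm needs both inputs sorted the same way, so sort copies first.
    sort "$1" > "$tmpdir/a"
    sort "$2" > "$tmpdir/b"
    # -2 drops lines unique to the second file, -3 drops lines common to both.
    comm -23 "$tmpdir/a" "$tmpdir/b"
    rm -rf "$tmpdir"
}

# Tiny demonstration with made-up word lists.
demo=$(mktemp -d)
printf '%s\n' cow goat sheep > "$demo/file1"
printf '%s\n' sheep cow > "$demo/file2"
only_in_first "$demo/file1" "$demo/file2"
```

Only goat is reported here, since cow and sheep appear in both lists.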

(I may have added this one already but cataloged here to check)

n1vux commented 7 years ago

Cheat to simulate "within 5 lines"

ack -i -C5 eliphalet | ack -i -C5 ricker | ack -i -C5 'eliphalet|ricker'

(Will get some false hits if either string appears in filenames alas)

petdance commented 7 years ago

Look for a method you're not sure of the name of.

I was looking for a method that I knew was called "something_follows", so I looked for method invocations like that: ack -- '->.+_follows\b'

petdance commented 7 years ago

So I can have a regex like 'abc\K(def)(?=ghi)'. That will highlight ONLY 'def' in the text, but only if that string is preceded by 'abc' and followed by 'ghi'.

https://news.ycombinator.com/item?id=15433310

n1vux commented 7 years ago

PR#106 has the above comments added to Cookbook.pm. Items with TBD TODO still in the file are reactioned :+1: above in #26 (this comment stream)

petdance commented 7 years ago

What if you have a prohibition in your code against variables that use camelCase? Find any strings that use it:

ack  -I '\b[a-z]+[A-Z]+[a-z]+'
petdance commented 7 years ago

Pick a random source file:

ack -f | shuf | head -n 1
petdance commented 6 years ago

I had a big log file of errors that looked like this:

    # (6819:1) Warning: <td> attribute "bgcolor" had invalid value "D4ED91" and has been replaced
    # (6825:1) Warning: <td> attribute "bgcolor" had invalid value "C6E2FF" and has been replaced
    # (6828:1) Warning: <td> attribute "bgcolor" had invalid value "D4ED91" and has been replaced
    # (6834:1) Warning: <td> attribute "bgcolor" had invalid value "C6E2FF" and has been replaced

To summarize the invalid values:

ack 'invalid value "(\w+)"' --output='$1' smoke.log | sort | uniq -c
n1vux commented 6 years ago

What states have both a Shelbyville and a Springfield?

wget https://www2.census.gov/geo/docs/reference/codes/files/national_places.txt
comm <(ack springfield national_places.txt | cut -d'|' -f1 | sort | uniq) <(ack shelbyville national_places.txt | cut -d'|' -f1 | sort | uniq)

(doc for data: https://www.census.gov/geo/reference/codes/place.html )

(Springfield is not the most common. It's seventh.)

perl -lan -F'\|' -Mstrict -Mwarnings -E 'our %Count; my ($st, $stfips, $plfips, $name, $type, $funcstat, $county)=@F; next unless $funcstat eq q(A); $name =~ s/ \s [a-z]+$ //x; $Count{$name}++;' -E 'BEGIN { our %Count; }' -E ' END {our %Count; say qq($Count{$_}\t$_) for sort keys %Count;} ' national_places.txt | sort -nr | head

n1vux commented 6 years ago

stdbuf(1G) https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html as described at https://blog.plover.com/Unix/stdio-buffering.html (MJD) could be useful in an ack (or grep) pipeline.

edited: Nope. Not likely useful... ack uses sysread which I think means the C FILE streams are bypassed, same as cat does? I'm skipping this one.

n1vux commented 6 years ago

UTF-16/UCS-2 ugly workaround could be a whole Cookbook section ... it's too big and too rare to be a FAQ, it's not asked that frequently.

See also #152 #153 https://groups.google.com/forum/#!topic/ack-users/qidCgv3S5Uo and Ack2-484 https://github.com/beyondgrep/ack2/issues/484#issuecomment-126477173 (different workaround)

petdance commented 6 years ago
$ ack something
... # eyeball that it's what I'm looking for
$ vim $(!! -l)
n1vux commented 6 years ago

I have tentatively assigned remaining comments above to the existing sections, and have a plan for exposition. TBD marks existing placeholders, also listed

USING ACK EFFECTIVELY

EXAMPLES OF C<< --output >>

VERY ELEGANT ACK

WHEN TO DO SOMETHING ELSE

INFREQUENTLY Asked Questions (new section)

petdance commented 5 years ago

ack '(.+),(.+)' --output='mv "$1" "$2"' filemappings.csv | sh -x

n1vux commented 5 years ago

from Yagamy Light on Ack-Users - replacement for find -exec sed

ack -l --print0 pattern | xargs -r0 perl -i -pe 's/pattern/replacement/g'

Note that xargs -r avoids running the command at all when no filenames are provided (otherwise it would run with no file arguments and read stdin instead); it is almost always a good idea to use it.
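The effect of -r is easy to see with an empty input. This sketch assumes GNU xargs, where -r is short for --no-run-if-empty; echo stands in for the perl -i command.

```shell
# Without -r, GNU xargs runs the command once even when its input is empty.
runs_without_r=$(printf '' | xargs echo ran | wc -l)

# With -r, nothing runs at all.
runs_with_r=$(printf '' | xargs -r echo ran | wc -l)

echo "without -r: $runs_without_r run(s); with -r: $runs_with_r run(s)"
```

In the search-and-replace pipeline, that single stray run would be perl -i started with no filenames.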

petdance commented 5 years ago

Add my .ackrc files to the Ack Cookbook as examples. Explain why.

n1vux commented 5 years ago

To see a pattern in context of subroutine / method it's in

xxx='^(package|sub|event|before|after|around|BUILD|DEMOLISH|DESTROY)\b'
pat='get_ready|do_it' ; ack --perl "$pat" -l | ack -x "$xxx|$pat"

To easily reuse a command from history when ^W or double-click-and-overtype would wipe out too much context, not just the keyword you want to retype, use the (?x-: ) extended-syntax option in Perl regexes to make spaces not count, so you can separate the keyword from adjacent regex features.

ack --perl '(?x-: \b Keyword \s* => )'

n1vux commented 4 years ago

(from above xref) To have filenames as headings but no line numbers (until there's a --[no-]linenumbers option), piping to cut -d: doesn't work since that inlines the filenames, but putting the cut into the ack pager option avoids interlining the filename:

ack get_file_id --pager='cut -d: -f2|less -iR'

Not intuitive but it works!

n1vux commented 4 years ago

Add to the remove-duplicates recipe (if it already exists; it should be demonstrating --count, -1, and --max-count=1) the edge-case workarounds:

ack -h [-o|--output] thing | sort | uniq [-c]

And there's also ack { -f | -g } $where --print0 | xargs -0 cat | ack -c thing -

None of these will work as a real-time filter for tailing live logfiles, alas; if that is what is needed, it may be better to implement that directly as a post-filter:

alias uniq_filter="perl -nlE 'say unless \$seen{\$_}++;'"
tail --follow=name --retry "${logfile}" | ack -h "${badness}" | uniq_filter

n1vux commented 4 years ago

colorize matches while filtering with -v

https://github.com/beyondgrep/ack3/issues/315