ggreer / the_silver_searcher

A code-searching tool similar to ack, but faster.
http://geoff.greer.fm/ag/
Apache License 2.0
26.1k stars 1.42k forks source link

fgrep vs ag #639

Open lmanchon opened 9 years ago

lmanchon commented 9 years ago

--Hi,

just a single test on my computer with 32 cores(Intel Xeon E5-2650 @ 2.60GHz) and 128 Gb of ram, using 8Gb text file: 8589934454 big_file.txt

time ag --nocolor --workers=1 --parallel hello big_file.txt 33554432:hello1073741824 67108864:hello2147483648 100663296:hello3221225472 134217728:hello4294967296 167772160:hello5368709120 201326592:hello6442450944 234881024:hello7516192768 268435456:hello

real 0m32.217s user 0m30.928s sys 0m1.249s

export LC_LOCAL=C time fgrep -n hello big_file.txt 33554432:hello1073741824 67108864:hello2147483648 100663296:hello3221225472 134217728:hello4294967296 167772160:hello5368709120 201326592:hello6442450944 234881024:hello7516192768 268435456:hello

real 0m9.071s user 0m7.790s sys 0m0.936s

it seems that fgrep is faster than ag !

ggreer commented 9 years ago

First, this is not the use-case ag was designed for. If you want to search a single huge file, just use grep. Many of ag's tricks (such as threads and ignore files) can't be used for a single file.

Second, I don't get anything like these results on my system. 90% of the time, people creating issues saying, "It's slower." are mistaken. Either they're not benchmarking correctly (multiple runs, hot vs cold caches, etc), or they're testing different things (counting line numbers vs not).

And finally, I can't see what the problem is if you don't give profiling info. It would really help to see output from Valgrind, gprof, instruments.app, or something similar.

gcflymoto commented 9 years ago

@ggreer I can reproduce @lmanchon experiment.. I went ahead and profiled with gprof and callgrind.

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 74.33     15.75    15.75        1    15.75    15.75  print_file_matches
 25.67     21.19     5.44        9     0.60     0.60  boyer_moore_strncasestr
  0.00     21.19     0.00       25     0.00     0.00  log_debug
  0.00     21.19     0.00       10     0.00     0.00  ag_asprintf
  0.00     21.19     0.00       10     0.00     0.00  free_strings
  0.00     21.19     0.00        7     0.00     0.00  load_ignore_patterns
  0.00     21.19     0.00        6     0.00     0.00  ag_malloc
  0.00     21.19     0.00        5     0.00     0.00  ag_strdup
  0.00     21.19     0.00        4     0.00     0.00  ag_calloc
  0.00     21.19     0.00        2     0.00     0.00  ag_realloc
  0.00     21.19     0.00        2     0.00     0.00  cleanup_ignore

And interesting callgrind output

--------------------------------------------------------------------------------
             Ir 
--------------------------------------------------------------------------------
178,420,829,018  PROGRAM TOTALS

             Ir  file:function
--------------------------------------------------------------------------------
149,786,985,268  src/print.c:print_file_matches [./the_silver_searcher.git/ag]
 28,633,115,349  src/util.c:boyer_moore_strncasestr [./the_silver_searcher.git/ag]
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
-- Auto-annotated source: src/util.c
--------------------------------------------------------------------------------
             .  /* Copy-pasted from above. Yes I know this is bad. One day I might even fix it. */
             .  const char *boyer_moore_strncasestr(const char *s, const char *find, const size_t s_len, const size_t f_len,
           144                                      const size_t alpha_skip_lookup[], const size_t *find_skip_lookup) {
           456  => ???:mcount (9x)
             .      ssize_t i;
            18      size_t pos = f_len - 1;
             .  
 3,579,139,360      while (pos < s_len) {
14,316,557,720          for (i = f_len - 1; i >= 0 && tolower(s[pos]) == find[i]; pos--, i--) {
            56  => ???:__ctype_tolower_loc (8x)
             .          }
             .          if (i < 0) {
            72              return s + pos + 1;
             .          }
10,737,418,026          pos += ag_max(alpha_skip_lookup[(unsigned char)s[pos]], find_skip_lookup[i]);

--------------------------------------------------------------------------------
-- Auto-annotated source: src/print.c
--------------------------------------------------------------------------------
<skip>

             5      context_prev_lines = ag_calloc(sizeof(char *), (opts.before + 1));
           211  => ./the_silver_searcher.git/src/util.c:ag_calloc (1x)
             .  
51,271,171,273      for (i = 0; i <= buf_len && (cur_match < matches_len || lines_since_last_match <= opts.after); i++) {
25,769,803,362          if (cur_match < matches_len && i == matches[cur_match].start) {

<skip>

25,769,803,362          if (cur_match < matches_len && i == matches[cur_match].end) {
             .              /* We found the end of a match. */
            24              cur_match++;
             8              in_a_match = FALSE;
             .          }
             .  
             .          /* We found the end of a line. */
26,306,674,274          if (buf[i] == '\n' && opts.before > 0) {
             .              if (context_prev_lines[last_prev_line] != NULL) {
             .                  free(context_prev_lines[last_prev_line]);
             .              }
             .              /* We don't want to strcpy the \n */
             .              context_prev_lines[last_prev_line] = ag_strndup(&buf[prev_line_offset], i - prev_line_offset);
             .              last_prev_line = (last_prev_line + 1) % opts.before;
             .          }
             .  
16,642,997,996          if (buf[i] == '\n' || i == buf_len) {

<skip>

   536,870,896              } else if (lines_since_last_match <= opts.after) {
             .                  /* print context after matching line */
             .                  if (opts.print_path == PATH_PRINT_EACH_LINE) {
             .                      print_path(path, ':');
             .                  }
             .                  print_line_number(line, sep);
             .  
             .                  for (j = prev_line_offset; j < i; j++) {
             .                      fputc(buf[j], out_fd);
             .                  }
             .                  fputc('\n', out_fd);
             .              }
             .  
   536,870,912              prev_line_offset = i + 1; /* skip the newline */
   268,435,456              line++;
   805,306,368              if (!in_a_match && lines_since_last_match < INT_MAX) {
   805,306,368                  lines_since_last_match++;
             .              }
             .              /* File doesn't end with a newline. Print one so the output is pretty. */
   536,870,912              if (i == buf_len && buf[i] != '\n' && !opts.search_stream) {
gcflymoto commented 9 years ago

@lmanchon By default "ag" is doing a case insensitive search and calling boyer_moore_strncasestr because the query is all lowercase. I was surprised to find that by default it will call boyer_moore_strncasestr for lowercase literal searches. @ggreer Wouldn't that make it slower in the nominal case?

    if (opts.casing == CASE_SMART) {
        opts.casing = is_lowercase(opts.query) ? CASE_INSENSITIVE : CASE_SENSITIVE;
    }

ack does not default to CASE_SMART:

http://linux.die.net/man/1/ack

--smart-case, --no-smart-case
    Ignore case in the search strings if PATTERN contains no uppercase characters. This is similar to "smartcase" in vim. This option is off by default. 

@lmanchon Can you try with CASE_SENSITIVE search and see if it makes a difference?

 ./ag --nocolor -s --workers=1 --parallel hello big_file.txt
gcflymoto commented 9 years ago

@lmanchon @ggreer using PGO (profile guided optimizations) improve the performance of ag by 33% down to ~20 seconds from ~30 seconds on this particular benchmark. fgrep is still the fastest with 5.1 seconds