Genivia / ugrep

NEW ugrep 6.5: a more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
https://ugrep.com
BSD 3-Clause "New" or "Revised" License

`ugrep` does not handle `procfs` pseudofiles correctly #350

Closed MrDrMcCoy closed 6 months ago

MrDrMcCoy commented 7 months ago

When I want to see mount options for a filesystem, I typically query /proc/mounts like so:

grep btrfs /proc/mounts

This does not work with ugrep, which returns nothing and exits with a status of 1 for the same query:

ugrep btrfs /proc/mounts

However, it returns the results as expected when cat abuse is introduced:

cat /proc/mounts | ugrep btrfs

Shell redirection also yields results as expected:

ugrep btrfs < /proc/mounts
genivia-inc commented 7 months ago

Two things to consider:

  1. ugrep does not read special devices by default, unlike grep. This is stated on the ugrep.com site and in the README. The reason is that devices may cause ugrep to hang, which is not acceptable in the TUI because it becomes unresponsive. Use option -Dread to read devices, as grep does by default. GNU grep hangs on named pipes that are not in use, which is super annoying (some people apparently figured this out only after much frustration).
  2. For recursively searching and descending into directories with special devices, see the older post #193, in which approaches and solutions were discussed. It is very important that ugrep never hangs, so a file handle is set to non-blocking; this is safe because nonblocking is ignored for regular files (ugrep.cpp:3575). Standard input from a redirect or pipe, by contrast, is blocking unless it is a CHR or FIFO device (ugrep.cpp:3991). This does not mean that the approach is perfect.
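
To illustrate the hang described above and why a non-blocking handle avoids it, here is a minimal Python sketch. It is not ugrep's code: the helper name `read_without_hanging` is hypothetical, and it assumes a POSIX system where `os.mkfifo` is available. A plain blocking open of a named pipe with no writer would wait indefinitely, while the non-blocking open returns immediately.

```python
import os
import tempfile

def read_without_hanging(path, max_bytes=4096):
    """Open path non-blocking and return whatever can be read right now.

    Hypothetical helper for illustration only. For regular files the
    O_NONBLOCK flag has no effect, so they are read normally.
    """
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        try:
            return os.read(fd, max_bytes)
        except BlockingIOError:
            # A writer has the pipe open but has not written anything yet.
            return b""
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as tmp:
    fifo = os.path.join(tmp, "unused.fifo")
    os.mkfifo(fifo)                    # a named pipe that nobody writes to
    data = read_without_hanging(fifo)  # returns instead of waiting for a writer
    print(repr(data))
```

The same helper reads a regular file as usual, which is the property the comment above relies on when it sets every handle to non-blocking.
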
genivia-inc commented 7 months ago

There are different approaches possible, each with some caveats:

  1. skip empty regular files (stat returns zero size), even when they are special (like /proc/mounts)
  2. skip empty regular files, except when -Dread is given to read special files (like /proc/mounts)
  3. do not skip empty files in non-recursive searches, but skip empty regular files and devices in recursive searches, unless -Dread is given to read special ones
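
The "empty" test in these approaches refers to the size reported by stat, which is where procfs files are misleading. The following Python sketch (the helper name `size_vs_content` is hypothetical, and the `/proc/mounts` part assumes a Linux system) compares the size stat reports with the number of bytes actually read:

```python
import os

def size_vs_content(path):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual

# Only meaningful on systems that provide procfs, such as Linux.
if os.path.exists("/proc/mounts"):
    reported, actual = size_vs_content("/proc/mounts")
    print(f"stat size: {reported}, bytes read: {actual}")
```

On Linux, procfs files typically report a stat size of zero even though reading them yields content, so a tool that skips files with zero size never looks inside them.
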
genivia-inc commented 7 months ago

The following table clarifies when a search is performed. Note that GNU/BSD grep have changed over the years in how they deal with special files, e.g. opening them as nonblocking when recursing.

Whether a file is searched depends on Recurse, -Dread, Regular File, and whether the regular file is Empty. For ugrep 4.5 we have:

[table image]

Note that Recurse means option -r is used or no targets are specified. Searching a dir given as a target is not recursing.

An alternative approach is to perform a search for scenario N N Y Y with the possibility of getting blocked, so the recursive case Y N Y Y opens the file nonblocking. Likewise for N Y Y Y and Y Y Y Y.

The first two red N entries in the table describe your case: the first is without -Dread, which ugrep does not set by default but GNU grep does, and the second is with -Dread. The second red N should really be a Y, so this should be changed in the next release.

Perhaps the Alternative in this table is preferable:

[table image]

When recursing we never block. When not recursing we do not block on empty files (procfs files) unless -Dread is specified to read them. With these parameters, /proc/mounts is always searched, with or without -Dread.
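
The blocking rule in the paragraph above can be written as a small predicate. This Python sketch only encodes the prose, not the table image or ugrep's source. The function name `opens_nonblocking` is hypothetical. Treating non-empty files in non-recursive searches as blocking is an assumption, since the paragraph does not mention that case.

```python
def opens_nonblocking(recurse: bool, dread: bool, empty: bool) -> bool:
    """Return True if the file would be opened non-blocking.

    recurse: -r is given or no targets are specified
    dread:   option -Dread is given
    empty:   stat reports zero size (as procfs files do)
    """
    if recurse:
        # "When recursing we never block."
        return True
    # Not recursing: empty files are opened non-blocking unless -Dread is set.
    # Assumption: non-empty files are opened blocking in this case.
    return empty and not dread

# The case from this issue: `ugrep btrfs /proc/mounts` without -Dread.
print(opens_nonblocking(recurse=False, dread=False, empty=True))
```
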

genivia-inc commented 6 months ago

Finishing up work on release 5 to include the proc FS handling changes in the Alternative table above, pending testing.

MrDrMcCoy commented 6 months ago

Thanks so much for this work!