Genivia / ugrep

NEW ugrep 7.1: a more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
https://ugrep.com

Compressed search is not multi-threaded #403

apprehensions closed 5 months ago

apprehensions commented 5 months ago

Use-case:

ugrep -z [pattern] chromium-125.0.6422.76.tar.xz

genivia-inc commented 5 months ago

That is correct.

Searching inside a single tarball is not multi-threaded, whereas searching multiple tarballs or other files is. A tarball is one sequential file with no index pointing to its contents, so there is nothing to hand out to threads to search in parallel. Only 7zip archives could potentially be searched in parallel with multiple threads. Scanning the entire tar file first and then sending its contents to threads would add too much overhead and slow the search down.

Threads are used to search nested archives, but the search itself is not sped up in parallel.
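
To illustrate why a compressed tarball has to be read front to back, here is a minimal Python sketch using the standard `tarfile` module in forward-only stream mode. It is not how ugrep is implemented; the helper names `build_sample_tar_xz` and `search_tar_stream` are hypothetical and exist only for this example. Each member is discovered only when its header is reached in the decompressed stream, so the loop cannot jump ahead to hand later members to other threads.

```python
import io
import re
import tarfile


def build_sample_tar_xz(files):
    """Build a small .tar.xz in memory from a {name: bytes} mapping (sample data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def search_tar_stream(fileobj, pattern):
    """Search every regular file in a compressed tar, strictly in stream order.

    `pattern` is a bytes regex. Returns a list of (member name, line number, line).
    """
    regex = re.compile(pattern)
    hits = []
    # "r|*" opens the archive as a forward-only stream with transparent
    # decompression; there is no index, so members appear one after another.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            extracted = tar.extractfile(member)
            for lineno, line in enumerate(extracted, start=1):
                if regex.search(line):
                    hits.append((member.name, lineno, line.rstrip(b"\n")))
    return hits


if __name__ == "__main__":
    sample = build_sample_tar_xz({
        "a.txt": b"hello\nneedle here\n",
        "b.txt": b"nothing\n",
        "c.txt": b"needle\n",
    })
    for name, lineno, line in search_tar_stream(sample, b"needle"):
        print(f"{name}:{lineno}:{line.decode()}")
```

Parallelizing this would require a first pass over the whole decompressed stream to locate every member, which is the extra overhead described above.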