Open gcflymoto opened 3 years ago
Hey, thanks for the shout-out. One word of warning - most of my open PRs on this repo are pretty old and in most cases I have better versions of them in my own fork. I sorta lost interest in pushing PRs here due to the general trend of only small tweaks being accepted rather than larger features/fixes/refactors.
Anyway, if you want to keep rolling with PCRE2 in your fork, check out https://github.com/aswild/the_silver_searcher/commit/bbd3ee46113e3e7f13a75ba65fa35da06b3fe2d1 where I removed PCRE1 support and got considerable speedup by reducing heap thrashing on every call to pcre2_exec.
As far as searching big files goes, I think that's a fundamental/hard problem in ag. Except when searching stdin in stream mode, the whole file to be searched has to be buffered in memory, either with mmap or by reading it into the heap. There's even a check that the file length isn't greater than INT_MAX due to PCRE limitations*. Naturally, this leads to memory pressure and performance issues on some systems. This is further complicated by the way ag saves matches as pointers into the full file buffer, and since all matches are collected before stdout is mutex-locked, it means the whole buffer has to wait around in memory before any of its matches get printed to the console.
I don't mean to sound discouraging, these aren't insurmountable roadblocks, but they are the product of multiple interlocking facets of ag's design that would require significant effort to refactor.
* I just realized that this limitation isn't needed in PCRE2 and I should probably remove it in my fork.
Hi Allen, wow that's a fantastic repo you have. Yes, I'll attempt to merge in your changes, it will take me a while (vacation tomorrow) but excited to see it perform better.
There's even a check that the file length isn't greater than INT_MAX due to PCRE limitations*I removed that after applying your original PCRE2 patch. but they are the product of multiple interlocking facets of ag's design that would require significant effort to refactorGot it, I'm not sure yet if I would mind if AG had to read the whole file into the heap as long as it free'd it after searching. The majority of my "big" file searches are right over the "big_file" hump and under 20GB. The systems have enough memory to cover putting it in the heap, but I understand if the current design would start triggering limitations when searching large files.
PS. I see good performance gains when generating and consuming profiling results.
Test |
---|
ICC 2019
-O3 -ipo -axCORE-AVX2 -xSSE4.2 -qopt-report=4 -qopt-report-phase ipo | ICC 2019 -O3 -ipo -axCORE-AVX2 -xSSE4.2 -ansi-alias -qopt-report=4 -qopt-report-phase ipo -prof-use -prof-dir=$PWD | ICC 2019 -O2 -ipo -axCORE-AVX2 -xSSE4.2 -ansi-alias -qopt-report=4 -qopt-report-phase ipo -prof-use -prof-dir=$PWD | GCC 9.2-O3 -flto -g0 -march=native -mtune=native | GCC 9.2-O2 -flto -g0 -march=native -mtune=native | GCC 9.2-O3 -flto -g0 -march=native -mtune=native -fprofile-use |
---|
Regular
0:21.49 |
---|
0:24.00 | 0:21.30 |
---|
0:22.14 | 0:21.77 |
---|
0:21.81 |
---|
Small | 0:48.44 |
---|
0:53.45 | 0:36.53 |
---|
0:39.70 | 0:40.29 |
---|
0:31.08 |
---|
Big | 1:57.91 |
---|
1:46.18 | 1:34.93 |
---|
1:44.23 | 1:42.66 |
---|
1:22.76 |
On Wednesday, March 17, 2021, 5:04:26 PM CDT, Allen Wild @.***> wrote:
Hey, thanks for the shout-out. One word of warning - most of my open PRs on this repo are pretty old and in most cases I have better versions of them in my own fork. I sorta lost interest in pushing PRs here due to the general trend of only small tweaks being accepted rather than larger features/fixes/refactors.
Anyway, if you want to keep rolling with PCRE2 in your fork, check out @.*** where I removed PCRE1 support and got considerable speedup by reducing heap thrashing on every call to pcre2_exec.
As far as searching big files goes, I think that's a fundamental/hard problem in ag. Except when searching stdin in stream mode, the whole file to be searched has to be buffered in memory, either with mmap or by reading it into the heap. There's even a check that the file length isn't greater than INT_MAX due to PCRE limitations*. Naturally, this leads to memory pressure and performance issues on some systems. This is further complicated by the way ag saves matches as pointers into the full file buffer, and since all matches are collected before stdout is mutex-locked, it means the whole buffer has to wait around in memory before any of its matches get printed to the console.
I don't mean to sound discouraging, these aren't insurmountable roadblocks, but they are the product of multiple interlocking facets of ag's design that would require significant effort to refactor.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.
AFAIK, files are free'd (or munmap'd) when ag is finished with them, i.e. after their results have been printed. But there can be one allocated file per worker thread, and by default there's one worker thread per CPU core. If you have a lot of CPUs and a lot of big files in the directory being searched, things could start adding up. At least with mmap, the kernel can page in/out file data as needed without exploding the heap and potentially thrashing swap.
I really love "ag" and use it every day for work as a developer, but it has limitations when searching big files. I have put together my own fork with many PRs which haven't made it into this repo, and it gets a lot closer. It includes PCRE2 from #1035 but it still has trouble with big files. Any way I can sponsor somebody and how do I go about it?
Much thanks to @aswild @napcode @afredd @mwagnell @jtamboli @thebaptiste for providing the PRs!!
@aswild Add support for libpcre2 (compiles and works) #1035 @aswild fix compile warnings on GCC10 #1398 @aswild search_file: check return code of read for failure #1260 @napcode Fixes overflows for large and compressed files #1221 @afredd Rework is_binary() #204 @mwagnell Segmentation fault when searching an xlsx file with -z option #1349 @jtamboli Include file names & line numbers when searching paths, even if stream is present #1433 @thebaptiste To be able to build with lzma.h missing or zlib.h missing or both mis… #1353