stefankueng / grepWin

A powerful and fast search tool using regular expressions
https://tools.stefankueng.com/grepWin.html
GNU General Public License v3.0

Files are skipped for some search strings #409

Closed by DoctorJools 1 year ago

DoctorJools commented 1 year ago

I'm searching a particular folder. When I'm searching for the regex

delete\s(from\s)?(\[)?dbo(\])?.(\[)?T_ClientProvision_History(\])?\b

it searches and the status bar at the bottom says "Searched 9270 files, skipped 0 files. Found 0 matches in 0 files." But if I search for the regex

delete\s+(from\s*)?(\w+)(.|\r\n)+(\[)?dbo(\])?.(\[)?T_ClientProvision_History(\])?\b\s+\2\s

the status bar at the bottom says "Searched 8921 files, skipped 349 files. Found 0 matches in 0 files."

Why does it skip 349 files? These results are completely repeatable - it's not that any process has a lock on any of the files.

stefankueng commented 1 year ago

The problem with the second regex is that it needs a lot more memory, i.e. stack space. The regex engine then aborts the search when it reaches the memory limit, and that's when the file is marked as skipped in the search.
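To illustrate the mechanism described here, the following is a toy sketch in Python, not grepWin's or its regex engine's actual code. It assumes a naive recursive matcher in which every repetition of `(.|\r\n)+` costs one stack frame; the function names `consume_branching_repeat` and `search_text` are made up for this example. Once the input is long enough, the recursion limit is hit and the "file" is reported as skipped rather than searched.

```python
def consume_branching_repeat(text: str, pos: int = 0) -> int:
    """Toy model of `(.|\r\n)+`: each repetition is one recursive call.

    Returns the index just past the last character consumed.
    """
    if pos >= len(text):
        return pos
    # Two branches per repetition: "\r\n" as a unit, or any single character.
    step = 2 if text.startswith("\r\n", pos) else 1
    return consume_branching_repeat(text, pos + step)


def search_text(text: str) -> str:
    """Return "searched", or "skipped" if the matcher ran out of stack."""
    try:
        consume_branching_repeat(text)
        return "searched"
    except RecursionError:
        # Stand-in for the engine aborting when it reaches its memory limit.
        return "skipped"


small = "delete from t\r\nwhere x = 1\r\n"
large = "x" * 100_000  # far deeper than Python's default recursion limit

print(search_text(small))
print(search_text(large))
```

In this toy model the stack depth grows with the length of the text, which is why the same pattern can work on small files and be skipped on large ones.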

stefankueng commented 1 year ago

grepWin needs to show that the regex engine ran out of stack space somehow, not just show the files as skipped.

stefankueng commented 1 year ago

...and maybe increase the regex engine's stack space

stefankueng commented 1 year ago

btw: the problem with your regex is the (.|\r\n)+ part. This is a repetition with a branch, and those make the regex engine use up its stack, with the amount depending on the size of the searched file. Instead, simply use .+ and check the box named "Dot matches newline".
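As a rough illustration of why the two forms are interchangeable, here is a small sketch using Python's `re` module, which is a different engine from the one grepWin uses. It assumes `re.DOTALL` is a fair analogue of the "Dot matches newline" checkbox; the sample text and the shortened patterns are made up for the example.

```python
import re

# Hypothetical sample text with Windows line endings.
sql = "delete from t\r\nwhere id = 1\r\n"

# Branching form: every repetition chooses between "." and "\r\n".
branching = re.compile(r"delete\s+(.|\r\n)+id")

# Simpler form: with DOTALL, "." also matches newline characters,
# so no alternation is needed inside the repetition.
dotall = re.compile(r"delete\s+.+id", re.DOTALL)

m1 = branching.search(sql)
m2 = dotall.search(sql)

# Both patterns find the same span across the line break.
print(m1.group(0) == m2.group(0))
```

Both patterns match the same text here; the difference is only in how much work and stack the engine needs for the repetition.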

DoctorJools commented 1 year ago

Using .+ and "Dot matches newline" worked a treat. Thanks very much for your help.