StephanTLavavej / mingw-distro

MinGW distro build scripts.
494 stars 55 forks source link

grep 2.11 broke -r for Windows; grep 3.1 can be patched #6

Closed StephanTLavavej closed 7 years ago

StephanTLavavej commented 8 years ago

The distro's grep is frozen at 2.10. I'd like to update it, but 2.11 broke grep -r on Windows, and I don't know how to fix it.

I reported this upstream, but the maintainers didn't care: https://lists.gnu.org/archive/html/bug-grep/2014-01/msg00032.html

BenHanson commented 7 years ago

I tried building 2.2 under MSVC and got it building and running. That version seems to simply not support the -r flag. Later versions seemingly support -r but do not build under MSVC due to other GNU/*nix dependencies. Have you tried simply using the latest version under your distro?

StephanTLavavej commented 7 years ago

I build grep with my distro, not with MSVC. grep 2.10 needed minor patches to work (with more extensive patching to enable color output), see my grep.patch. The problem is that 2.11's sources changed significantly enough that my patch became inapplicable.

I haven't tried building grep recently; when I get a chance I'll try 2.27.

StephanTLavavej commented 7 years ago

grep 3.0 is still broken, although the output changed at some point in the past 3 years:

C:\Temp\gcc>grep --version
grep (GNU grep) 2.10
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

C:\Temp\gcc>grep -rP meow DELME
DELME/BAR/five.txt:meow 5
DELME/BAR/five.txt:meow 6
DELME/FOO/one.txt:meow 1
DELME/FOO/one.txt:meow 2

C:\Temp\gcc>dest\bin\grep --version
grep (GNU grep) 3.0
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

C:\Temp\gcc>dest\bin\grep -rP meow DELME
grep: warning: DELME/BAR: recursive directory loop
grep: warning: DELME/FOO: recursive directory loop
Paxxi commented 7 years ago

Tested this with msys2 and it seems to work as intended

C:\code\kodi\project\BuildDependencies\msys64\usr>bin\grep.exe --version
bin/grep (GNU grep) 2.22
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

C:\code\kodi\project\BuildDependencies\msys64\usr>bin\grep.exe -rP test bin
bin/aclocal:#  - if magic keyword 'latest' is found, pick the latest version that exists
bin/aclocal:#  - Try to locate the latest version of automake that exists and run it
StephanTLavavej commented 7 years ago

That looks like a grep built for MSYS2, not a Windows-native grep - therefore it doesn't help what I'm trying to do.

Paxxi commented 7 years ago

ahh my bad.

Gave it a new try building it and managed to reproduce the issue.

Issue seems to be in lib/fts-cycle.c:enter_dir where it checks st_dev and st_ino which always seem to return the same value. These seem to be populated in lib/fts.c:fts_build. Populating them using nFileIndex should solve the issue. Not sure I have the time to actually write a patch but hopefully this is of some help.

StephanTLavavej commented 7 years ago

Thanks - I looked around a bit, but didn't see an obvious place to patch, especially considering the sizes of the data types.

Paxxi commented 7 years ago

One possible solution would be to use the two halves of nFileIndex and populate std_dev and st_ino with each half ignoring the volume identifier. This wouldn't be foolproof but I believe good enough unless encountering devices mounted to folders and actually running grep on a parent folder of one of these.

StephanTLavavej commented 7 years ago

According to MinGW's <sys/types.h>, ino_t is unsigned short, which is kind of problematic for that idea.

StephanTLavavej commented 7 years ago

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=16444 contains additional replies to my original Jan 2014 mail. The last reply was in Mar 2015, where Paul Eggert said:

Yes, this is a known deficiency in MinGW that is documented in Gnulib, here: http://www.gnu.org/software/gnulib/manual/html_node/sys_002fstat_002eh.html (search for "st_ino"). When we discovered this deficiency years ago, we decided that it'd be too much of a pain to work around it then. Please see the thread starting here: http://lists.gnu.org/archive/html/bug-gnulib/2009-09/msg00252.html in which a patch was installed that tries to work around some of the MinGW deficiency; unfortunately the patch broke a lot of things and was backed out. I'm not optimistic about a fix this time either. The deficiency should be fixed in MinGW, so that functions like 'stat' behave as programs expect. In the meantime, I suggest not using recursive grep in MinGW.

Today, that gnulib documentation link says:

Portability problems fixed by Gnulib module sys_stat, together with module windows-stat-inodes: On Windows platforms (excluding Cygwin), st_ino is always 0.

This windows-stat-inodes module is very new. "windows-stat-inodes: New module." and "same-inode: Adapt for windows-stat-inodes." were committed on May 14, 2017.

It's possible that this gnulib change will fix grep. grep recently updated gnulib, so building grep HEAD should be sufficient.