Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.97k stars 555 forks source link

BBC: perl-5.34.0 broke Path::Class (Windows only, AFAICT) #20039

Open sisyphus opened 2 years ago

sisyphus commented 2 years ago

Module: Path-Class-0.37

Description

On MS Windows with perl-5.34.0 or later, we find that during the the running of the test suite. t/01-basic.t fails with:

t/01-basic.t .......... 1/78 Can't call method "relative" on an undefined value
at C:\sisyphusion\Path-Class-0.37\blib\lib/Path/Class/Entity.pm line 76.
t/01-basic.t .......... Dubious, test returned 22 (wstat 5632, 0x1600)

Steps to Reproduce

On MS Windows with perl-5.34.0 or later, run: cpan -i Path::Class

I believe it's probably up to Path::Class to fix this issue and there's a bug report filed against Path::Class at: https://github.com/kenahoo/Path-Class/issues/55

t/01-basic.t innovatively tries to get File::Spec to provide 'Unix' path formatting on Windows by resetting $^O to 'Unix', in a BEGIN{} block at the start of the file. This is actually quite successful until we get to perl-5.34.0. (The breakage, of course, probably occurred somewhere in the 5.33 devel cycle.)

I'm assuming that if File::Spec had really wanted to provide that capability, then it would have come up with something better than clobbering $^O. If someone can confirm that there's nothing here for the perl developers to attend to, then we can close this issue and move on.

There's also some additional discussion and observations at https://www.perlmonks.org/?node_id=11145885

Perl configuration

Summary of my perl5 (revision 5 version 34 subversion 0) configuration:

  Platform:
    osname=MSWin32
    osvers=6.1.7601
    archname=MSWin32-x64-multi-thread
    uname=''
    config_args='undef'
    hint=recommended
    useposix=true
    d_sigaction=undef
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='gcc'
    ccflags =' -DWIN32 -DWIN64 -fdiagnostics-color=never -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -mms-bitfields'
    optimize='-s -O2'
    cppflags='-DWIN32'
    ccversion=''
    gccversion='10.2.0'
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='long long'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='g++'
    ldflags ='-s -L"C:\perl-5.34.0\lib\MSWin32-x64-multi-thread\CORE" -L"C:\_64\winlibs-gcc-1020\mingw64\lib" -L"C:\_64\winlibs-gcc-1020\mingw64\x86_64-w64-mingw32\lib" -L"C:\_64\winlibs-gcc-1020\mingw64\lib\gcc\x86_64-w64-mingw32\10.2.0"'
    libpth=C:\_64\winlibs-gcc-1020\mingw64\lib C:\_64\winlibs-gcc-1020\mingw64\x86_64-w64-mingw32\lib C:\_64\winlibs-gcc-1020\mingw64\lib\gcc\x86_64-w64-mingw32\10.2.0 C:\_64\msys_1020\1.0\local\lib
    libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
    perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
    libc=
    so=dll
    useshrplib=true
    libperl=libperl534.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs
    dlext=dll
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags='-mdll -s -L"C:\perl-5.34.0\lib\MSWin32-x64-multi-thread\CORE" -L"C:\_64\winlibs-gcc-1020\mingw64\lib" -L"C:\_64\winlibs-gcc-1020\mingw64\x86_64-w64-mingw32\lib" -L"C:\_64\winlibs-gcc-1020\mingw64\lib\gcc\x86_64-w64-mingw32\10.2.0"'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    HAVE_INTERP_INTERN
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_IMPLICIT_CONTEXT
    PERL_IMPLICIT_SYS
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under MSWin32
  Compiled at May 21 2021 17:06:16
  @INC:
    C:/perl-5.34.0/site/lib/MSWin32-x64-multi-thread
    C:/perl-5.34.0/site/lib
    C:/perl-5.34.0/lib/MSWin32-x64-multi-thread
    C:/perl-5.34.0/lib
kenahoo commented 2 years ago

Do we know what actual change in perl or File::Spec caused the difference in behavior?

Leont commented 2 years ago

Do we know what actual change in perl or File::Spec caused the difference in behavior?

It kind of has to be Cwd::realpath, my best guess would be c8c367581c3333c38d07481e2ea8d81171403c81 but I can't confirm that as I don't run Windows.

sisyphus commented 2 years ago

my best guess would be https://github.com/Perl/perl5/commit/c8c367581c3333c38d07481e2ea8d81171403c81 but I can't confirm that as I don't run Windows

I probably won't be able to test this until Monday.

sisyphus commented 2 years ago

I've taken a clone of current blead (commit c870f3e) and reverted https://github.com/Perl/perl5/commit/c8c367581c3333c38d07481e2ea8d81171403c81 by manually editing Cwd.xs.

I then built this perl and found that the perl test suite still passed all tests. I then installed that perl, but Path-Class-0.37 still fails in exactly the same way.

One thing I don't understand is that when I build Path-Class-0.37 on perl-5.34 or 5.36 by running perl Makefile.PL followed by make then t/01-basic.t fails in different ways - depending upon whether I run make test or perl -Mblib t/01-basic.t. When I run the former, Cwd.pm sees $^O as "Unix", but when I run the latter Cwd.pm sees $^O as "MSWin32".

Anyway - Path::Class is fine with PathTools-3.78, but is broken byPathTools-3.80. Cwd.pm did not alter between those 2 versions (except for version numbers), so if the problem is in Cwd, then it must be in the XS. Here's the diff of Cwd.xs between 3.78 and 3.80.

--- \comp-830\perl-5.32.1\dist\PathTools\Cwd.xs 2020-12-18 21:04:36 +1100
+++ \comp-1020\perl-5.34.0\dist\PathTools\Cwd.xs    2021-01-21 10:04:44 +1100
@@ -84,6 +84,9 @@
    unsigned symlinks;
    int serrno;
    char remaining[MAXPATHLEN], next_token[MAXPATHLEN];
+#ifdef PERL_IMPLICIT_SYS
+        dTHX;
+#endif

    serrno = errno;
    symlinks = 0;
@@ -119,15 +122,24 @@

             p = strchr(remaining, '/');
             s = p ? p : remaining + remaining_len;
+
             if ((STRLEN)(s - remaining) >= (STRLEN)sizeof(next_token)) {
                 errno = ENAMETOOLONG;
                 return (NULL);
             }
             memcpy(next_token, remaining, s - remaining);
             next_token[s - remaining] = '\0';
-            remaining_len -= s - remaining;
-            if (p != NULL)
-                memmove(remaining, s + 1, remaining_len + 1);
+
+            /* shift first component off front of path, including '/' */
+            if (p) {
+                s++; /* skip '/' */
+                remaining_len -= s - remaining;
+                /* the +1 includes the trailing '\0' */
+                memmove(remaining, s, remaining_len + 1);
+            }
+            else
+                remaining_len = 0;
+
             if (resolved[resolved_len - 1] != '/') {
                 if (resolved_len + 1 >= MAXPATHLEN) {
                     errno = ENAMETOOLONG;
@@ -166,8 +178,8 @@
             }
 #if defined(HAS_LSTAT) && defined(HAS_READLINK) && defined(HAS_SYMLINK)
             {
-                struct stat sb;
-                if (lstat(resolved, &sb) != 0) {
+                Stat_t sb;
+                if (PerlLIO_lstat(resolved, &sb) != 0) {
                     if (errno == ENOENT && p == NULL) {
                         errno = serrno;
                         return (resolved);
@@ -182,7 +194,7 @@
                         errno = ELOOP;
                         return (NULL);
                     }
-                    slen = readlink(resolved, symlink, sizeof(symlink) - 1);
+                    slen = PerlLIO_readlink(resolved, symlink, sizeof(symlink) - 1);
                     if (slen < 0)
                         return (NULL);
                     symlink[slen] = '\0';

Apart from the section that I've already tried reverting, is there something else that looks suspicious in that ? Interestingly #if defined(HAS_LSTAT) && defined(HAS_READLINK) && defined(HAS_SYMLINK) which was FALSE in perl-5.32.1 and earlier, is TRUE in 5.34.0 and later.

Cheers, Rob

kenahoo commented 2 years ago

Interestingly #if defined(HAS_LSTAT) && defined(HAS_READLINK) && defined(HAS_SYMLINK) which was FALSE in perl-5.32.1 and earlier, is TRUE in 5.34.0 and later.

That might be the most interesting fact, actually. Perhaps this branch was never exercised on Windows before, and now it is, and causes the failures we're seeing.

Leont commented 2 years ago

Interestingly #if defined(HAS_LSTAT) && defined(HAS_READLINK) && defined(HAS_SYMLINK) which was FALSE in perl-5.32.1 and earlier, is TRUE in 5.34.0 and later.

I guess that means PerlLIO_lstat gets confused and returns a failure, in which case the whole function returns NULL.

sisyphus commented 2 years ago

I guess that means PerlLIO_lstat gets confused and returns a failure

I inserted some warn() statements in Cwd.xs and established that PerlLIO_lstat(resolved, &sb) returns a non-zero value, and the condition(errno == ENOENT && p == NULL) is FALSE - and we immediately fall through to the explicit return (NULL); (at line 193 of the current blead version of Cwd.xs).

I've no idea what needs to be done.

Leont commented 2 years ago

I've no idea what needs to be done.

TBH I think this needs to get fixed on the Path::Class side.

kenahoo commented 2 years ago

TBH I think this needs to get fixed on the Path::Class side.

Not IMO. I think the Path::Class test exposed a regression in either perl or Cwd that needs to be understood first.

Leont commented 2 years ago

I think the Path::Class test exposed a regression in either perl or Cwd that needs to be understood first.

I think you voided the warranty the moment you did BEGIN { $^O = 'Unix' }. The obvious solution is to change the tests to use File::Spec instead of hardcoding unix paths.

kenahoo commented 2 years ago

I think you voided the warranty the moment you did BEGIN { $^O = 'Unix' }. The obvious solution is to change the tests to use File::Spec instead of hardcoding unix paths.

That is indeed obvious, but it's not the end of the story.

I understand I can fix the Path::Class tests by doing that, and I will do so. If nobody on the perl core team wants to find out what actually broke in the new perl release, and how that could affect other code besides Path::Class, then I guess that's all we'll do. I'm hoping that's not the case.

So I agree that the Path::Class tests should change in this fashion. But when I find a trout in my milk, I don't just discard it and declare everything fine, I want to understand how the trout got in the milk.

To push the animals-in-odd-environments metaphors farther - for the moment I consider the Path::Class to be a canary in the coal mine, indicating that something interface-incompatible was done to perl that breaks code, even if such code is doing odd things. Things that previously worked should in general not stop working with a new perl release unless an announcement was made about backward compatibility, full stop.

Leont commented 2 years ago

I inserted some warn() statements in Cwd.xs and established that PerlLIO_lstat(resolved, &sb) returns a non-zero value, and the condition(errno == ENOENT && p == NULL) is FALSE - and we immediately fall through to the explicit return (NULL); (at line 193 of the current blead version of Cwd.xs).

What does that resolved that PerlLIO_lstat/win32_lstat trips over look like?

xenu commented 2 years ago

What does that resolved that PerlLIO_lstat/win32_lstat trips over look like?

For example C:\Users\xenu\Documents\git\Path-Class/C:\Users\xenu\Documents\git\Path-Class.

It seems the problem is that the unix version of Cwd::realpath started validating paths on Windows when lstat became available in Perl 5.34.

Perl 5.32:

> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print Cwd::realpath(Cwd::cwd()) // "<undef>"'
C:\Users\xenu\Documents\git\Path-Class/C:\Users\xenu\Documents\git\Path-Class
> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print File::Spec->abs2rel(Cwd::realpath(Cwd::cwd()))'
C:\Users\xenu\Documents\git\Path-Class

blead:

> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print Cwd::realpath(Cwd::cwd()) // "<undef>"'
<undef>
xenu commented 2 years ago

Naturally, the native version of Cwd::realpath works properly. This is on blead:

> perl -e 'use File::Spec; use Cwd; print Cwd::realpath(Cwd::cwd()) // "<undef>"'
C:/Users/xenu/Documents/git/Path-Class
Leont commented 2 years ago

What does that resolved that PerlLIO_lstat/win32_lstat trips over look like?

For example C:\Users\xenu\Documents\git\Path-Class/C:\Users\xenu\Documents\git\Path-Class.

It seems the problem is that the unix version of Cwd::realpath started validating paths on Windows when lstat became available in Perl 5.34.

Perl 5.32:

> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print Cwd::realpath(Cwd::cwd()) // "<undef>"'
C:\Users\xenu\Documents\git\Path-Class/C:\Users\xenu\Documents\git\Path-Class
> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print File::Spec->abs2rel(Cwd::realpath(Cwd::cwd()))'
C:\Users\xenu\Documents\git\Path-Class

blead:

> perl -e 'BEGIN{$^O="Unix"} use File::Spec; use Cwd; print Cwd::realpath(Cwd::cwd()) // "<undef>"'
<undef>

Looks like GIGO to me

sisyphus commented 2 years ago

What does that resolved that PerlLIO_lstat / win32_lstat trips over look like?

@xenu has probably already provided the info that you need.

The problematic test in t/01-basic.t is:

ok dir()->absolute->resolve, dir(Cwd::cwd())->resolve;

According to warn() statements that I inserted into t/01-basic.t, both dir()->absolute and dir(Cwd::cwd()) areC:\sisyphusion\Path-Class-0.37 - ie my CWD. ok() chokes on whichever of its arguments it evaluates first. That is the only test in the entire test suite that enters the #if defined(HAS_LSTAT) && defined(HAS_READLINK) && defined(HAS_SYMLINK) block. The test suite passes if I comment out that test.

Cheers, Rob

Leont commented 2 years ago

For example C:\Users\xenu\Documents\git\Path-Class/C:\Users\xenu\Documents\git\Path-Class.

What happens there is that the unix realpath code considers C:\Users\xenu\Documents\git\Path-Class a relative path (because it doesn't start with a /), so it prepends the current working directory, leading to the obvious bogus path. The later abs2rel will turn it into a valid (but absolute) path. SNAFU.

wchristian commented 2 years ago

Looks like GIGO to me

Very much agreed.

Do keep in mind that File::Spec is literally just an alias to return a File::Spec subclass based on the operating system currently running: https://metacpan.org/dist/PathTools/source/lib/File/Spec.pm

As such, if you lie to File::Spec intentionally AND also use actual paths from your system, then it is your personal duty to remain aware that you are feeding it garbage and to consider the resulting garbage correctly.

This is very much not a "blead breaks cpan" case (unless that includes "the cpan module is wrong"), and i say this as a very long-going windows Perl developer.

kenahoo commented 2 years ago

An observation - it appears that File::Spec does intend, at least somewhat, for users to be able to work with logical paths as represented on other OSes (see https://metacpan.org/pod/File::Spec#DESCRIPTION):

Since these functions are different for most operating systems, each set of OS specific routines is available in a separate module, including: File::Spec::Unix File::Spec::Mac File::Spec::OS2 File::Spec::Win32 File::Spec::VMS The module appropriate for the current OS is automatically loaded by File::Spec. Since some modules (like VMS) make use of facilities available only under that OS, it may not be possible to load all modules under all operating systems.

So it sort of says "it might work", but doesn't make a lot of guarantees.

As @wchristian points out,

File::Spec is literally just an alias to return a File::Spec subclass based on the operating system

so the hack of setting $^O = "Unix" is not actually what's causing this problem - the exact same problem will happen if/when we change Path::Class's test to Not Do That Anymore, and explicitly use File::Spec::Unix instead, as some have suggested. In other words, from the File::Spec perspective, hijacking $^O still seems to be [working] perfectly fine.

There are two other real problems to fix (as I'm not the first to observe): the first is the mixing of Unix paths & native paths, at line 96 here:

ok dir()->absolute->resolve, dir(Cwd::cwd())->resolve;

The 01-basic.t file shouldn't be doing anything that uses the local filesystem or OS, it should all be "logical" processing, so that line needs to be ripped out and possibly get a new home in 03-filesystem.t.

The other problem is that Cwd (unlike File::Spec, which by its mission never actually consults the OS through system or filesystem calls) might get pretty unhappy if $^O doesn't actually match the true OS it's running on. So the results of Cwd::cwd() in the first place are pretty tenuous and it shouldn't be called while $^O is hijacked, giving a second reason to rip out that line.

For completion - @xenu or someone else with easy access to Windows, if it's convenient, could you please run the following under both 5.32 and blead, to take File::Spec out of the equation?

perl -e 'BEGIN{$^O="Unix"} use Cwd; print Cwd::cwd() // "<undef>"'

sisyphus commented 2 years ago

On perl-5.32.0:

C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::cwd() // '<undef>'"
C:\_32

Same result on blead:

C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::cwd() // '<undef>'"
C:\_32

Cheers, Rob

kenahoo commented 2 years ago

Very interesting - so apparently Cwd::cwd() doesn't care about $^O being wrong on Windows (which seems kind of weird to me!), but Cwd::realpath() does. If I scanned through the code correctly, they would be defaulting to _perl_getcwd and _perl_abs_path, and the former is just calling the latter with an argument of '.'.

sisyphus commented 2 years ago

On perl-5.32.0:

C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::_perl_abs_path('.') // '<undef>'"
/
C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::_perl_getcwd(Cwd::_perl_abs_path('.')) // '<undef>'"
C:\_32

We see there that, on perl-5.32.0, _perl_abs_path('.') has returned a strange result. On perl 5.36 and blead we get:

C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::_perl_abs_path('.') // '<undef>'"
/_32
C:\_32>perl -e "BEGIN{$^O='Unix'} use Cwd; print Cwd::_perl_getcwd(Cwd::_perl_abs_path('.')) // '<undef>'"
C:\_32

It's making my head spin ;-) I vote that we just make $^O readonly, and be done with the vagaries of allowing it to be overwritten.

Cheers, Rob

rjbs commented 1 year ago

@kenahoo Where do you think we are with this? "Path::Class is broken" feels like a blocker…

leonerd commented 1 year ago

I vote that we just make $^O readonly, and be done with the vagaries of allowing it to be overwritten.

Quite. It's a weird quirk of behaviour that it isn't readonly.

My suggestion for this would be to not alter the value, but simply skip the tests if it's not what was expected. E.g.

plan skip_all => "This test only runs on Windows" if $^O eq "MSWin32";
rjbs commented 1 year ago

Removed release blocker: this has been broken since v5.34. It'd be nice to get it fully resolved, but I don't see that happening right away.