Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

close() no longer returns success after a failed read on a write-only filehandle #21187

Closed steve-m-hay closed 11 months ago

steve-m-hay commented 1 year ago

The program below reproduces the problem.

With 5.36.1 the final close() call returns 1 (success). The output is as expected:

open(): 1
print(): 1
<>: undef
close(): 1

With 5.38.0-RC2 it returns an empty string (failure). The output is:

open(): 1
print(): 1
<>: undef
close():
$!=Bad file descriptor, $^E=

It works (i.e. close returns 1) if I remove the deliberate failing attempt at reading from the write-only filehandle, so presumably this is somehow interfering with the return value of the close() call.

(If I insert a pause (e.g. a read on <STDIN>) after the close() call then I find that it is possible to manually delete the file that was opened, which suggests that the close() operation has succeeded; it is just the return value from it that is wrong. By contrast, if I pause before the close() call then it is not possible to manually delete the file at that point, so hopefully that conclusion is correct.)

I have tested this with and without the recent CloseHandle() fix (commit 149c2e48b61bd93a9e4ee06b920ca68dd97b717c), but I have the same result (failure) either way. In fact, it also fails with 5.37.9, which is prior to the original CloseHandle() change (commit 8552f09f5cfe61a536a65f11290ef026f7aa0356) so the problem must be something else.

Testing further, I've narrowed it down a bit more: This worked in 5.37.3, but fails in 5.37.4.

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '>', $file;
print "open(): $ret\n";
$ret = print $fh "WTF\n";
print "print(): $ret\n";
seek $fh, 0, 0;
{
    # Suppress "Filehandle $fh opened only for output" warning
    no warnings 'io';
    $ret = <$fh>;
    print "<>: " . (defined $ret ? $ret : "undef") . "\n";
}
$ret = close $fh;
print "close(): $ret\n";
print "\$!=$!, \$^E=$^E\n" unless $ret;

This is on Windows, with Visual Studio 2022 v17.4. perl -V as follows (all default build options):

Summary of my perl5 (revision 5 version 38 subversion 0) configuration:

Platform:
    osname=MSWin32 osvers=10.0.19045.3086 archname=MSWin32-x64-multi-thread
    uname='' config_args='undef' hint=recommended useposix=true d_sigaction=undef
    useithreads=define usemultiplicity=define use64bitint=define use64bitall=undef
    uselongdouble=undef usemymalloc=n default_inc_excludes_dot=define
Compiler:
    cc='cl' ccflags ='-nologo -GF -W3 -MD -DWIN32 -D_CONSOLE -DNO_STRICT -DWIN64 -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -D_WINSOCK_DEPRECATED_NO_WARNINGS -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS'
    optimize='-O1 -Zi -GL -fp:precise' cppflags='-DWIN32'
    ccversion='19.34.31946' gccversion='' gccosandvers=''
    intsize=4 longsize=4 ptrsize=8 doublesize=8 byteorder=12345678 doublekind=3
    d_longlong=undef longlongsize=8 d_longdbl=define longdblsize=8 longdblkind=0
    ivtype='__int64' ivsize=8 nvtype='double' nvsize=8 Off_t='__int64' lseeksize=8
    alignbytes=8 prototype=define
Linker and Libraries:
    ld='link' ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"D:\Dev\Temp\perl\lib\CORE" -machine:AMD64 -subsystem:console,"5.02"'
    libpth="C:\Program Files\Microsoft Visual Studio\2022 17.4\Professional\VC\Tools\MSVC\14.34.31933\lib\x64"
    libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib vcruntime.lib ucrt.lib
    perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib vcruntime.lib ucrt.lib
    libc=ucrt.lib so=dll useshrplib=true libperl=perl538.lib
    gnulibc_version=''
Dynamic Linking:
    dlsrc=dl_win32.xs dlext=dll d_dlsymun=undef ccdlflags=' ' cccdlflags=' '
    lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"D:\Dev\Temp\perl\lib\CORE" -machine:AMD64 -subsystem:console,"5.02"'

Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_LONG_DOUBLE HAS_TIMES HAVE_INTERP_INTERN MULTIPLICITY PERLIO_LAYERS
    PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32 PERL_IMPLICIT_SYS PERL_MALLOC_WRAP PERL_OP_PARENT
    PERL_PRESERVE_IVUV PERL_USE_SAFE_PUTENV USE_64_BIT_INT USE_ITHREADS
    USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_PERLIO USE_PERL_ATOF
    USE_THREAD_SAFE_LOCALE
  Locally applied patches:
    RC2
  Built under MSWin32
  Compiled at Jul 1 2023 11:53:13
  @INC:
    D:/Dev/Temp/perl/site/lib
    D:/Dev/Temp/perl/lib

steve-m-hay commented 1 year ago

Hmmm. This is presumably caused by commit 80c1f1e45e8ef8c27d170fae7ade41971fe20218, and appears to be intentional since a test case was added by commit 0b602161d2f4c5e6b22144e9f8b9f8128a35f440 which checks that closing the file should fail after trying to read from a write-only file.

I don't understand this change. The error is set as expected by the failing read, and can be checked after that. The close doesn't fail, so why should it return failure just because an earlier operation returned failure?!

We don't make the close() fail after a failed print() call on a read-only filehandle, so this odd behaviour seems inconsistent:

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '<', $file;
print "open(): $ret\n";
{
    # Suppress "Filehandle $fh opened only for input" warning
    no warnings 'io';
    $ret = print $fh "WTF\n";
    print "print(): " . (defined $ret ? $ret : "undef") . "\n";
}
$ret = close $fh;
print "close(): $ret\n";
print "\$!=$!, \$^E=$^E\n" unless $ret;

Output is simply:

open(): 1
print(): undef
close(): 1

jkeenan commented 1 year ago

I modified @steve-m-hay's original script to make it more suitable for bisection, then bisected it. I can confirm that the following is the problematic commit:

commit 80c1f1e45e8ef8c27d170fae7ade41971fe20218 (HEAD, refs/bisect/bad)
Author:     Tony Cook <tony@develop-help.com>
AuthorDate: Tue Aug 16 15:52:04 2022 +1000
Commit:     Tony Cook <tony@develop-help.com>
CommitDate: Wed Aug 31 10:51:09 2022 +1000

    only clear the stream error state in readline() for glob()

However, it's not yet clear whether this is a bug (if it is, it should be deemed a blocker for perl-5.38.0). @tonycoz, can you comment?

steve-m-hay commented 1 year ago

It does seem like a bug to me.

If I understand the original problem description correctly, it was that the error status was getting cleared after the successful close() call. It should persist so that errors on the filestream are known after it has been closed, but there was no requirement for the close() call itself to return failure when it is succeeding, and it seems wrong that it now does. As noted above, it is inconsistent with the behaviour of print() on a read-only file, and this is not how C behaves (on my system, at least) either:

The following C program does roughly the equivalent of my original bug report above. Notice that errno (set by the failed attempt at reading on a write-only file descriptor) persists after the _close() call, but the return value from _close() itself is success, as it surely ought to be.

#include <errno.h>
#include <fcntl.h>
#include <io.h>
#include <stdio.h>
#include <sys/stat.h>
int main() {
    char* file = "test.txt";
    int fd;
    char buffer[3] = "WTF";
    unsigned int buffer_size = 3;
    int ret;
    fd = _open(file, _O_WRONLY | _O_CREAT | _O_TRUNC, _S_IWRITE);
    if (fd == -1)
        perror("_open() failed");
    else
        printf("_open() succeeded\n");
    ret = _write(fd, buffer, buffer_size);
    if (ret == -1)
        perror("_write() failed");
    else
        printf("_write(): %d bytes written\n", ret);
    ret = _read(fd, buffer, buffer_size);
    if (ret <= 0)
        perror("_read() failed");
    else
        printf("_read(): %d bytes read\n", ret);
    ret = _close(fd);
    if (ret == -1)
        perror("_close() failed");
    else
        printf("_close() succeeded\n");
    printf("errno=%d\n", errno);
}

Output is:

_open() succeeded
_write(): 3 bytes written
_read() failed: Bad file descriptor
_close() succeeded
errno=9
bram-perl commented 1 year ago

If I understand the original problem description correctly, it was that the error status was getting cleared after the successful close() call.

I believe that is incorrect. The error status was being cleared after a failed read. So by the time close() was called the error was already removed from the stream/handle and thus close() could not report the error.

For a bit more context see:

As for the inconsistency with print(): I don't know if that was considered/noticed before, but one thing to keep in mind with print() is that buffering might be involved, so the error might not be noticed until the buffer is flushed (i.e. when close() is called). [Not sure if that is actually the case; I could be completely wrong about this.]

It's also mentioned in perldelta in the incompatible changes list: https://github.com/Perl/perl5/blob/blead/pod/perldelta.pod#readline-no-longer-clears-the-stream-error-and-eof-flags

... Since the error flag is no longer cleared calling close() on the stream may fail ...

steve-m-hay commented 1 year ago

If I understand the original problem description correctly, it was that the error status was getting cleared after the successful close() call.

I believe that is incorrect. The error status was being cleared after a failed read. So by the time close() was called the error was already removed from the stream/handle and thus close() could not report the error.

Yes, you're correct. The original problem (I'm looking at #20060) doesn't even have a close() call in it. My apologies for the confusion.

But I believe in cases (such as my example) where there is a close() call, we want the error state set by a failed read to persist after the close() call as well, which is what I was trying to say ;-) At least, that's what my example C program does, and I don't see a good reason for perl to behave differently.

For a bit more context see:

As for the inconsistency with print(): I don't know if that was considered/noticed before, but one thing to keep in mind with print() is that buffering might be involved, so the error might not be noticed until the buffer is flushed (i.e. when close() is called). [Not sure if that is actually the case; I could be completely wrong about this.]

It's also mentioned in perldelta in the incompatible changes list: https://github.com/Perl/perl5/blob/blead/pod/perldelta.pod#readline-no-longer-clears-the-stream-error-and-eof-flags

... Since the error flag is no longer cleared calling close() on the stream may fail ...

Thanks. I had missed that entry in perldelta. However, that explanation of the change in the behaviour of close() makes no sense to me. It's obviously a good thing that the bug involving the error status getting cleared after a failed read is now fixed, but I don't see why that fix should have any knock-on effect on the return value of a subsequent successful close() call.

Why should close() fail just because of an earlier error on the stream? My C program shows it is possible for an error state to be set by a failing read, and remain set even after a successful close() call, and for that close() call itself to still return success.

I would not expect close() to return failure unless it has actually failed to close the filehandle (which is not the case here, as I observed in my original report), regardless of any error state set by earlier operations.

bram-perl commented 1 year ago

Why should close() fail just because of an earlier error on the stream? My C program shows it is possible for an error state to be set by a failing read, and remain set even after a successful close() call, and for that close() call itself to still return success.

I would not expect close() to return failure unless it has actually failed to close the filehandle (which is not the case here, as I observed in my original report), regardless of any error state set by earlier operations.

That is/would be inconsistent with print(). :-) Looking at one of my older comments in the PR (https://github.com/Perl/perl5/pull/20103#issuecomment-1219328667), copy/pasting the most relevant part:

Using a tmpfs of 1M:

$ mkdir /tmp/bla
$ mount -o size=1M  -t tmpfs tmpfs /tmp/bla
$ perl -wle 'open my $fh, ">", "/tmp/bla/foo"; print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed"; close $fh or warn "close failed: $!"'
close failed: No space left on device at -e line 1.

$ strace perl -wle 'open my $fh, ">", "/tmp/bla/foo"; print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed"; close $fh or warn "close failed: $!"'
...
write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192
write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192
write(3, "a\n", 2)                      = -1 ENOSPC (No space left on device)
close(3)                                = 0

Note:

  • at the system level: errors reported on the last write, no error on close;
  • at the 'Perl'-level no errors were reported on print, only on close;

steve-m-hay commented 1 year ago

Using a tmpfs of 1M:

$ mkdir /tmp/bla
$ mount -o size=1M  -t tmpfs tmpfs /tmp/bla
$ perl -wle 'open my $fh, ">", "/tmp/bla/foo"; print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed"; close $fh or warn "close failed: $!"'
close failed: No space left on device at -e line 1.

$ strace perl -wle 'open my $fh, ">", "/tmp/bla/foo"; print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed"; close $fh or warn "close failed: $!"'
...
write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192
write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192
write(3, "a\n", 2)                      = -1 ENOSPC (No space left on device)
close(3)                                = 0

Note:

  • at the system level: errors reported on the last write, no error on close;
  • at the 'Perl'-level no errors were reported on print, only on close;

Is the problem there just due to buffering? In my simple example of print() on a read-only filehandle the print() is impossible so immediately has an error. In your example the filehandle is writable, but you're writing a HUGE string to it. Perhaps the write operation is indeed successful as far as perl's print() is concerned, so it's correct to return success, but all it has done is put the data in a buffer which is waiting to be flushed... And then perl's close() call flushes the buffer, so it really is the close() that has the error in this case - in which case it is fair enough for close() to return failure. It has (presumably) closed the filehandle, but failed to flush buffered data to it first. In other words, in this case a system-level write operation (which is what fails due to no space) is actually a part of perl's close() operation, which is why the error comes from close() at the perl level.

Do you get the error reported from print rather than close (at the perl level) if your filehandle is unbuffered?

I still think that in both of my cases (read on a write-only filehandle, and print() on a read-only filehandle) close() should be returning success since the entire close() operation was successful.

bram-perl commented 1 year ago

Do you get the error reported from print rather than close (at the perl level) if your filehandle is unbuffered?

Yes. Using an older Perl (v5.20.2 because that's the system perl and that - obviously - doesn't contain that commit):

$ perl -wle 'open my $fh, ">", "/tmp/bla/foo"; select($fh); $|++; select(STDOUT); print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed: $!"; close $fh or warn "close failed: $!"'
print 2 failed: No space left on device at -e line 1.
close failed: No space left on device at -e line 1.

Both print() and close() failed.

Strace:

write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8191
write(3, "a", 1)                        = -1 ENOSPC (No space left on device)
write(2, "print 2 failed: No space left on"..., 54) = 54
close(3)                                = 0
write(2, "close failed: No space left on d"..., 52) = 52

System write() failed, system close() succeeded.

Note that only one write call failed. So it does not appear that close() attempted to do a write.

I still think that in both of my cases (read on a write-only filehandle, and print() on a read-only filehandle) close() should be returning success since the entire close() operation was successful.

Quoting from perldoc -f close: (emphasis mine)

Closes the file or pipe associated with the filehandle, flushes the IO buffers, and closes the system file descriptor. Returns true if those operations succeed and if no error was reported by any PerlIO layer.

It all depends on how one interprets that last part...

steve-m-hay commented 1 year ago

Do you get the error reported from print rather than close (at the perl level) if your filehandle is unbuffered?

Yes. Using an older Perl (v5.20.2 because that's the system perl and that - obviously - doesn't contain that commit):

$ perl -wle 'open my $fh, ">", "/tmp/bla/foo"; select($fh); $|++; select(STDOUT); print $fh "a" x (2 ** 19) or warn "print 1 failed"; print $fh "a" x (2 ** 19) or warn "print 2 failed: $!"; close $fh or warn "close failed: $!"'
print 2 failed: No space left on device at -e line 1.
close failed: No space left on device at -e line 1.

Both print() and close() failed.

It's good to see that print() now has the failure here, but odd that close() still reports failure too given that we think (below) close() isn't doing any further writing in this case :-/

Strace:

write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8191
write(3, "a", 1)                        = -1 ENOSPC (No space left on device)
write(2, "print 2 failed: No space left on"..., 54) = 54
close(3)                                = 0
write(2, "close failed: No space left on d"..., 52) = 52

System write() failed, system close() succeeded.

Note that only one write call failed. So it does not appear that close() attempted to do a write.

I still think that in both of my cases (read on a write-only filehandle, and print() on a read-only filehandle) close() should be returning success since the entire close() operation was successful.

Quoting from perldoc -f close: (emphasis mine)

Closes the file or pipe associated with the filehandle, flushes the IO buffers, and closes the system file descriptor. Returns true if those operations succeed and if no error was reported by any PerlIO layer.

It all depends on how one interprets that last part...

Yes, that is ambiguous :-/ Is there any way to tell what (if any) error is being reported by a PerlIO layer that close() might see as grounds to return failure even though the rest of its operation has been successful?

At the very least I still don't like the inconsistency that close() (now) fails after a read on a write-only filehandle, but (still) succeeds after a write on a read-only filehandle.

steve-m-hay commented 1 year ago

I tried checking what state $! is in at various points with some extra output:

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '>', $file;
print "open(): $ret\n";
$ret = print $fh "WTF\n";
print "print(): $ret\n";
seek $fh, 0, 0;
{
    # Suppress "Filehandle $fh opened only for output" warning
    no warnings 'io';
    $ret = <$fh>;
    print "<>: " . (defined $ret ? $ret : "undef") . ", \$!=$!\n";
}
print "before close(): \$!=$!\n";
$ret = close $fh;
print "close(): $ret, \$!=$!\n";

The old behaviour was that $! got set by the failed read and persisted thereafter, but close() reported success anyway:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor
before close(): $!=Bad file descriptor
close(): 1, $!=Bad file descriptor

The new behaviour is the same regarding $!, but now close() reports failure:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor
before close(): $!=Bad file descriptor
close(): , $!=Bad file descriptor

I wondered if the reported failure was because of the current error state (but see below...), so I tried clearing the error state by inserting "$! = 0;" immediately before the "before close" output. The old behaviour duly cleared $! and left it cleared, with close() returning success:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor
before close(): $!=
close(): 1, $!=

but I'm surprised to find that the new behaviour re-instates the old failure state of $!; and unsurprisingly close() still reports failure:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor
before close(): $!=
close(): , $!=Bad file descriptor

Is that expected?

Again, the print on a read-only filehandle case behaves differently: Not only does the close() succeed (even though $! is set by the failed print(), so close() failing above is not simply due to the error state being set; it is presumably due to some other error reported by a PerlIO layer according to perldoc quoted previously), but also if I clear that error state then it does not get re-instated in this case:

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '<', $file;
print "open(): $ret\n";
{
    # Suppress "Filehandle $fh opened only for input" warning
    no warnings 'io';
    $ret = print $fh "WTF\n";
    print "print(): " . (defined $ret ? $ret : "undef") . ", \$!=$!\n";
}
print "before close(): \$!=$!\n";
$ret = close $fh;
print "close(): $ret, \$!=$!\n";

The old and new behaviours here are both:

open(): 1
print(): undef, $!=Bad file descriptor
before close(): $!=Bad file descriptor
close(): 1, $!=Bad file descriptor

and with the "$! = 0;" immediately before the "before close" output:

open(): 1
print(): undef, $!=Bad file descriptor
before close(): $!=
close(): 1, $!=

So in summary there also seem to be changes to how $! behaves in some cases (clearing it used to leave it cleared, but now the error state sometimes comes back), which doesn't seem to be covered by the perldelta entry, and isn't directly related to whether close() returns success or failure.

bram-perl commented 1 year ago

Do you get the error reported from print rather than close (at the perl level) if your filehandle is unbuffered? (...) Both print() and close() failed.

It's good to see that print() now has the failure here, but odd that close() still reports failure too given that we think (below) close() isn't doing any further writing in this case :-/

It's something I would expect. I think there are also several examples where print() is used without checking for errors and then a close() where it does check for errors. (I can't really find good examples of that offhand, though, so I could be wrong.)

Yes, that is ambiguous :-/ Is there any way to tell what (if any) error is being reported by a PerlIO layer that close() might see as grounds to return failure even though the rest of its operation has been successful?

I believe this is the relevant code, in doio.c, which basically is:

                const bool prev_err = PerlIO_error(IoIFP(io));
#ifdef USE_PERLIO
                if (prev_err)
                    PerlIO_restore_errno(IoIFP(io));
#endif
                retval = (PerlIO_close(IoIFP(io)) != EOF && !prev_err);

If something set the error flag on the handle (such as after a failing read or write) then it will return an error.

At the very least I still don't like the inconsistency that close() (now) fails after a read on a write-only filehandle, but (still) succeeds after a write on a read-only filehandle.

Agreed, but thinking about it, close() succeeding after a print() on a read-only fh was already an inconsistency on earlier Perls. That is, on older perls:

More about close() succeeding after print on a read-only fh. What is interesting:

strace of print on a read-only fh:

open("test.txt", O_RDONLY|O_CLOEXEC) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc9749eb40) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
write(1, "open(): 1\n", 10) = 10
write(2, "print failed: Bad file descripto"..., 58) = 58
close(3) = 0

There was never a write syscall to the fh.

strace of read on a write-only fh:

open("test.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffcce00fb70) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
write(1, "open(): 1\n", 10) = 10
write(2, "read failed: Bad file descriptor"..., 57) = 57
close(3) = 0
write(2, "close failed: Bad file descripto"..., 58) = 58

There was also no read syscall on the fh. So it must be handled differently somewhere in the code...
bram-perl commented 1 year ago

I tried checking what state $! is in at various points with some extra output:

You should also check the error state of the fh, i.e. $fh->error(). You can clear the error state by using $fh->clearerr().

Setting $! = 0 clears $! but does not clear the error state on $fh.

So in summary there also seem to be changes to how $! behaves in some cases (clearing it used to leave it cleared, but now the error state sometimes comes back), which doesn't seem to be covered by the perldelta entry, and isn't directly related to whether close() returns success or failure.

I haven't checked yet (and I'm running out of time for today) but I suspect that also happened on older perls when print() was used. That is:

steve-m-hay commented 1 year ago

At the very least I still don't like the inconsistency that close() (now) fails after a read on a write-only filehandle, but (still) succeeds after a write on a read-only filehandle.

Agreed, but thinking about it, close() succeeding after a print() on a read-only fh was already an inconsistency on earlier Perls. That is, on older perls:

  • when there is no space left on the device (i.e. disk full) then print() may/will fail (depending on the state of the buffer) and close() will fail
  • on a read-only file: print() fails but close() does not

That's a good point. I'm fixated on the inconsistency in my pair of very similar examples, but as you say there was already inconsistency anyway. The read on a write-only file case (where this all started...) is (now) inconsistent with one of the print() cases (namely, the one that it closely resembles), but in fact it used to be inconsistent with the other print case anyway because those two print cases (which haven't changed behaviour) are and were inconsistent themselves!

From that point of view this could be seen as a storm in a teacup, but it still troubles me that a (worthy) bug fix has had (I think it's fair to say?) unintended side-effects which have caused at least this bug report and possibly others too, and has necessitated an "incompatible changes" note in perldelta (which I failed to notice, and probably won't be alone in doing so).

steve-m-hay commented 1 year ago

I tried checking what state $! is in at various points with some extra output:

You should also check the error state of the fh, i.e. $fh->error(). You can clear the error state by using $fh->clearerr().

Setting $! = 0 clears $! but does not clear the error state on $fh.

Ah, ok. I've updated the test script to call "$fh->clearerr()" instead of "$! = 0" (since clearerr() seems to wipe that out too anyway), and now report error() as well. Also, it now grabs the values of these error states into variables, since print()ing things interferes with $!:

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '>', $file;
print "open(): $ret\n";
$ret = print $fh "WTF\n";
print "print(): $ret\n";
seek $fh, 0, 0;
{
    # Suppress "Filehandle $fh opened only for output" warning
    no warnings 'io';
    $ret = <$fh>;
}
my $errno = $!;
my $error = $fh->error();
print "<>: " . (defined $ret ? $ret : "undef") . ", \$!=$errno, error=$error\n";
# $fh->clearerr();
# $errno = $!;
# $error = $fh->error();
# print "after clearerr(): \$!=$errno, error=$error\n";
$ret = close $fh;
$errno = $!;
$error = $fh->error();
print "close(): $ret, \$!=$errno, error=$error\n";

The old behaviour has $! but no error() after the failed <>:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor, error=0
close(): 1, $!=, error=-1

whereas the new behaviour does have an error():

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor, error=1
close(): , $!=Bad file descriptor, error=-1

I assume that's the bug fix in operation, and the presence of the error() now appears to be what causes close() to fail, as Bram discovered (see https://www.nntp.perl.org/group/perl.perl5.porters/2023/07/msg266597.html).

If I insert a clearerr() after the failed <> by uncommenting the four commented-out lines then the old behaviour is unchanged apart from the new output:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor, error=0
after clearerr(): $!=, error=0
close(): 1, $!=, error=-1

whereas the new behaviour now returns success from the close() as I have been wanting all along:

open(): 1
print(): 1
<>: undef, $!=Bad file descriptor, error=1
after clearerr(): $!=, error=0
close(): 1, $!=, error=-1

So this is the answer I've been looking for, I guess. It is alluded to in the perldelta entry, but I didn't quite take it in even after having been directed to this entry: "If your code encounters an error on the stream while reading with readline() you will need to call $handle->clearerr to continue reading."

Perhaps that text should be tweaked to note that you may also want to call clearerr() after your last read (after dealing with any errors from it appropriately), or else any such read errors may still be hanging around and interfere with the close() call, perhaps making you think the file close operation has failed when it hasn't ;-)

Weirdly, the print on a read-only filehandle case doesn't benefit at all from the clearerr() call:

use strict;
use warnings;
my $file = 'test.txt';
my $ret = open my $fh, '<', $file;
print "open(): $ret\n";
{
    # Suppress "Filehandle $fh opened only for input" warning
    no warnings 'io';
    $ret = print $fh "WTF\n";
}
my $errno = $!;
my $error = $fh->error();
print "print(): " . (defined $ret ? $ret : "undef") . ", \$!=$errno, error=$error\n";
# $fh->clearerr();
# $errno = $!;
# $error = $fh->error();
# print "after clearerr(): \$!=$errno, error=$error\n";
$ret = close $fh;
$errno = $!;
$error = $fh->error();
print "close(): $ret, \$!=$errno, error=$error\n";

Old and new behaviours are both:

open(): 1
print(): undef, $!=Bad file descriptor, error=0
close(): 1, $!=, error=-1

and with the clearerr() lines uncommented they're both:

open(): 1
print(): undef, $!=Bad file descriptor, error=0
after clearerr(): $!=Bad file descriptor, error=0
close(): 1, $!=Bad file descriptor, error=-1

which seems worse, not better! Why does $! not get cleared with clearerr() when it was cleared before?! Maybe it doesn't matter much since close() returns success in all cases here anyway, so probably nobody will go looking (although it now appears that it shouldn't be returning success here! -- this is actually the problem case, which I will probably raise a new GH issue for in due course!).

tonycoz commented 1 year ago
ret = _write(fd, buffer, buffer_size);
    if (ret == -1)
        perror("_write() failed");
    else
        printf("_write(): %d bytes written\n", ret);
    ret = _read(fd, buffer, buffer_size);
    if (ret <= 0)
        perror("_read() failed");
    else
        printf("_read(): %d bytes read\n", ret);

The equivalent to this would be syswrite()/sysread(), which don't set the error flag as far as I'm aware (they do set $!)

Changing that code to use stdio, which is closer to perl's buffered I/O, doesn't result in any errors: fread succeeds and the resulting text file becomes "WTF\0\0\0", which makes me echo the start of that string verbally.

If I add an fflush() (which C requires for mixed I/O on a stream open for update*) between the fwrite() and fread() the read fails and the stream error flag is set, as it is with perl's I/O.

The difference between stdio and perl in this case is the behaviour of closing the file, and that's not a new change.

Why does $! not get cleared with clearerr() when it was cleared before?!

clearerr() only clears the error flag on the stream, it does not modify $!. It does mean that the saved errno won't be restored to $! when close() fails due to an existing error on the stream.

In this sample code the value of $! is simply being preserved - $!/errno is only set on failure, the close() succeeds, so the value of $! is irrelevant. (and there's no readline, so none of it is new behaviour)

From the discussion on list you seem to have come to terms with the current behaviour shipping in 5.38.

* which this isn't, I used fd = fopen(file, "w");, I'd expect an error but the standard is pretty vague about it

steve-m-hay commented 1 year ago
    ret = _write(fd, buffer, buffer_size);
    if (ret == -1)
        perror("_write() failed");
    else
        printf("_write(): %d bytes written\n", ret);
    ret = _read(fd, buffer, buffer_size);
    if (ret <= 0)
        perror("_read() failed");
    else
        printf("_read(): %d bytes read\n", ret);

The equivalent to this would be syswrite()/sysread(), which don't set the error flag as far as I'm aware (they do set $!)

Ah. I did briefly ponder whether to use open() or fopen() in my C program, but went with open() and never looked back, which was silly.

Yes, rewriting the perl program with sysopen/syswrite/sysread I see that no error flag is set, and since I now understand that that is what causes close to fail, it's no surprise that close succeeds.

What's more interesting is to rewrite the C program with fopen/fwrite/fread/fclose, as you did below...

Changing that code to use stdio, which is closer to perl's buffered I/O, doesn't result in any errors: fread succeeds and the resulting text file becomes "WTF\0\0\0", which makes me echo the start of that string verbally.

I see the same thing, which is absolutely bizarre!

If I add an fflush() (which C requires for mixed I/O on a stream open for update*) between the fwrite() and fread() the read fails and the stream error flag is set, as it is with perl's I/O.

I found that adding an fseek() call before the fread() stops the weirdness. It was an omission from my C program; the Perl program did have a seek() call in it.

It's slightly odd that C fread() doesn't set errno (whereas C _read() does!), though it does set the ferror() flag. Perl <> sets $! and error(), and the latter causes Perl's close() to fail unless clearerr() is called first.

The difference between stdio and perl in this case is the behaviour of closing the file, and that's not a new change.

Why does $! not get cleared with clearerr() when it was cleared before?!

clearerr() only clears the error flag on the stream, it does not modify $!. It does mean that the saved errno won't be restored to $! when close() fails due to an existing error on the stream.

In this sample code the value of $! is simply being preserved - $!/errno is only set on failure, the close() succeeds, so the value of $! is irrelevant. (and there's no readline, so none of it is new behaviour)

From the discussion on list you seem to have come to terms with the current behaviour shipping in 5.38.

Yes. I think the key point was the text of commit e199e3bef188fd5eadcca420af186241e2773db1 which Bram found, which I wasn't aware of and which really isn't clear in the documentation of close(). Perhaps that wording should be improved, and maybe also the perldelta entry for this incompatible change -- to refer to that behaviour of close() and note that you may also therefore want to call clearerr() before closing the filehandle (at the moment it only suggests calling clearerr() if you want to carry on reading).

  • which this isn't, I used fd = fopen(file, "w");, I'd expect an error but the standard is pretty vague about it
steve-m-hay commented 1 year ago

Largely for my own sanity, and possibly useful to anyone in the future looking back at this issue, I'm attaching here the four main programs relevant to this discussion: stdio versions in C and Perl (fopen/open) and lowio versions in C and Perl (open/sysopen). The stdio Perl version has the slightly surprising behaviour (compared to its C counterpart) of close() failing because it checks the error flag, which was evidently introduced by commit e199e3bef188fd5eadcca420af186241e2773db1 but isn't currently mentioned in the documentation. The lowio versions have no error flag, so don't have this difference. read_on_wofh.zip

jkeenan commented 11 months ago

There's been no further discussion in this ticket since July. In particular, no further discussion since @tonycoz merged https://github.com/Perl/perl5/pull/21298 on July 26. If @steve-m-hay, @bram-perl or anyone has a reason to keep the ticket open, please speak up; otherwise I'll close it in a few days. Thanks.