Perl / perl5

open unexpectedly creates temporary files with undefined variables sometimes #22385

Closed mauke closed 1 month ago

mauke commented 2 months ago

Description

perldoc -f open:

As a special case the three-argument form with a read/write mode and the third argument being undef:

open(my $tmp, "+>", undef) or die ...

opens a filehandle to a newly created empty anonymous temporary file. (This happens under any mode, which makes +> the only useful and sensible mode to use.) You will need to seek to do the reading.
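For reference, the documented (intended) use looks roughly like the following minimal sketch. The sample text and variable names are only illustrative; the point is the seek between writing and reading that the documentation mentions.

```perl
use strict;
use warnings;

# A literal undef as the third argument requests an anonymous temporary file.
open(my $tmp, '+>', undef) or die "Cannot create temporary file: $!";

# Write some illustrative data; the file position ends up after it.
print {$tmp} "first line\nsecond line\n";

# Rewind to the start before reading anything back.
seek($tmp, 0, 0) or die "Cannot seek: $!";

my @lines = <$tmp>;
close($tmp) or die "Cannot close temporary file: $!";

print scalar(@lines), " lines read back\n";
```

Without the seek, the read would start at the end of the file and return nothing.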

This feature was added in 5.8.0 and is intended to only trigger when a literal undef is passed as the filename. It does not happen if a variable whose contents are undef is used:

$ perl -we 'open(my $fh, "+>", my $bogus) or die $!'
Use of uninitialized value $bogus in open at -e line 1.
No such file or directory at -e line 1.
$ perl -we 'my @bogus = undef; open(my $fh, "+>", $bogus[0]) or die $!'
Use of uninitialized value $bogus[0] in open at -e line 1.
No such file or directory at -e line 1.
$ perl -we 'my %bogus = (a => undef); open(my $fh, "+>", $bogus{a}) or die $!'
Use of uninitialized value $bogus{"a"} in open at -e line 1.
No such file or directory at -e line 1.

Steps to Reproduce

However, if you use an array or hash element that doesn't exist at all (as opposed to existing but containing undef), the special behavior kicks in and open silently succeeds, creating a temp file:

$ perl -we 'my @bogus; open(my $fh, "+>", $bogus[0]) or die $!'
$ perl -we 'my %bogus; open(my $fh, "+>", $bogus{a}) or die $!'
$ perl -we 'open(my $fh, "+>", $ARGV[0]) or die $!'
$ 
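The distinction between "doesn't exist" and "exists but contains undef" is visible from Perl code as well. A small sketch with two hypothetical hashes (%absent and %present are made-up names) shows that both lookups are undefined and only exists tells them apart:

```perl
use strict;
use warnings;

my %absent;                     # no key "a" at all
my %present = (a => undef);     # key "a" exists, its value is undef

# Both lookups produce an undefined value ...
my $absent_defined  = defined $absent{a}  ? 1 : 0;
my $present_defined = defined $present{a} ? 1 : 0;

# ... but only one of the keys actually exists.
my $absent_exists  = exists $absent{a}  ? 1 : 0;
my $present_exists = exists $present{a} ? 1 : 0;

printf "absent:  defined=%d exists=%d\n", $absent_defined,  $absent_exists;
printf "present: defined=%d exists=%d\n", $present_defined, $present_exists;
```

The reproduction cases above correspond to the %absent situation; the earlier failing cases correspond to %present.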

Even this behavior is not entirely consistent, however. If you use a runtime call to &CORE::open, it starts failing again (but with duplicate warning messages for some reason):

$ perl -we 'my @bogus; &CORE::open(my $fh, "+>", $bogus[0]) or die $!'
Use of uninitialized value in open at -e line 1.
Use of uninitialized value in open at -e line 1.
No such file or directory at -e line 1.
$ perl -we 'my %bogus; &CORE::open(my $fh, "+>", $bogus{a}) or die $!'
Use of uninitialized value in open at -e line 1.
Use of uninitialized value in open at -e line 1.
No such file or directory at -e line 1.

Expected behavior

A call to open that does not pass a literal undef, but an element of an array or hash, should complain about the undefined value and fail. Basically like the &CORE::open version, but there I'd expect to see only one copy of the warning.
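Independently of how open itself behaves, calling code can refuse undefined filenames explicitly. The following is only a sketch of such a guard: open_named is a hypothetical helper, not part of Perl or of any module, and @args stands in for something like an empty @ARGV.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: opens a named file and never reaches open
# with an undefined filename.
sub open_named {
    my ($mode, $path) = @_;    # copies the arguments into fresh variables
    croak "open_named: filename is undefined" unless defined $path;
    open(my $fh, $mode, $path)
        or croak "open_named: cannot open '$path' with mode '$mode': $!";
    return $fh;
}

my @args;    # empty, like @ARGV when no arguments were given

# The element does not exist, so the helper dies instead of opening anything.
my $fh = eval { open_named('+>', $args[0]) };
print defined $fh ? "opened\n" : "refused: $@";
```

The explicit defined check makes the failure deliberate rather than dependent on whether open sees a variable or a missing element.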

Perl configuration

Summary of my perl5 (revision 5 version 40 subversion 0) configuration:

  Platform:
    osname=linux
    osvers=6.5.0-10040-tuxedo
    archname=x86_64-linux-thread-multi-ld
    uname='linux luum 6.5.0-10040-tuxedo #44 smp preempt_dynamic wed may 8 17:36:39 utc 2024 x86_64 x86_64 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/mauke/perl5/perlbrew/perls/perl-5.40.0 -Dcc=cgcc -Dman1dir=none -Dman3dir=none -Dusethreads -Duselongdouble -Aoptimize=-flto -Aeval:scriptdir=/home/mauke/perl5/perlbrew/perls/perl-5.40.0/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=define
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cgcc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2 -flto'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='11.4.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='long double'
    nvsize=16
    Off_t='off_t'
    lseeksize=8
    alignbytes=16
    prototype=define
  Linker and Libraries:
    ld='cgcc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.35'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -flto -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_LONG_DOUBLE
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under linux
  Compiled at Jun  9 2024 23:04:17
  %ENV:
    PERLBREW_BASHRC_VERSION="0.74"
    PERLBREW_HOME="/home/mauke/.perlbrew"
    PERLBREW_MANPATH="/home/mauke/perl5/perlbrew/perls/perl-5.40.0/man"
    PERLBREW_PATH="/home/mauke/perl5/perlbrew/bin:/home/mauke/perl5/perlbrew/perls/perl-5.40.0/bin"
    PERLBREW_PERL="perl-5.40.0"
    PERLBREW_ROOT="/home/mauke/perl5/perlbrew"
    PERLBREW_VERSION="0.94"
    PERLDOC="-oman"
    PERL_UNICODE="SAL"
  @INC:
    /home/mauke/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0/x86_64-linux-thread-multi-ld
    /home/mauke/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0
    /home/mauke/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0/x86_64-linux-thread-multi-ld
    /home/mauke/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0

jkeenan commented 2 months ago

Could you provide some TODO-ed unit tests that will currently fail but will pass if this problem is corrected? Thanks.

mauke commented 1 month ago

Like this? https://github.com/Perl/perl5/compare/blead...mauke:perl5:open-undef-todo-tests

tonycoz commented 1 month ago

pp_aelem and pp_aelemfast return &PL_sv_undef for non-lvalue array lookups, so the check PerlIO_openn() does for undef succeeds.

For the CORE::open() call:

I'm not fond of this kind of magical "only-this-SV"-ness, though making it work for any undefined SV might cause other problems (e.g. code that expects the open to fail for generic undef values).

I don't see a simple fix.

mauke commented 1 month ago

I have a couple of (ignorance-based¹) possible solutions:

  1. &CORE::open gets it right, so make open also put its third argument in lvalue(-ish) context.
  2. Have open introspect its op tree at runtime to see whether the &PL_sv_undef came from a literal undef arg.
  3. Have the parser recognize open(..., ..., undef) and emit a different op (pp_open_tmpfile?). Remove the special handling from regular open.
  4. Like (3), but have a checker rewrite the op.
  5. Like (3) or (4), but have it set a special flag on the open; don't define a separate op. In open, only recognize &PL_sv_undef as a temp file marker if that flag is set.

¹ meaning I don't know how to actually implement them, which always makes it easier to say "just do X!"

tonycoz commented 1 month ago

While CORE::open() doesn't display this misbehaviour for the subscripting ops, it does display it for other OPs that return &PL_sv_undef:

$ perl -we 'my $x; CORE::open(my $fh, "+>", delete $x{unknown}) or die $!'
Name "main::x" used only once: possible typo at -e line 1.

I considered some of these options.

Realize that both the normal open() op and CORE::open() end up being handled by pp_open, and the OP_OPEN generated by coresub_op() generally has default flags, so it wouldn't have an OPpSEEN_SV_UNDEF private flag. This applies to the alternate OP too.

We could invert that flag (call it OPpNO_SV_UNDEF_SEEN) and only do the temp file handling when that flag is missing, but it won't fix the delete case above.

Also the undef handling is actually done by PerlIO, so if pp_open decides not to do the temp file handling it will need to replace the &PL_sv_undef with a sv_newmortal() (or non-mortal once pp_open is converted to PERL_RC_STACK).

So not simple, and doesn't fully fix it anyway.