Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.85k stars 527 forks source link

duplicate labels in the same block should be errors; duplicate labels in the branches of a conditional are not independent #22088

Open djerius opened 1 month ago

djerius commented 1 month ago

Description

Two issues, one documentation, one behavior.

Documentation: perlsyn doesn't document the scope of labels and what happens if a label is redefined.

Behavior

Here's what happens if the loop labels are in the same scope:

#! /usr/bin/env perl
use v5.10;

my $x = 0;
my $LOOP;

LOOP: {
    say defined $LOOP
      ? "goto $LOOP, got LOOP 1"
      : 'Passing through LOOP 1';
}

$LOOP = 'LOOP 1';
goto LOOP if $x++ == 0;

undef $LOOP;
LOOP: {
    say defined $LOOP
      ? "goto $LOOP, got LOOP 2"
      : 'Passing through LOOP 2';
}
$LOOP = 'LOOP 2';
goto LOOP if $x++ == 2;
$ perl docbug.pl
Passing through LOOP 1
goto LOOP 1, got LOOP 1
Passing through LOOP 2
goto LOOP 2, got LOOP 1
Passing through LOOP 2

Looks like the repeated label is ignored. That's unexpected. It should be an error if one shouldn't repeat labels.

What if they are in separate blocks?

#! /usr/bin/env perl
use v5.10;

my $x = 0;
my $LOOP;

{
  LOOP: {
        say defined $LOOP
          ? "goto $LOOP, got LOOP 1"
          : 'Passing through LOOP 1';
    }

    $LOOP = 'LOOP 1';
    goto LOOP if $x++ == 0;
}

undef $LOOP;

{
  LOOP: {
        say defined $LOOP
          ? "goto $LOOP, got LOOP 2"
          : 'Passing through LOOP 2';
    }
    $LOOP = 'LOOP 2';
    goto LOOP if $x++ == 2;
}
$ perl docbug.pl
Passing through LOOP 1
goto LOOP 1, got LOOP 1
Passing through LOOP 2
goto LOOP 2, got LOOP 2

Looks like they are treated as separate labels, which is what I'd expect.

What if they are declared in blocks in a conditional?

#! /usr/bin/env perl
use v5.10;

for my $cond ( 0 .. 1 ) {
    my $x = 0;
    my $LOOP;

    if ( 1 - $cond ) {
      LOOP: {
            say defined $LOOP
              ? "goto $LOOP, got LOOP 1"
              : 'Passing through LOOP 1';
            $LOOP = 'LOOP 1';
            goto LOOP if $x++ == 0;
        }
    }

    elsif ( $cond ) {
      LOOP: {
            say defined $LOOP
              ? "goto $LOOP, got LOOP 2"
              : 'Passing through LOOP 2';
            $LOOP = 'LOOP 2';
            goto LOOP if $x++ == 0;
        }
    }
}
$ perl docbugif.pl
Passing through LOOP 1
goto LOOP 1, got LOOP 1
Passing through LOOP 2
goto LOOP 2, got LOOP 1

Oops; the second one is ignored, even though it's in a new block. That is definitely not what I expected. I expected this situation to mirror the second example.

Note that if the first branch of the conditional is optimized out (see below), the first label isn't defined, and the second branch sees its own label. This means the optimization is changing the behavior of code outside of the optimized out part, which shouldn't happen. I think the above behavior is a bug.

#! /usr/bin/env perl
use v5.10;
use strict;
use warnings;

for my $cond ( 0 .. 1 ) {
    my $x = 0;
    my $LOOP;

    if ( 0 ) {
      LOOP: {
            say defined $LOOP
              ? "goto $LOOP, got LOOP 1"
              : 'Passing through LOOP 1';
            $LOOP = 'LOOP 1';
            goto LOOP if $x++ == 0;
        }
    }

    elsif ( $cond ) {
      LOOP: {
            say defined $LOOP
              ? "goto $LOOP, got LOOP 2"
              : 'Passing through LOOP 2';
            $LOOP = 'LOOP 2';
            goto LOOP if $x++ == 0;
        }
    }
}
$ perl docbugif.pl
Passing through LOOP 2
goto LOOP 2, got LOOP 2

Expected behavior

Perl configuration

Summary of my perl5 (revision 5 version 38 subversion 2) configuration:

  Platform:
    osname=linux
    osvers=6.1.0-16-amd64
    archname=x86_64-linux
    uname='linux leafhopper 6.1.0-16-amd64 #1 smp preempt_dynamic debian 6.1.67-1 (2023-12-12) x86_64 gnulinux '
    config_args='-Dprefix=/home/dj/.plenv/versions/5.38.2 -de -Dversiononly -A'eval:scriptdir=/home/dj/.plenv/versions/5.38.2/bin''
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2'
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='12.2.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.36'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under linux
  Compiled at Jan 18 2024 17:41:48
  %ENV:
    PERL_CPANM_OPT="--cascade-search --mirror-only  --mirror /home/dj/Work/darkpan --mirror http://www.cpan.org"
    PERL_CPAN_MIRROR_TINY_BASE="/home/dj/Work/darkpan"
    PERL_DARKPAN="/home/dj/Work/darkpan"
  @INC:
    /home/dj/.plenv/versions/5.38.2/lib/perl5/site_perl/5.38.2/x86_64-linux
    /home/dj/.plenv/versions/5.38.2/lib/perl5/site_perl/5.38.2
    /home/dj/.plenv/versions/5.38.2/lib/perl5/5.38.2/x86_64-linux
    /home/dj/.plenv/versions/5.38.2/lib/perl5/5.38.2
tonycoz commented 1 month ago

Duplicate labels in the same scope are commonly used in tests:

SKIP:
{
   skip "I can't test this", 1 if $cant_test_this;
  ...
}
SKIP:
{
   skip "I can't test that", 1 if $cant_test_that;
  ...
}

so we can't just warn on duplicate labels in the same scope.

This means the optimization is changing the behavior of code outside of the optimized out part, which shouldn't happen.

This is documented in perlsyn and perlfunc:

... It also can't be used to go into a construct that is optimized away. ...

Of course, there's no requirement that the label is defined in the function containing the goto:

tony@venus:.../git/perl6$ cat ../22088b.pl
use v5.38;
sub f {
  say "f";
  goto X;
}

f();

goto Y;

X: { say "X1"; }
goto Y;

X: { say "X2"; }

Y:

{
  f();
 X: say "X3";
}
tony@venus:.../git/perl6$ ./perl -Ilib ../22088b.pl 
f
X1
f
X3

I do think how this is resolved could be better documented.

djerius commented 1 month ago

Duplicate labels in the same scope are commonly used in tests:

SKIP:
{
   skip "I can't test this", 1 if $cant_test_this;
  ...
}
SKIP:
{
   skip "I can't test that", 1 if $cant_test_that;
  ...
}

so we can't just warn on duplicate labels in the same scope.

Darn. Is there a way to differentiate between goto and the loop control statements that must be used within the block they address (skip uses last under the hood)? It'd have to be a run-time error, as goto can take an expression; maybe a flag on a label indicating it's multiply defined so that when goto uses it it knows it's degenerate.

This is documented in perlsyn and perlfunc:

... It also can't be used to go into a construct that is optimized away. ...

Well, so much for my reading comprehension. Thanks for pointing that bit out, somehow I glossed over that. Since there's no official "guide to what gets optimized away" perhaps that should be written in bold and prefaced with "Here be dragons".

One documentation tweak might be to suggest use of redo rather than goto when jumping to the beginning of a labeled block from within it. I've modified my code to do that, so the multiple definition issue is resolved.