Inconsistent block scoping

richardleach commented 4 months ago

Description Given an if/else branch, such as the example below, it seems like a reasonable expectation that both branches behave identically with regard to entering a new scope - or not doing so.

sub foo {
    my $x = $_[0];
    if ($x) {
        return 1
    } else {
        return 0
    }
}

But they don't. The else branch is wrapped in an ENTER/LEAVE pair, but the if branch is not.

5        <;> nextstate(main 3 -e:1) v ->6
-        <1> null K/1 ->-
7           <|> cond_expr(other->8) K/1 ->c
6              <0> padsv[$x:2,9] s ->7
-              <@> scope K ->-
-                 <;> ex-nextstate(main 5 -e:1) v ->8
a                 <@> return K ->b
8                    <0> pushmark s ->9
9                    <$> const[IV 1] s ->a
h              <@> leave KP ->b
c                 <0> enter ->d
d                 <;> nextstate(main 7 -e:1) v ->e
g                 <@> return K ->h
e                    <0> pushmark s ->f
f                    <$> const[IV 0] s ->g

That's not always the behaviour. In the following example, both branches are wrapped in an ENTER/LEAVE pair:

$ perl -MO=Concise,foo -e 'sub foo { $x = $_[0]; if ($x) { local $x = 2 } else { local $x = 3 }}'
main::foo:
...
5        <;> nextstate(main 2 -e:1) v:{ ->6
-        <1> null KP/1 ->-
7           <|> cond_expr(other->8) K/1 ->f
-              <1> ex-rv2sv sK/1 ->7
6                 <#> gvsv[*x] s ->7
d              <@> leave KP ->e
8                 <0> enter ->9
9                 <;> nextstate(main 4 -e:1) v:{ ->a
c                 <2> sassign sKS/2 ->d
a                    <$> const[IV 2] s ->b
-                    <1> ex-rv2sv sKRM*/LVINTRO,1 ->c
b                       <#> gvsv[*x] s/LVINTRO ->c
k              <@> leave KP ->e
f                 <0> enter ->g
g                 <;> nextstate(main 6 -e:1) v:{ ->h
j                 <2> sassign sKS/2 ->k
h                    <$> const[IV 3] s ->i
-                    <1> ex-rv2sv sKRM*/LVINTRO,1 ->j
i                       <#> gvsv[*x] s/LVINTRO ->j
-e syntax OK

The inconsistency seems undesirable because:

If a branch should enter a new scope but does not, buggy behaviour may result
If a branch needlessly enters a new scope, that harms performance

Steps to Reproduce

perl -MO=Concise,foo -e 'sub foo { my $x = $_[0]; if ($x) { return 1 } else { return 0 }}'

Expected behavior Both branches have consistent scope behaviour.

Perl configuration Standard blead, v5.36

richardleach commented 4 months ago

The difference first traces back to _Perl_opscope. Within that function, dumping OP *o for the true case block shows OPf_PARENS set on the lineseq OP:

1    lineseq LISTOP(0x5564808c46c0) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,PARENS,SLABBED)
     |   
2    +--nextstate COP(0x5564808c4700) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   SEQ = 4294967252
     |   
3    +--return LISTOP(0x5564808c4760) ===> [0x0]
         FLAGS = (UNKNOWN,KIDS,SLABBED)
         |   
4        +--pushmark OP(0x5564808c47a0) ===> [SELF]
         |   FLAGS = (SCALAR,SLABBED,MORESIB)
         |   
5        +--const SVOP(0x5564808c47d0) ===> [SELF]
             FLAGS = (SCALAR,SLABBED)
             SV = IV(0)

For the else block, OPf_PARENS is not set on the lineseq OP:

6    lineseq LISTOP(0x5564808c4808) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,SLABBED)
     |   
7    +--nextstate COP(0x5564808c4848) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   SEQ = 4294967250
     |   
8    +--return LISTOP(0x5564808c48a8) ===> [0x0]
         FLAGS = (UNKNOWN,KIDS,SLABBED)
         |   
9        +--pushmark OP(0x5564808c48e8) ===> [SELF]
         |   FLAGS = (SCALAR,SLABBED,MORESIB)
         |   
10       +--const SVOP(0x5564808c4918) ===> [SELF]
             FLAGS = (SCALAR,SLABBED)
             SV = IV(1)

The presence or absence of OPf_PARENS here controls whether o is wrapped in enter/leave, or the lineseq OP is converted into a scope OP.

richardleach commented 4 months ago

The callers to _Perl_opscope are:

Perl_yyparse (gramtype=gramtype@entry=258) at /home/rich/github/perl5/perly.y:702

/* else and elsif blocks */
else
        :       empty
        |       KW_ELSE mblock
                        {
                          ($mblock)->op_flags |= OPf_PARENS;
                          $$ = op_scope($mblock);
                        }

Perl_yyparse (gramtype=gramtype@entry=258) at /home/rich/github/perl5/perly.y:466

        |       KW_IF PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else
                        {
                          $$ = block_end($remember,  
                              newCONDOP(0, $mexpr, op_scope($mblock), $else));
                          parser->copline = (line_t)$KW_IF;
                        }

I have no understanding of the parser logic and don't know why ($mblock)->op_flags |= OPf_PARENS; isn't present in both locations.

richardleach commented 4 months ago

Here's a simple example demonstrating this:

package foo;
sub new {
    return bless {}, 'foo';
}

sub DESTROY {
    $main::status = "Finished";
}

package main;
for (0..1) {
    $status = "Started";
    print $status, "\n";
    if ($_) {
        my $magic = foo->new();
    } else {
        my $magic = foo->new();
    }
    print "  $status\n";
}
print "$status\n";

A developer reasoning about this Perl code might expect the output to be:

Started
  Finished
Started
  Finished
Finished

whereas it actually outputs this:

Started
  Finished
Started
  Started
Finished

because the destructor isn't called at the expected time when the true branch of the if statement is followed.

Perl / perl5

Inconsistent block scoping #22204