Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.95k stars 554 forks source link

Using signatures makes the debugger harder to use #22602

Open Ovid opened 1 month ago

Ovid commented 1 month ago

Description

When using the debugger (perl -d $program), you can use the s command to step through lines of code, one at a time. However, this is not true for subroutine signatures. When you step through one of these subs, you have to step through the signature at N+1 times, where N is the number of variables in the signature. These extra steps are not obvious, because nothing visible is happening, and they can be confusing. More than once I've accidentally stepped over the first line of a sub/method because I hit "return" too many times on a signatures.

This makes using the already cumbersome debugger CLI even more cumbersone.

It does not affect subroutines without signatures, even if they have a prototype (since the latter is compile-time, not run time).

This appears to affect all versions of Perl which allow signatures, whether by default, or when using the experimental signatures feature.

Steps to Reproduce

Run the following program with perl -d:

#!/usr/bin/env perl

use v5.40.0;

sub no_signature { # we will not step into this line
    say "No signature was used.";
}

sub with_prototype :prototype($$;$) { # we will not step into this line
    say "This is a subroutine with a prototype.";
}

sub no_args() { # we will step into this line 1 time
    say "No arguments were passed.";
}

sub one_arg($one) { # we will step into this line 2 times
    say "One argument was passed: $one";
}

sub two_args ( $one, $two ) { # we will step into this line 3 times
    say "Two arguments were passed: $one and $two";
}

sub three_args ( $one, $two, $three ) { # we will step into this line 4 times
    say "Three arguments were passed: $one, $two, and $three";
}

sub four_args ( $one, $two, $three, $four ) { # we will step into this line 5 times
    say
      "Four arguments were passed: $one, $two, $three, and $four";
}

no_signature();
with_prototype( 'one', 'two' );
no_args();
one_arg('one');
two_args( 'one', 'two' );
three_args( 'one', 'two', 'three' );
four_args( 'one', 'two', 'three', 'four' );

Expected behavior

When running the above program with the debugger, when you try to step into the first sub, you immediately get into the body of the sub:

5   sub no_signature {
6==>        say "No signature was used.";
7   }

I expect the debugger to never stop on the signature line, or at minimum, to only stop once (because we might need to see the defaults).

However, for all subs with signatures, you must repeat the s (or just hit "return" to reuse the last step command) N+1 times for every variable in the signature, even though it's not obvious to the programmer that runtime behavior is occurring here:

9==>    sub empty_signature () {
10:     say "Empty signature was used.";
11  }

However, you can verify runtime behavior if you print out the value of those variables after every step:

main::(subs.pl:43): two_args( 'one', 'two' );
  DB<1> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
  DB<1> x [$one,$two]
0  ARRAY(0x15385afe0)
   0  undef
   1  undef
  DB<2> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
  DB<2> x [$one,$two]
0  ARRAY(0x153074888)
   0  undef
   1  undef
  DB<3> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
  DB<3> x [$one,$two]
0  ARRAY(0x1530748b8)
   0  'one'
   1  undef
  DB<4> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
  DB<4> x [$one,$two]
0  ARRAY(0x15385b3e8)
   0  'one'
   1  'two'
  DB<5> s
main::two_args(subs.pl:26):     say "Two arguments were passed: $one and $two";

That being said, I've never felt the need to dump signature variables while they're being individually assigned.

Perl configuration

Summary of my perl5 (revision 5 version 40 subversion 0) configuration:

  Platform:
    osname=darwin
    osvers=23.5.0
    archname=darwin-2level
    uname='darwin ovids-m1-laptop.local 23.5.0 darwin kernel version 23.5.0: wed may 1 20:12:58 pdt 2024; root:xnu-10063.121.3~5release_arm64_t6000 arm64 '
    config_args='-de -Dprefix=/Users/ovid/perl5/perlbrew/perls/perl-5.40.0 -Aeval:scriptdir=/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
    optimize='-O3'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
    ccversion=''
    gccversion='Apple LLVM 15.0.0 (clang-1500.3.9.4)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=8
    longdblkind=0
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=14.5 -fstack-protector-strong -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /opt/local/lib /usr/lib
    libs=-lgdbm
    perllibs=
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=14.5 -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under darwin
  Compiled at Jul  2 2024 09:26:02
  %ENV:
    PERL5LIB="/Users/ovid/perlcustom/lib"
    PERLBREW_HOME="/Users/ovid/.perlbrew"
    PERLBREW_MANPATH="/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/man"
    PERLBREW_PATH="/Users/ovid/perl5/perlbrew/bin:/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/bin"
    PERLBREW_PERL="perl-5.40.0"
    PERLBREW_ROOT="/Users/ovid/perl5/perlbrew"
    PERLBREW_SHELLRC_VERSION="0.98"
    PERLBREW_VERSION="0.98"
  @INC:
    /Users/ovid/perlcustom/lib
    /Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0/darwin-2level
    /Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0
    /Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0/darwin-2level
    /Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0
tonycoz commented 1 month ago

state ops (nextstate, dbstate) are generated for each argument, which is what the debugger uses for stepping:

$ ./perl -Ilib -MO=Concise,f -e 'use v5.40.0; sub f ($x, $y) { print $x, $y }'
main::f:
a  <1> leavesub[1 ref] K/REFC,1 ->(end)
-     <@> lineseq KP ->a
-        <1> ex-argcheck vK/1 ->7
-           <@> lineseq vK ->-
1              <;> nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->2
2              <+> argcheck(2,0) v ->3
3              <;> nextstate(main 4 -e:1) v:us,*,&,$,fea=8 ->4
4              <+> argelem(0)[$x:4,6] v/SV ->5
5              <;> nextstate(main 5 -e:1) v:us,*,&,$,fea=8 ->6
6              <+> argelem(1)[$y:5,6] v/SV ->7
-              <;> ex-nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->-
7        <;> nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->8
9        <@> print sK ->a
8           <0> padrange[$x:4,6; $y:5,6] /range=2 ->9
-           <0> padsv[$x:4,6] s ->-
-           <0> padsv[$y:5,6] s ->9
-e syntax OK

These appear to be deliberate, both sigscalarelem and sigslurpsigil in perly.y both add state ops.

@iabyn might know why.

iabyn commented 1 month ago

On Tue, Sep 17, 2024 at 06:53:04PM -0700, Tony Cook wrote:

state ops (nextstate, dbstate) are generated for each argument, which is what the debugger uses for stepping:

These appear to be deliberate, both sigscalarelem and sigslurpsigil in perly.y both add state ops.

@iabyn might know why.

The original signature implementation (largely by Zefram) just planted a lot of standard perl ops to process signatures as if the signature handling had actually been written in perl. So for example:

sub f ($a, $b = 1) {}

deparsed as:

sub f { die sprintf("Too many arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @ <= 2; die sprintf("Too few arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @ >= 1; my $a = $[0]; my $b = @ >= 2 ? $_[1] : 1;

In that implementation, each item of parameter processing was separated by a nextstate op, which (among other things) ensured that the taint state was reset after each assignment (so one tainted arg wouldn't taint subsequent parameter values), and line numbers for warnings etc would be correct(ish).

When I added the sigFOO ops, I kept the nextstate ops: mainly in an attempt to preserve current behaviour as much as possible. I always intended that such sigFOO ops would be an intermediate step, and they would all be replaced by a single OP_SIGNATURE op in the peephole optimiser which would (via a mini state machine) do all the sig processing in a single op, with any taint resetting etc done by the OP_SIGNATURE, with no OP_NEXTSTATEs left.

I don't think I ever considered how the debugger would be affected.

-- Standards (n). Battle insignia or tribal totems.

leonerd commented 1 month ago

While I was looking at how to implement named params, I had vague thoughts in mind for neatening up the signature things overall; much as Dave suggests there. In the longterm I don't imagine there'll be as many OP_NEXTSTATE (or DBSTATE) ops involved.

In the short-term, I wonder if we could try simply not emitting so many of them. Since conceptually every defaulting expression for optional params does execute like a statement, those ought to be retained, but there's probably no need to have one on the OP_ARGCHECK nor the mandatory params without a defaulting expression.

Failing that, there's likely a spare private bit on the OP_DBSTATE that we can use to signal to the debugger not to count that one, so as to preserve the tree-shaped structure for code that wants to inspect the optree, but allowing the debugger to skip over those.

tonycoz commented 1 month ago

There was some discussion in IRC covering this earlier: