Closed joemcmahon closed 1 year ago
I can confirm the weird behavior of the l
command in the debugger reported by the OP. However, I'm not familiar enough with lib/perl5db.pl
to locate the operative code. Can anyone else comment? @shlomif, perhaps? @tonycoz? E. Choroba?
I spent some time looking at this last week, the regexp for matching a range is overly permissive, and there's some strange processing of the matched values I don't understand.
This week I've been sick and haven't been able to follow-up on it.
The permissive regexp and the strange processing have been there long before @shlomif 's re-org of the debugger.
I will take a look this week if Tony can't get to it. I think I remember where it is from back when I wrote the comments.
On Thu, Aug 17, 2023 at 4:57 PM Tony Cook @.***> wrote:
I spent some time looking at this last week, the regexp for matching a range is overly permissive, and there's some strange processing of the matched values I don't understand.
This week I've been sick and haven't been able to follow-up on it.
The permissive regexp and the strange processing have been there long before @shlomif https://github.com/shlomif 's re-org of the debugger.
— Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/21350#issuecomment-1683130766, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAO6GY44MQR6AN632GMIYTXV2VWRANCNFSM6AAAAAA3JAZEZY . You are receiving this because you authored the thread.Message ID: @.***>
hi all.
No promises, but I also would like to investigate this issue
We should be able to get it between us!
On Thu, Aug 17, 2023 at 11:27 PM Shlomi Fish @.***> wrote:
hi all.
No promises, but I also would like to investigate this issue
— Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/21350#issuecomment-1683416152, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAO6G7K4PLF3X3VSPREQHDXV4DOLANCNFSM6AAAAAA3JAZEZY . You are receiving this because you authored the thread.Message ID: @.***>
Have fun, I'll get pinged on updates to this ticket.
In order to write a test assertion, I/we need to know what is the desirable behaviour. I support rejecting the inputs with an error. "explicit is better than implicit"
That seems reasonable.
Agreed.
On Sun, Aug 20, 2023 at 6:43 PM Tony Cook @.***> wrote:
That seems reasonable.
— Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/21350#issuecomment-1685494742, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAO6G6QLXRCL7LB6D2GNWLXWK4NNANCNFSM6AAAAAA3JAZEZY . You are receiving this because you authored the thread.Message ID: @.***>
Okay, finally working on this, and the bug is caused by the regex that tries to extract the list range, and is being too clever; it's trying to parse a couple different things at once, and messing up because of that.
2804 # l start-stop or l start,stop
2805 elsif (my ($s, $e) = $spec =~ /^(?:(-?[\d\$\.]+)(?:[-,]([\d\$\.]+))?)?/ ) {
2806 return _cmd_l_range($spec, $line, $s, $e);
2807 }
This is trying to handle l $ref
and the ranges for sure; it's also what's accepting the dots in the line numbers. It's also accepting lots of other things that are undocumented and don't work (like -12
, and $ref-2
and 10 -2
and 10 -12
and ..............2
). It's also what's accepting l 15.7
and l 127.0.0.1
. Update: this also rolls in l -
.
Last, the l $ref
command doesn't really work at all.
I am going to break this match up into several different matches and match the pure line-number behavior documented in perldoc perldebug
first; we can then figure out the bug that keeps l $ref
from working.
Okay, finally working on this, and the bug is caused by the regex that tries to extract the list range, and is being too clever; it's trying to parse a couple different things at once, and messing up because of that.
Joe, thanks for plugging away on this. Could you identify the relevant line numbers in the codebase so that others can look at them as well?
Thanks.
Sure, I'll edit the previous comment with that.
Scanning up from the regex in question at line 2805, several of these cases are already handled and shouldn't be matched here at all:
2780 sub _cmd_l_main {
2781 my $spec = shift;
2782
2783 # If this is '-something', delete any spaces after the dash.
2784 $spec =~ s/\A-\s*\z/-/;
2785
2786 # If the line is '$something', assume this is a scalar containing a
2787 # line number.
2788 # Set up for DB::eval() - evaluate in *user* context.
2789 if ( my ($var_name) = $spec =~ /\A(\$.*)/s ) {
2790 return _cmd_l_handle_var_name($var_name);
2791 }
2792 # l name. Try to find a sub by that name.
2793 elsif ( ($subname) = $spec =~ /\A([\':A-Za-z_][\':\w]*(?:\[.*\])?)/s ) {
2794 return _cmd_l_handle_subname();
2795 }
2796 # Bare 'l' command.
2797 elsif ( $spec !~ /\S/ ) {
2798 return _cmd_l_empty();
2799 }
2800 # l [start]+number_of_lines
2801 elsif ( my ($new_start, $new_incr) = $spec =~ /\A(\d*)\+(\d*)\z/ ) {
2802 return _cmd_l_plus($new_start, $new_incr);
2803 }
2804 # l start-stop or l start,stop
2805 elsif (my ($s, $e) = $spec =~ /^(?:(-?[\d\$\.]+)(?:[-,]([\d\$\.]+))?)?/ ) {
2806 return _cmd_l_range($spec, $line, $s, $e);
2807 }
2808
2809 return;
2810 } ## end sub _cmd_l_main
$name
, and that shouldn't be in 2805 at all.l -nnnn
, (and maybe l -nnnn, mmmm
) which is I think only valid with an attached editor. This is undocumented...and does nothing.l -
lists the whole file. Undocumented, and also goofy..
in a line number at all and I think that's flat-out erroneous...unless this is more editor support? It's possible that editor support is supposed to handle l -10,.
and things like it, but a) this is undocumented and b) including it in this regex is allowing the goofy line numbers, so if it is allowed, it should have its own handler instead.
-
(previous window) does come through 2805, but only after being transformed to l nnnn+
, so matching ^-
in 2805 is useless unless it's, again, some arcane thing for editor support..
altogether; I can't see what it is supposed to be doing -- I will post to p5p and see if there's anyone who knows.But the line 2805 regex needs to be simplified to (probably) \A(\d+)(?:[-,](\d+))?)?\z
, which is l nnnn
, l nnnn-mmmm
, and l nnnn,mmmm
. I'll test that, obviously, but the idea is that there shouldn't be any other special case in there at all: just simple line numbers, properly parsed.
I dug back through the commit history, and at dad0832b48 (Feb 2002), lines 905 to 912 show us that a line number of .
(just one dot) is supposed to be "the current line".
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 905) $cmd =~ /^\.$/ && do {
1d06cb2d31f lib/perl5db.pl (Ilya Zakharevich 1996-12-26 14:54:34 -0500 906) $incr = -1; # for backward motion.
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 907) $start = $line;
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 908) $filename = $filename_ini;
8ebc5c0145d lib/perl5db.pl (Perl 5 Porters 1997-01-04 17:44:00 +1200 909) *dbline = $main::{'_<' . $filename};
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 910) $max = $#dbline;
f1583d8f9d0 lib/perl5db.pl (Ilya Zakharevich 2001-05-18 23:49:09 -0400 911) print_lineinfo($position);
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 912) next CMD };
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 913) $cmd =~ /^w\b\s*(\d*)$/ && do {
So these should be acceptable range specs:
l .
to list the current linel .,100
to list from "here" to line 100l 18,.
to list from line 18 to hereAnd so on. So that regex should have a case for matching .
and nothing else. Continuing to dig back into the commit history.
At that same commit, I can see that the same regex we've been using is still there. No special casing for anything other than .
.
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 932) $cmd =~ /^l\b\s*((-?[\d\$\.]+)([-,]([\d\$\.]+))?)?/ && do {
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 933) $end = (!defined $2) ? $max : ($4 ? $4 : $2);
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 934) $end = $max if $end > $max;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 935) $i = $2;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 936) $i = $line if $i eq '.';
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 937) $i = 1 if $i < 1;
1d06cb2d31f lib/perl5db.pl (Ilya Zakharevich 1996-12-26 14:54:34 -0500 938) $incr = $end - $i;
055fd3a96a4 lib/perl5db.pl (Gurusamy Sarathy 2000-03-14 05:49:08 +0000 939) if ($slave_editor) {
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 940) print $OUT "\032\032$filename:$i:0\n";
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 941) $i = $end;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 942) } else {
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 943) for (; $i <= $end; $i++) {
1b4321cf375 lib/perl5db.pl (Ilya Zakharevich 2001-07-02 02:27:22 -0400 944) my ($stop,$action);
04e43a21088 lib/perl5db.pl (Daniel S. Lewart 2001-05-22 21:18:03 -0500 945) ($stop,$action) = split(/\0/, $dbline{$i}) if
04e43a21088 lib/perl5db.pl (Daniel S. Lewart 2001-05-22 21:18:03 -0500 946) $dbline{$i};
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 947) $arrow = ($i==$line
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 948) and $filename eq $filename_ini)
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 949) ? '==>'
36477c247f3 lib/perl5db.pl (Perl 5 Porters 1996-12-06 18:56:00 +1200 950) : ($dbline[$i]+0 ? ':' : ' ') ;
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 951) $arrow .= 'b' if $stop;
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 952) $arrow .= 'a' if $action;
54d04a52ebe lib/perl5db.pl (Ilya Zakharevich 1996-02-06 15:32:09 -0500 953) print $OUT "$i$arrow\t", $dbline[$i];
65c9c81d8dc lib/perl5db.pl (Ilya Zakharevich 1998-10-27 20:23:27 -0500 954) $i++, last if $signal;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 955) }
65c9c81d8dc lib/perl5db.pl (Ilya Zakharevich 1998-10-27 20:23:27 -0500 956) print $OUT "\n" unless $dbline[$i-1] =~ /\n$/;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 957) }
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 958) $start = $i; # remember in case they want more
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 959) $start = $max if $start > $max;
d338d6fe1df lib/perl5db.pl (Perl 5 Porters 1996-01-11 22:23:31 +0000 960) next CMD; };
I'm going to jump back to before Ilya's commit that added this line and see if there was more complex code for parsing the regex results earlier.
Larry's March 1991 commit contains the same regex without the leading -
, but again, no special handling other than .
. I begin to think that the regex was coded to catch all the cases that they wanted to eventually handle, but there was a dropped stitch and the special-er cases ended up being parsed elsewhere.
I think the only special case that is actually supposed to be there is .
, and everything else was just cargo-culted forward.
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 260) $cmd =~ /^l\b\s*(([\d\$\.]+)([-,]([\d\$\.]+))?)?/ && do {
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 261) $end = (!$2) ? $max : ($4 ? $4 : $2);
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 262) $end = $max if $end > $max;
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 263) $i = $2;
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 264) $i = $line if $i eq '.';
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 265) $i = 1 if $i < 1;
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 266) if ($emacs) {
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 267) print OUT "\032\032$filename:$i:0\n";
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 268) $i = $end;
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 269) } else {
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 270) for (; $i <= $end; $i++) {
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 271) print OUT "$i:\t", $dbline[$i];
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 272) last if $signal;
f0fcb552910 lib/perldb.pl (Larry Wall 1991-11-05 06:28:06 +0000 273) }
a687059cbaf lib/perldb.pl (Larry Wall 1989-10-18 00:00:00 +0000 274) }
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 275) $start = $i; # remember in case they want more
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 276) $start = $max if $start > $max;
fe14fcc35f7 lib/perldb.pl (Larry Wall 1991-03-21 00:00:00 +0000 277) next CMD; };
This is the earliest commit where the debugger actually exists, so I think we've dug back as far as we can go.
Pattern cleaned up and passing current tests; $ was also removed.
Going to add a few more tests to verify that the formerly-accepted but invalid strings are now rejected -- there's a catchall that print "Invalid line range specification" at the end. Should have a PR tomorrow.
https://github.com/Perl/perl5/pull/21488 fixes the egregious behavior. We still have some issues that are outside the scope of this bug, and I'll open new issues for those. They're related to UTF-8 and are a lot more challenging.
Fixed by #21488
Module: perl5db.pl
Description
Accidentally entering a line range with one or more periods on a list command causes the debugger to produce anomalous line numbers
Steps to Reproduce
perl -d
any script.l 1.10
.l
command.l 1.2.3
, the debugger will list line 1 as line 1.2.3.l 127.0.0.1
will list line 127 as127.0.0.1 ...
.l
commands will not show decimal points, so the logic is detecting that floating point numbers with no decimals can be displayed as integersExpected behavior
The debugger should either reject numbers containing periods on a list command; automatically truncate them to integers (probably the best option); or interpret the first
.
as a comma, then truncate anything after a second.
, and print the resulting range of lines.Perl configuration