aureliojargas / sedsed

Debugger and code formatter for sed scripts
https://aurelio.net/projects/sedsed/
GNU General Public License v3.0

Bug: last address `//` state is lost in this script #15

Open aureliojargas opened 10 years ago

aureliojargas commented 10 years ago

This sed script is taken from the BSD sed test-suite:

```
$ cat last-pattern-bug.sed
p
s/e/X/p
:x
s//Y/p     ;# here sedsed fails to save the // state
/f/ b x
$ echo 'eeefff' | sed -f last-pattern-bug.sed
eeefff
Xeefff
XYefff
XYeYff
XYeYYf
XYeYYY
XYeYYY
```

This GNU sed execution shows the expected behavior.

On the first pass, `s/e/X/p` changes the first `e` to `X`, and `s//Y/p` (reusing `e`) changes the second `e` to `Y`. Then the loop starts: `/f/ b x` branches back to `:x` for as long as an `f` is found. Inside this loop, `s//Y/p` matches one `f` and changes it to `Y`.

Note that the `s` pattern is empty, meaning that the last regex used at run time is reused. Inside the loop, that is `f`, from the loop conditional `/f/ b x`. Here lies the sedsed problem, demonstrated below: under sedsed, the last matched pattern is `e` instead of `f`, so `s//Y/p` stops matching while `/f/` keeps succeeding, causing an infinite loop:

```
$ echo 'eeefff' | sedsed --debug -f last-pattern-bug.sed | head -n 30
PATT:eeefff$
HOLD:$
COMM:p
eeefff
PATT:eeefff$
HOLD:$
COMM:s/e/X/p
Xeefff
PATT:Xeefff$
HOLD:$
COMM::x
COMM:s//Y/p
XYefff
PATT:XYefff$
HOLD:$
COMM:/f/ b x
COMM:s//Y/p
XYYfff
PATT:XYYfff$
HOLD:$
COMM:/f/ b x
COMM:s//Y/p
PATT:XYYfff$
HOLD:$
COMM:/f/ b x
COMM:s//Y/p
$
```
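The rule the script depends on can be checked in isolation. This is a minimal sketch, assuming a POSIX-style `sed` on the PATH; the variable name `reused` is only for illustration. The address `/f/` is the last regex used at run time, so the empty regex in `s//Y/` reuses it:

```shell
# /f/ is evaluated first, so it becomes the "last regex".
# The empty pattern in s//Y/ then reuses it and replaces the first f.
reused=$(printf 'eeefff\n' | sed -n '/f/ s//Y/p')
echo "$reused"   # prints: eeeYff
```

Nothing in the `s` command itself says which regex will be used; it depends entirely on what ran before it.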

It's a tricky problem to solve.

Currently, sedsed adds extra s/// debug commands that alter the "last matched pattern" state:

```
$ sedsed --dump-debug -e 's/foo/bar/'
        s/^/PATT:/
        l
        s/^PATT://
        x
        s/^/HOLD:/
        l
        s/^HOLD://
        x
        i\
COMM:s/foo/bar/
#--------------------------------------------------
s/foo/bar/
        s/^/PATT:/
        l
        s/^PATT://
        x
        s/^/HOLD:/
        l
        s/^HOLD://
        x
```
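The interference can be reproduced without sedsed. The sketch below is a hand-written reduction, not sedsed's actual generated code: it places a `PATT:`-style pair of substitutions between an address and an empty-regex `s`, and compares the result with the plain script. The variable names `plain` and `wrapped` are illustrative.

```shell
# Without debug commands: s//Y/ reuses /f/ and prints the changed line.
plain=$(printf 'eeefff\n' | sed -n '/f/ { s//Y/p; }')

# With a PATT:-style wrapper in between, the last regex becomes ^PATT:,
# which no longer matches the restored pattern space, so nothing prints.
wrapped=$(printf 'eeefff\n' | sed -n '/f/ { s/^/PATT:/; s/^PATT://; s//Y/p; }')

echo "plain:   [$plain]"     # prints: plain:   [eeeYff]
echo "wrapped: [$wrapped]"   # prints: wrapped: []
```

The pattern space is identical before `s//Y/p` in both runs; only the hidden "last regex" state differs.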

Solution A: do not use s/// commands for the debug messages. But how to show the pattern space contents with a PATT: prefix without using s?

Solution B: reset the last matched address before every command with a bogus (and harmless) command such as /address/ { ; }. BTW, I'm already doing something similar to preserve the "last substitution status" used by the t command.
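As a rough sketch of what Solution B would look like, assuming GNU sed (which accepts the empty block `{ ; }`) and assuming the regex to restore is known in advance, a bogus `/f/ { ; }` placed after the debug substitutions makes `f` the last regex again. The variable name `restored` is illustrative.

```shell
# The debug-style substitutions leave ^PATT: as the last regex.
# The bogus "/f/ { ; }" re-evaluates /f/ and does nothing else,
# so the following s//Y/ reuses f instead of ^PATT:.
restored=$(printf 'eeefff\n' |
  sed -n '/f/ { s/^/PATT:/; s/^PATT://; /f/ { ; }; s//Y/p; }')
echo "$restored"   # prints: eeeYff
```

This only works here because `f` is hard-coded in the reset; the approach depends on knowing which regex was last used.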

Solution C: any ideas?

aureliojargas commented 10 years ago

Solution B is not correct. The last matched address is unknown before execution. In this very example, it changes once the loop begins: first it is `e`, then `f`. Being a preprocessor, sedsed can't figure that out.

aureliojargas commented 5 years ago

It seems the same bug affects test/gnused/recall.sed. It's skipped in the run script.