onetrueawk / awk


Fix error handling in closefile and closeall #139

Closed mpinjr closed 2 years ago

mpinjr commented 2 years ago

printstat and awkprintf are very clear: print statement errors are fatal.

In Jan 2020, to prevent fatal print errors from masquerading as fclose warnings, #63 changed every WARNING in closefile and closeall to FATAL. This broke awk's close and getline functions.

close no longer returns if there's an error, unless the stream doesn't exist.

getline read errors still return -1, but they are no longer ignorable. Eventually, one of the closing functions will inspect the stream with ferror and call FATAL.
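The reason a mere read error later turns fatal is that a stream's error indicator is sticky: once set, ferror keeps returning nonzero until clearerr, so a close routine that inspects the stream sees the old failure. A small demonstration of the stickiness (not awk's code; Linux/glibc-specific, since it forces a read error by opening a directory, which opens fine but fails every read with EISDIR):

```c
#include <stdio.h>

/* A getline-style read error sets the stream's error indicator, and
 * ferror() keeps reporting it until clearerr().  Linux/glibc trick:
 * fopen("/", "r") succeeds, but every read on it fails.             */
int read_error_is_sticky(void)
{
    FILE *fp = fopen("/", "r");
    if (fp == NULL)
        return -1;
    int c = fgetc(fp);                      /* fails (EISDIR), returns EOF */
    int sticky = (c == EOF && ferror(fp));  /* flag persists after the read */
    clearerr(fp);
    int cleared = !ferror(fp);              /* clearerr() is the only reset */
    fclose(fp);
    return sticky && cleared;               /* 1 when both behaviors hold */
}
```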

In Jul 2020, fatal stdout write errors, which closefile had been able to detect for a few months, became invisible again: #89 switched the standard streams from fclose (which reports flush errors) to freopen (which ignores them). The earlier changes that broke getline and close were thus themselves partially broken.
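The fclose/freopen difference comes straight from the C standard: fclose reports the failure of its implicit flush, while freopen is required to ignore any failure when closing the old file, and its return value only reflects the new open. A Linux-specific sketch (it relies on /dev/full, whose writes always fail with ENOSPC):

```c
#include <stdio.h>

/* Returns 1 when fclose() reports the lost data but freopen() does not.
 * Linux-specific: /dev/full fails every write with ENOSPC.            */
int freopen_hides_flush_error(void)
{
    FILE *a = fopen("/dev/full", "w");
    if (a == NULL)
        return -1;
    fputc('x', a);                              /* buffered; no error yet */
    int fclose_reported = (fclose(a) == EOF);   /* final flush fails: EOF */

    FILE *b = fopen("/dev/full", "w");
    if (b == NULL)
        return -1;
    fputc('x', b);                              /* buffered; no error yet */
    /* freopen closes b internally, ignoring errors per C11, and its
     * return value only says whether the *new* open succeeded.        */
    FILE *nb = freopen("/dev/null", "w", b);
    int freopen_reported = (nb == NULL);
    if (nb != NULL)
        fclose(nb);

    return fclose_reported && !freopen_reported;
}
```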

The solution is to finish printing before closing. That is, flush and check ferror on every stream opened for writing before calling fclose, pclose, or freopen. A failure to write print statement data is fatal. A failure to close a flushed stream is a warning. The two must be handled separately.
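The separation can be sketched as a single helper (hypothetical names, not the actual patch): flush first, record whether print data was lost, and only then close, so a failed close of a clean stream is distinguishable from a failed write.

```c
#include <stdio.h>

/* Sketch of the flush-before-close separation (hypothetical helper,
 * not awk's actual code).  Return values:
 *    0  flushed and closed cleanly
 *    1  close failed on a clean stream   -> caller should only warn
 *   -1  flush failed, print data is lost -> caller should FATAL     */
int finish_then_close(FILE *fp)
{
    int write_failed = 0;

    /* Finish printing first: force buffered data out and check the
     * sticky error indicator before the stream goes away.           */
    if (fflush(fp) == EOF || ferror(fp))
        write_failed = 1;

    if (fclose(fp) == EOF && !write_failed)
        return 1;        /* the data made it; only the close failed */

    return write_failed ? -1 : 0;
}
```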

Every redirected print statement is finished in printstat or awkprintf.

The same is not true of unredirected print statements. To finish these, stdout must be flushed at some point after the final such statement. Any problem with that flush is fatal.

Though only stdout needs it, let's defensively finish every stream opened for writing, so this bug won't recur if someone changes how redirected streams are flushed.

Write errors on stderr by the implementation are never fatal. When closing, we only warn of them. Write errors from an application attempting a redirected print to /dev/stderr are as immediately fatal as every other redirected print statement.

These changes compiled cleanly and I didn't see any errors in make check.

Below, I've included some test results along with the test script.

Each test is an awk one-liner sabotaged with strace.

The first line of a test case report consists of the syscall that was sabotaged and its program. Then, for each executable tested, there's a line with its return value and path. If this line begins with an asterisk, the executable failed the test.

Tests are paired, with one closing explicitly (closefile) and the other implicitly (closeall).

AWK Versions Tested

pr63~1.out:     Parent of pull request 63 (2976507)
pr63.out:       Pull request 63 (fed1a56)
pr89+3days.out: Move to bison (07f0438), 3 days after pr89 
master.out:     Current master (01749f0)
a.out:          Current master with my commits
# TEST RESULTS
#-----------------------------------------------------------------------
# Sabotaging the closing of a healthy stream shouldn't be fatal

close   'BEGIN {getline < "/dev/null"}'
 0  ./pr63~1.out
*2  ./pr63.out
*2  ./pr89+3days.out
*2  ./master.out
 0  ./a.out

close   'BEGIN {getline < "/dev/null"; close("/dev/null")}'
 0  ./pr63~1.out
*2  ./pr63.out
*2  ./pr89+3days.out
*2  ./master.out
 0  ./a.out

#-----------------------------------------------------------------------
# Closing a stream after a getline read error shouldn't be fatal

read    'BEGIN {getline < "/dev/null"}'
 0  ./pr63~1.out
*2  ./pr63.out
*2  ./pr89+3days.out
*2  ./master.out
 0  ./a.out

read    'BEGIN {getline < "/dev/null"; close("/dev/null")}'
 0  ./pr63~1.out
*2  ./pr63.out
*2  ./pr89+3days.out
*2  ./master.out
 0  ./a.out

#-----------------------------------------------------------------------
# Redirected print statement errors should always be fatal

write   'BEGIN {print > "/dev/null"}'
 2  ./pr63~1.out
 2  ./pr63.out
 2  ./pr89+3days.out
 2  ./master.out
 2  ./a.out

write   'BEGIN {print > "/dev/null"; close("/dev/null")}'
 2  ./pr63~1.out
 2  ./pr63.out
 2  ./pr89+3days.out
 2  ./master.out
 2  ./a.out

#-----------------------------------------------------------------------
# Unredirected print statement errors should always be fatal

write   'BEGIN {print}' > /dev/null
*0  ./pr63~1.out
 2  ./pr63.out
 2  ./pr89+3days.out
 2  ./master.out
 2  ./a.out

write   'BEGIN {print; close("/dev/stdout")}' > /dev/null
*0  ./pr63~1.out
 2  ./pr63.out
*0  ./pr89+3days.out
*0  ./master.out
 2  ./a.out

That test output was produced with this command and script:

$ ./ioerr-tests.sh ./pr63~1.out ./pr63.out ./pr89+3days.out ./master.out ./a.out 2>/dev/null
#!/bin/sh

# A test is an awk one-liner sabotaged with strace.

# TEST SPECIFICATION FORMAT
#
# ----------------------------------------------------------------
# expected-return-value <TAB> sabotaged-syscalls <TAB> awk-program
# ----------------------------------------------------------------
#
# expected-return-value:
# 0 - awk-program should return 0.
# 1 - awk-program should return non-zero.
#
# sabotaged-syscalls:
# A comma-delimited list to pass to strace. An attempt to use
# these syscalls on /dev/null will fail.
#
# awk-program:
# A quoted string, optionally followed by file operands and sh
# redirections.
#
# Empty lines and lines beginning with '#' are echoed.

# TEST OUTPUT FORMAT
#
# -----------------------------------------------------------
# sabotaged-syscalls <TAB> awk-program
# error-flag actual-return-value <TAB> path-to-awk-executable
# -----------------------------------------------------------
#
# A leading '*' indicates a defiant return value.

# Tests are paired, with one closing explicitly (closefile) and the
# other implicitly (closeall).

exec <<'END_OF_TESTS'
#-----------------------------------------------------------------------
# Sabotaging the closing of a healthy stream shouldn't be fatal

0   close   'BEGIN {getline < "/dev/null"}'
0   close   'BEGIN {getline < "/dev/null"; close("/dev/null")}'

#-----------------------------------------------------------------------
# Closing a stream after a getline read error shouldn't be fatal

0   read    'BEGIN {getline < "/dev/null"}'
0   read    'BEGIN {getline < "/dev/null"; close("/dev/null")}'

#-----------------------------------------------------------------------
# Redirected print statement errors should always be fatal

1   write   'BEGIN {print > "/dev/null"}'
1   write   'BEGIN {print > "/dev/null"; close("/dev/null")}'

#-----------------------------------------------------------------------
# Unredirected print statement errors should always be fatal

1   write   'BEGIN {print}' > /dev/null
1   write   'BEGIN {print; close("/dev/stdout")}' > /dev/null
END_OF_TESTS

while IFS=' ' read -r return01 syscalls awkprog; do
    # echo comments and empty lines
    if [ -z "${return01%%#*}" ]; then
        echo "$return01"
        continue
    fi

    echo "$syscalls $awkprog"
    for awk in "$@"; do
        eval strace -o /dev/null -P /dev/null -e fault=\$syscalls \
            \$awk "$awkprog"
        return=$?
        r01=$return
        [ $r01 -ne 0 ] && r01=1
        err=' '
        [ $r01 -ne $return01 ] && err='*'
        echo "$err$return   $awk"
    done
    echo
done

Lastly, here's how the behavior of the test supplied in #63 has changed over the past couple of years. I've paired it with an explicitly closing counterpart. The correct answer for both is 2. Command line, script, and results follow.

$ ./sigpipe-tests.sh ./pr63~1.out ./pr63.out ./pr89+3days.out ./master.out ./a.out 2>/dev/null
#!/bin/sh

# From https://github.com/onetrueawk/awk/pull/63

exec 3>&1
trap '' PIPE
for awk in "$@"; do
    echo $awk
    { $awk 'BEGIN { print "hi"; }'; echo "E $?" >&3; } | :
    { $awk 'BEGIN { print "hi"; close("/dev/stdout")}'; echo "E $?" >&3; } | :
done
./pr63~1.out
E 0
E 0
./pr63.out
E 2
E 2
./pr89+3days.out
E 2
E 0
./master.out
E 2
E 0
./a.out
E 2
E 2

If you made it all the way down here, I really appreciate it.

Take care, Miguel

plan9 commented 2 years ago

Very interesting, thank you for the extensive tests.

arnoldrobbins commented 2 years ago

This patch is a big step in the right direction. The only thing I don't like is that you fflush all files, including those that are open for input, which is a meaningless operation. I don't know what effect that has on ferror. I suggest that both functions be more careful to only flush output files and pipes. Thanks.

mpinjr commented 2 years ago

I suggest that both functions be more careful to only flush output files and pipes.

Agreed. I'll make the changes, rerun the tests, and push (sometime in the next few hours).
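For reference, the mode check would look roughly like this (a sketch with hypothetical names, not the actual patch; awk's real stream table differs). fflush on an input-only stream is undefined in ISO C (POSIX defines it only for seekable input streams), so it is skipped:

```c
#include <stdio.h>

/* Hypothetical table entry; awk's real structure differs. */
struct stream_entry {
    FILE *fp;
    const char *mode;   /* "r", "w", "a", "r+", ... as passed to fopen */
};

/* Flush only streams that may have pending output.  fflush() on an
 * input-only stream is undefined per ISO C, so skip those entirely. */
int flush_if_output(struct stream_entry *e)
{
    if (e->mode[0] == 'r' && e->mode[1] != '+')
        return 0;                    /* input-only: nothing to flush */
    if (fflush(e->fp) == EOF || ferror(e->fp))
        return -1;                   /* pending print data was lost */
    return 0;
}
```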