ThomasDickey / original-mawk

bug-reports for mawk (originally on GoogleCode)
http://invisible-island.net/mawk/mawk.html

system("") does not flush output #41

Open rofl0r opened 8 years ago

rofl0r commented 8 years ago

according to the GNU awk manual, awk is required to flush its output when a system() call is executed, and so system("") serves as a neat and portable trick to force a flush of stdout.

http://gnu.huihoo.org/gawk-3.0.3/html_node/gawk_126.html

see "Controlling Output Buffering with system"

ThomasDickey commented 8 years ago

actually, I don't see in the text where gawk says this is "required", but only where it comments that

    fflush is a recent (1994) addition to the Bell Labs research version of
    awk; it is not part of the POSIX standard, and will not be available if
    `--posix' has been specified on the command line (see section Command
    Line Options).

and then

    gawk extends the fflush function in two ways.  The first is to allow
    no argument at all.  In this case, the buffer for the standard output
    is flushed.  The second way is to allow the null string ("") as the
    argument.

rofl0r commented 8 years ago

let me quote the whole paragraph here:

Controlling Output Buffering with system

The fflush function provides explicit control over output buffering for individual files and pipes. However, its use is not portable to many other awk implementations. An alternative method to flush output buffers is by calling system with a null string as its argument:

system("")   # flush output

gawk treats this use of the system function as a special case, and is smart enough not to run a shell (or other command interpreter) with the empty command. Therefore, with gawk, this idiom is not only useful, it is efficient. While this method should work with other awk implementations, it will not necessarily avoid starting an unnecessary shell. (Other implementations may only flush the buffer associated with the standard output, and not necessarily all buffered output.)

If you think about what a programmer expects, it makes sense that system should flush any pending output. The following program:

BEGIN {
     print "first print"
     system("echo system echo")
     print "second print"
}

must print

first print
system echo
second print

and not

system echo
first print
second print

If awk did not flush its buffers before calling system, the latter (undesirable) output is what you would see. 

i'm reading "must print" as a requirement, but you're right in that the POSIX specification[0] does not mention this requirement explicitly. however it really makes sense to implement it that way. thus the system() function should just call fflush(stdout) before executing the syscall. since debian/ubuntu ship mawk as the default awk implementation, i currently need an ugly workaround to get the desired behaviour for all awk implementations in use: https://github.com/sabotage-linux/sabotage/commit/5c90662115f0c4b28df5472f0dd057c34c9f6e43 i hope you agree with my assessment. cheers.

[0] http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html

ThomasDickey commented 8 years ago

mawk does flush all output when doing system. It doesn't provide a special case where system is used as a replacement for flush.
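One way to check this outside a terminal is to send awk's stdout through a pipe, where output is normally fully buffered, and look at the line order. This is only a sketch: the function name system_flush_order is made up for illustration, and the program is the example from the gawk manual quoted above.

```shell
# Hypothetical helper (name chosen for illustration only).
# Piping into cat makes awk's stdout a pipe instead of a tty, so the two
# print statements sit in awk's buffer unless system() flushes first.
system_flush_order() {
    awk 'BEGIN {
        print "first print"
        system("echo system echo")
        print "second print"
    }' | cat
}

system_flush_order
```

If the awk in use flushes before running the command, the three lines come out in program order; if it did not, "system echo" would appear first.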

stephane-chazelas commented 8 years ago

What may be confusing the OP is that mawk buffers its input. In other words, it won't start processing its input until a full buffer has been read. It's the only awk implementation I know of that does that (and it is annoying at times).

In:

$ (echo 1; sleep 2; echo 2) | mawk '1;NR == 1 {system("echo X")}'
1
X
2

You do see the output in the right order (mawk does flush its output), but you have to wait the 2 seconds to get it. That's where it differs from other awk implementations.

rofl0r commented 8 years ago

interesting. i didn't reply yet because i analyzed the source and mawk does indeed flush all fds (including stdout) when calling system(), but still behaves as if it does not, which forces me to pass -W interactive to make mawk behave as expected.

copbint commented 3 years ago

-W interactive saved my day!

Earnestly commented 1 year ago

As a consideration, it might be reasonable to make -W interactive the default if stdout is a terminal/tty. Having to special-case mawk with this flag makes it difficult to use portably.
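For scripts that have to run under whichever awk is installed, the special-casing can be kept in one place. The following is a rough sketch, not a tested recipe: the wrapper name portable_awk is hypothetical, and it assumes that mawk identifies itself in the output of -W version while other implementations either print their own version or ignore the option.

```shell
# Hypothetical wrapper: pass -W interactive only when awk is mawk.
# Assumption: "awk -W version" mentions "mawk" on its first line for mawk.
# stdin comes from /dev/null so the probe never consumes the caller's
# input, even if some other awk treats "version" as a program text.
portable_awk() {
    case "$(awk -W version 2>&1 </dev/null | head -n 1)" in
        *mawk*) awk -W interactive "$@" ;;
        *)      awk "$@" ;;
    esac
}
```

With this in place, the pipeline from the earlier comment can be written as (echo 1; sleep 2; echo 2) | portable_awk '1;NR == 1 {system("echo X")}' and should behave the same way regardless of which awk is behind it.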