onetrueawk / awk

One true awk
Other
1.98k stars 159 forks source link

Eliminate file management memory leak #142

Closed mpinjr closed 2 years ago

mpinjr commented 2 years ago

stdinit initializes the first three slots of the files array with statically allocated pathnames. closefile considers these slots permanently statically allocated. openfile does not.

This discrepancy can leak memory: if a statically allocated slot is vacated, openfile will put a malloc'd string where closefile does not free.

To reconcile, either openfile must not reuse these slots or stdinit and closefile must switch to dynamic allocation. I chose the latter, because the result is a little simpler and less interconnected than managing a partitioned array.

Here's a (contrived) demo with my system's output:

$ cat leak.sh
#!/bin/sh

# Close stderr, vacating files[2]
# Print the initial resident set size
# For each record, open and close a file named "y"
# Print rsz every 50,000 iterations
# Exit after 250,000 iterations

awk=${1:-nawk}
yes | $awk '
        BEGIN {
    close("/dev/stderr")
    system(rsz = "ps -p $PPID -o rsz=")
    }
    { print > $0; close($0) }
    NR % 50000 == 0 { system(rsz) }
    NR == 250000    { exit }'

# End of leak.sh

$ paste <(./leak.sh ./master) <(./leak.sh ./a.out)
 2444    2428
 4016    2500
 5576    2500
 7140    2500
 8704    2500
10264    2500

make check completed successfully.

The time you've taken to read this is appreciated, Miguel

mpinjr commented 2 years ago

For the history buffs, apparently this bug was introduced about 30 years ago by 10th Edition Research Unix [1], which first prepopulated the files array with /dev/stdin, /dev/stdout, and /dev/stderr. In SVR4, the files array is initially empty and openfile and closefile agree that all pathnames are dynamically allocated [2].

[1] https://github.com/danfuzz/one-true-awk/blob/master/versions/1992-11-28/run.c [2] https://github.com/danfuzz/one-true-awk/blob/master/versions/1989-10-11/run.c