onetrueawk / awk

One true awk
Other
1.98k stars 159 forks source link

Fix memory leak in string assignment to built-ins #146

Closed mpinjr closed 2 years ago

mpinjr commented 2 years ago

The new string value is always freeable (strdup'd by tostring), so never mark it as DONTFREE.

$ paste <(./leak.sh master.out) <(./leak.sh fixed.out)
 2564    2464
 7260    2476
11948    2476
16636    2476
21324    2476
26012    2476
30700    2476
35388    2476
40076    2476
44760    2476
49448    2476

$ cat leak.sh
#/bin/sh

awk=${1:-nawk}

# Print the initial resident set size
# Assign each record to FS
# Print rsz every 100,000 records
# Exit after a million

yes abcdefghijklmnopqrstuvwxyz | $awk -v mem='ps -p $PPID -o rsz=' '
    BEGIN            { system(mem) }
                     { FS = $0 }
    NR % 100000 == 0 { system(mem) }
    NR == 1000000    { exit }
'

Every built-in mentioned in setfree is affected. make check completed uneventfully.

Thank you for taking the time to consider this contribution, Miguel

plan9 commented 2 years ago

i'm a bit surprised with all the time spent tracking memory leaks in awk. it was never intended for building long-running programs, and if someone was really interested in doing something as strange as that, there is a good chance they would be working with gnu awk. but i'll review the changes you propose.

arnoldrobbins commented 2 years ago

@plan9 Changes look reasonable. These days people do all kinds of things with awk, and being memory clean is a good idea in general anyway. Note that the *BSD people avoid gawk due to its license, so they end up expecting more out of The One True Awk.