onetrueawk / awk

One true awk
Other
1.98k stars 159 forks source link

Fix two use-after-free bugs when freeing a symbol table. #157

Closed millert closed 2 years ago

millert commented 2 years ago

This introduces freesymtab2() which frees all elements of the symbol table except y. The bugs can be reproduces with the following input under valgrind or address sanitizer:

'BEGIN { unireghf() } function unireghf(hfeed) { hfeed[1] = 0 }'

'BEGIN { a[1]="a b"; print split(a[1],a),a[1],a[2] }'
mpinjr commented 2 years ago

(I deleted my incorrect explanation of the consequences of using freesymtab2 in split.)

I think the correct fix for split is to call tempfree earlier. Once we have the string, the cell is no longer needed. If it came from the array which split will delete, it's no longer a problem.

diff --git a/run.c b/run.c
index df616fc..7ea6590 100644
--- a/run.c
+++ b/run.c
@@ -1266,6 +1266,7 @@ Cell *split(Node **a, int nnn)    /* split(a[0], a[1], a[2]); a[3] is type */

        y = execute(a[0]);      /* source string */
        origs = s = strdup(getsval(y));
+       tempfree(y);
        arg3type = ptoi(a[3]);
        if (a[2] == NULL)               /* fs string */
                fs = getsval(fsloc);
@@ -1386,7 +1387,6 @@ Cell *split(Node **a, int nnn)    /* split(a[0], a[1], a[2]); a[3] is type */
                }
        }
        tempfree(ap);
-       tempfree(y);
        xfree(origs);
        xfree(origfs);
        x = gettemp();

Regarding the use-after-free that you detected in the function epilogue, I don't think freesymtab2 is appropriate there either. I am nearly done with a diff which avoids double-freeing and use-after-free while significantly simplifying both the prologue (setting up the function's argument list before execution) and the epilogue (updating any uninitialized cells from the caller which were passed by value but during execution became arrays). The diff already passes the test suite and valgrind no longer reports any invalid reads from use-after-free. I'm taking my time with it because of the many subtleties and gotchas in this area (the test suite has been invaluable).

I will submit pull requests within the week for this and my regexpr alternative (if the issue is still pending).

Take care, Miguel

millert commented 2 years ago

Calling tempfree(y) earlier does seem like the better approach. I'm closing this PR for now in favor of your changes.

millert commented 11 months ago

@mpinjr Did you ever finish your prologue and epilogue changes? It looks like the function epilogue use-after-free bug still exists.

mpinjr commented 11 months ago

Hi @millert,

I was exploring alternative solutions when spare time became scarce, but I probably have one or two branches that pass the test suite. I have a busy weekend ahead of me, but I will reacquaint myself with that work tomorrow (Fri) and resume work on it on Monday. I'll have something by next Friday (for real, cross my heart).

Take care, Miguel

millert commented 11 months ago

@mpinjr Thanks! I appreciate you spending time on this.

mpinjr commented 11 months ago

Appreciate your patience, Todd.