katef / libfsm

DFA regular expression library & friends
BSD 2-Clause "Simplified" License

libre leaks memory if certain errors are encountered during parsing #251

Open sfstewman opened 4 years ago

sfstewman commented 4 years ago

When the sid parser encounters an error, it does not appear to free any partially constructed pieces of the AST.

I found this working with cvtpcre with ASAN enabled, but it's easily triggered with re if ASAN is enabled:

% ./build/bin/re -r pcre '^\ca\cA\c[;\c:'
/^\ca\cA\c[;\c:/:2: Syntax error: unsupported operator

=================================================================
==21525==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892dc6 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10ddc6)
    #1 0x559d6fbd13b4 in ast_make_expr_alt src/libre/ast.c:419
    #2 0x559d6fbb475f in p_expr src/libre/parser.act:671
    #3 0x559d6fbb89e7 in p_re__pcre src/libre/dialect/pcre/parser.c:3326
    #4 0x559d6fbbb27f in parse_re_pcre src/libre/parser.act:909
    #5 0x559d6fbcdeba in re_parse src/libre/re.c:111
    #6 0x559d6fbce156 in re_comp src/libre/re.c:154
    #7 0x559d6fb20de5 in main src/re/main.c:688
    #8 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892dc6 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10ddc6)
    #1 0x559d6fbd0fd6 in ast_make_expr_concat src/libre/ast.c:371
    #2 0x559d6fbba396 in p_expr_C_Calt src/libre/parser.act:664
    #3 0x559d6fbb7716 in p_expr_C_Clist_Hof_Halts src/libre/dialect/pcre/parser.c:3116
    #4 0x559d6fbb478d in p_expr src/libre/dialect/pcre/parser.c:2558
    #5 0x559d6fbb89e7 in p_re__pcre src/libre/dialect/pcre/parser.c:3326
    #6 0x559d6fbbb27f in parse_re_pcre src/libre/parser.act:909
    #7 0x559d6fbcdeba in re_parse src/libre/re.c:111
    #8 0x559d6fbce156 in re_comp src/libre/re.c:154
    #9 0x559d6fb20de5 in main src/re/main.c:688
    #10 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x559d6fb66dd0 in f_malloc src/adt/alloc.c:40
    #2 0x559d6fb4edd5 in fsm_new src/libfsm/fsm.c:51
    #3 0x559d6fb20bb1 in main src/re/main.c:653
    #4 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Indirect leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x559d6fb66dd0 in f_malloc src/adt/alloc.c:40
    #2 0x559d6fb4eedc in fsm_new src/libfsm/fsm.c:60
    #3 0x559d6fb20bb1 in main src/re/main.c:653
    #4 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Indirect leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892dc6 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10ddc6)
    #1 0x559d6fbd1ea4 in ast_make_expr_anchor src/libre/ast.c:545
    #2 0x559d6fbb9d14 in p_expr_C_Cpiece_C_Catom src/libre/parser.act:727
    #3 0x559d6fbb3eed in p_expr_C_Cpiece src/libre/dialect/pcre/parser.c:2475
    #4 0x559d6fba9658 in p_expr_C_Clist_Hof_Hpieces src/libre/dialect/pcre/parser.c:953
    #5 0x559d6fbba3c4 in p_expr_C_Calt src/libre/dialect/pcre/parser.c:3685
    #6 0x559d6fbb7716 in p_expr_C_Clist_Hof_Halts src/libre/dialect/pcre/parser.c:3116
    #7 0x559d6fbb478d in p_expr src/libre/dialect/pcre/parser.c:2558
    #8 0x559d6fbb89e7 in p_re__pcre src/libre/dialect/pcre/parser.c:3326
    #9 0x559d6fbbb27f in parse_re_pcre src/libre/parser.act:909
    #10 0x559d6fbcdeba in re_parse src/libre/re.c:111
    #11 0x559d6fbce156 in re_comp src/libre/re.c:154
    #12 0x559d6fb20de5 in main src/re/main.c:688
    #13 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x559d6fbd10d7 in ast_make_expr_concat src/libre/ast.c:381
    #2 0x559d6fbba396 in p_expr_C_Calt src/libre/parser.act:664
    #3 0x559d6fbb7716 in p_expr_C_Clist_Hof_Halts src/libre/dialect/pcre/parser.c:3116
    #4 0x559d6fbb478d in p_expr src/libre/dialect/pcre/parser.c:2558
    #5 0x559d6fbb89e7 in p_re__pcre src/libre/dialect/pcre/parser.c:3326
    #6 0x559d6fbbb27f in parse_re_pcre src/libre/parser.act:909
    #7 0x559d6fbcdeba in re_parse src/libre/re.c:111
    #8 0x559d6fbce156 in re_comp src/libre/re.c:154
    #9 0x559d6fb20de5 in main src/re/main.c:688
    #10 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f44a8892bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x559d6fbd14b5 in ast_make_expr_alt src/libre/ast.c:429
    #2 0x559d6fbb475f in p_expr src/libre/parser.act:671
    #3 0x559d6fbb89e7 in p_re__pcre src/libre/dialect/pcre/parser.c:3326
    #4 0x559d6fbbb27f in parse_re_pcre src/libre/parser.act:909
    #5 0x559d6fbcdeba in re_parse src/libre/re.c:111
    #6 0x559d6fbce156 in re_comp src/libre/re.c:154
    #7 0x559d6fb20de5 in main src/re/main.c:688
    #8 0x7f44a85ba0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

SUMMARY: AddressSanitizer: 4512 byte(s) leaked in 7 allocation(s).

Except for the 4096-byte struct fsm allocation, all of the allocations are done within the sid parser, and all are partial allocations of the

sfstewman commented 4 years ago

I'm not familiar enough with sid's error handling to know the best way to fix this.

One approach that essentially bypasses sid would be to create an "arena" for allocating AST nodes at each parse. The arena would essentially be a bump allocator. On an error or a call to ast_free, every object in the arena would be freed.
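A minimal sketch of the arena idea, not the code in any PR: nodes are carved out of fixed-size chunks and everything is released in one call. `struct node`, `struct node_pool`, `CHUNK_NODES` and the `pool_*` functions are hypothetical names for illustration, and the real `struct ast_expr` is different. Passing the pool as an explicit handle is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an AST node; the real struct ast_expr differs. */
struct node {
    int type;
    struct node *left, *right;
};

#define CHUNK_NODES 64

/* One fixed-size block of nodes; chunks are chained so the pool can grow. */
struct chunk {
    struct chunk *next;
    size_t used;                      /* nodes handed out from this chunk */
    struct node nodes[CHUNK_NODES];
};

/* The pool handle: one per parse, so nothing lives at file scope. */
struct node_pool {
    struct chunk *head;
    size_t live;                      /* total nodes handed out */
};

static void
pool_init(struct node_pool *pool)
{
    assert(pool != NULL);
    pool->head = NULL;
    pool->live = 0;
}

/* Bump allocation: take the next unused slot, adding a chunk when full. */
static struct node *
pool_new_node(struct node_pool *pool, int type)
{
    struct chunk *c = pool->head;
    struct node *n;

    if (c == NULL || c->used == CHUNK_NODES) {
        c = calloc(1, sizeof *c);     /* zeroed, so child pointers start NULL */
        if (c == NULL) {
            return NULL;
        }
        c->next = pool->head;
        pool->head = c;
    }

    n = &c->nodes[c->used++];
    n->type = type;
    pool->live++;
    return n;
}

/* Release every node at once, whether or not it was attached to a tree. */
static void
pool_free_all(struct node_pool *pool)
{
    struct chunk *c, *next;

    for (c = pool->head; c != NULL; c = next) {
        next = c->next;
        free(c);
    }
    pool->head = NULL;
    pool->live = 0;
}
```

With this shape, a parse error needs no per-node cleanup: the caller runs `pool_free_all` on the error path and on normal teardown alike.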

BenBE commented 4 years ago

Looking at the first few backtraces I wonder if there's an ast_free missing if ast_rewrite fails in https://github.com/katef/libfsm/blob/main/src/libre/re.c#L116-L118.

For the error in src/re/main.c:653 there's an fsm_free missing in all error exits (return EXIT_FAILURE;) in https://github.com/katef/libfsm/blob/main/src/re/main.c#L659-L990.

So there is no real need for a pooled allocator if you just ensure allocated resources are freed on all error paths.

hvdijk commented 4 years ago

Looking at the first few backtraces I wonder if there's an ast_free missing if ast_rewrite fails in https://github.com/katef/libfsm/blob/main/src/libre/re.c#L116-L118.

There is an ast_free missing there, but if parsing fails, it never gets to that point. The leaks reported here are in the parser itself.

I'm not familiar enough with sid's error handling to know the best way to fix this.

It is possible to define in parser.act:

%actions%
    <fail> = @{
            @!;
    @};
    <ast-free>: (node :ast_expr) -> () = @{
            ast_expr_free(@node);
    @};

Then, in the various parser.sid files, ensure that any rule that can potentially fail after an allocation is wrapped to produce this pattern:

node = ...;
{
    ...
##
    <ast-free>(node);
    <fail>();
};

This is similar to C++'s try { ... } catch { ast_expr_free(node); throw; }, although of course C++ has other ways to avoid the need for this.
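For readers less familiar with sid, the shape of that pattern can be sketched in plain C with an error label. This is only an analogy, not what sid generates: `struct node`, `parse_rest`, `parse_item` and `freed_count` are hypothetical names, and the comments map each step back to the sid snippet.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type, standing in for struct ast_expr. */
struct node {
    int value;
};

/* Counts frees so the behaviour can be observed; illustration only. */
static int freed_count;

static void
node_free(struct node *n)
{
    if (n != NULL) {
        freed_count++;
        free(n);
    }
}

/* Hypothetical later parsing step that fails for negative input. */
static int
parse_rest(int input)
{
    return input >= 0;
}

/*
 * Returns a node on success, or NULL on error.
 * The error label plays the role of the "##" handler: free, then fail.
 */
static struct node *
parse_item(int input)
{
    struct node *n = malloc(sizeof *n);    /* node = ...; */
    if (n == NULL) {
        return NULL;
    }
    n->value = input;

    if (!parse_rest(input)) {              /* the "..." that may fail */
        goto error;
    }
    return n;

error:
    node_free(n);                          /* <ast-free>(node); */
    return NULL;                           /* <fail>(); */
}
```

Every function that allocates before a step that can fail needs its own error label, which is the bookkeeping cost discussed in the replies.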

sfstewman commented 4 years ago

Thanks for the idea @hvdijk! @katef also mentioned that. The downside is that the sid code ends up littered with try/catch blocks, and if you miss one you still have a leak.

I'm proposing #252 as an alternative. It's not quite complete (the thread safety bit), but it's less invasive overall.

The (imo minor) downside is that you don't release the AST nodes immediately, which could cause memory pressure if the regular expression is exceptionally large. We can improve this with a free list, but it's a fundamental downside of pooling allocation and releasing all at once. I think that's a worthwhile trade-off for the AST. Even large regular expressions have relatively small ASTs.

sfstewman commented 4 years ago

Looking at the first few backtraces I wonder if there's an ast_free missing if ast_rewrite fails in https://github.com/katef/libfsm/blob/main/src/libre/re.c#L116-L118.

For the error in src/re/main.c:653 there's an fsm_free missing in all error exits (return EXIT_FAILURE;) in https://github.com/katef/libfsm/blob/main/src/re/main.c#L659-L990.

So there is no real need for a pooled allocator if you just ensure allocated resources are freed on all error paths.

The failure happens well before ast_rewrite. The sid parser hits an error. In this case, \c is currently unimplemented and the error is triggered by the <err-unsupported> action.

hvdijk commented 4 years ago

Thanks for the idea @hvdijk! @katef also mentioned that. The downside is that the sid code ends up littered with try/catch blocks, and if you miss one you still have a leak.

I'm proposing #252 as an alternative. It's not quite complete (the thread safety bit), but it's less invasive overall.

That is true, it is easy to miss a spot.

The fundamental problem is that the parser stores state in local variables, but sid does not appear to provide a way to automatically invoke cleanup actions when any local variable is discarded because of an error condition. There are three ways around that: use a parser generator that does support it (that could still be sid, if it has this functionality hidden somewhere, if its output were compiled as C++ or with GCC/clang's cleanup extension, or if sid were extended); manually write all the cleanup actions (what I suggested); or find a way to keep track of allocated expressions that works even when variables are discarded (like you did).
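As an illustration of the GCC/clang cleanup extension mentioned above, here is a small sketch using `__attribute__((cleanup(...)))`. It is not sid output, and whether sid's generated code could be made to use it is left open; `struct node`, `node_cleanup`, `parse_item` and `freed_count` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type, standing in for struct ast_expr. */
struct node {
    int value;
};

/* Counts frees so the behaviour can be observed; illustration only. */
static int freed_count;

/* A cleanup handler receives a pointer to the variable leaving scope. */
static void
node_cleanup(struct node **np)
{
    if (*np != NULL) {
        freed_count++;
        free(*np);
        *np = NULL;
    }
}

/* Returns a node on success, or NULL on error. */
static struct node *
parse_item(int input)
{
    /* GCC/clang extension: node_cleanup(&n) runs whenever n leaves scope. */
    struct node *n __attribute__((cleanup(node_cleanup))) = malloc(sizeof *n);
    struct node *result;

    if (n == NULL) {
        return NULL;
    }
    n->value = input;

    if (input < 0) {
        return NULL;    /* error exit: the handler frees n automatically */
    }

    result = n;
    n = NULL;           /* hand ownership to the caller; handler sees NULL */
    return result;
}
```

The attribute is a compiler extension rather than standard C, so relying on it ties the code to compilers that support it.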

Keeping track of all allocations in a separate list is certainly a nice, correct and easy to implement approach as well (without inspecting your PR in detail), and by pooling expressions together like you did, you reduce a lot of the overhead.

Another way to keep track of allocated expressions is to never return pointers that are not attached to a tree, by making actions take enough info so that whenever a new node is constructed, it can be attached to the tree immediately, even if further processing is going to fail. This would make it hard to miss a spot, but the downside of that is that it is possibly an even more invasive change.

The (imo minor) downside is that you don't release the AST nodes immediately, which could cause memory pressure if the regular expression is exceptionally large. We can improve this with a free list, but it's a fundamental downside of pooling allocation and releasing all at once. I think that's a worthwhile trade-off for the AST. Even large regular expressions have relatively small ASTs.

When constructing ASTs manually though, not from a regex, it is possible to repeatedly create and destroy expressions before the construction is complete. In that case, it becomes more important to not keep allocating more and more memory. This is not something the parser does, but it is something I am doing on a branch I have not (yet) created a PR for. I think your PR would be easy to extend to support that, so please do not take this as an argument against the pool approach :)

(I accidentally edited your comment instead of replying, sorry about that! Let me try to fix that...)

BenBE commented 4 years ago

A quite decisive argument against a pool allocator is that it interferes with normal memory checker instrumentation and may hide buffer overread, double-free and uninitialized memory read/write issues. A famous example of this was the OpenSSL Heartbleed bug.

The only real upshot for a pool allocator that I see would be the reduction of memory management calls, which could mean a large win on performance for large expressions.

On the other side: Keeping the allocator as is and writing a small tool to construct ASTs from some input string (and freeing it again) is an easy task which can be combined with a fuzzer like AFL+ to find memory leak issues in the AST construction code easily.

Thus going forward I'd lean more towards trying to get the generated parser code right (and have tooling to check this), than to risk hiding those bugs.

katef commented 4 years ago

@BenBE you're probably right about missing points where we should free, but don't currently (unrelated to the leaks during bad parses). I know I cut some corners for those. We should definitely be doing that properly as far as each program's main.c goes, regardless.

hvdijk commented 4 years ago

Doing this in the parser: if actions in the parser are rewritten to take :ast_expr & reference parameters, instead of returning :ast_expr by value, it is possible to avoid a lot of the cleanup that would otherwise be needed and structure it in a way that makes it easy to verify it covers all cases. I have put up a proof of concept that implements this only for the PCRE parser as https://github.com/hvdijk/libfsm/commit/06078792f29c60002fdb42c955e30321869ce95c. The idea is that the :ast_expr & parameter is guaranteed to be cleaned up by the caller, so does not need to be handled by the code that causes a node to be constructed. There are only a few places that do node = <ast-expr>;, and for those places, it is easy to inspect that they guarantee that an action will be invoked that takes over the responsibility to manage node, either by inserting it into the tree or by freeing it on error. (The option for pooled allocations may still be a good one; it mainly just did not sit right with me that sid would have such a fundamental limitation.)

sfstewman commented 4 years ago

A quite decisive argument against a pool allocator is that it interferes with normal memory checker instrumentation and may hide buffer overread, double-free and uninitialized memory read/write issues. A famous example of this was the OpenSSL Heartbleed bug.

@BenBE, I appreciate you taking the time to comment. It's clear that you feel passionately about this. IMO, few things in software lead to a "quite decisive argument." Mostly we just live in the world of trade-offs, and an offline FSM manipulation package focused on DFA transformation and codegen will make different choices than OpenSSL.

In terms of the downsides you list, consider that they're less a problem in this codebase than in others:

  1. Buffer overruns/overreads aren't generally an issue with this particular set of allocations. Most of the nodes are self-contained and don't hold pointers to internal data or additional data. The nodes that do hold lists of pointers use standard malloc/calloc to allocate their buffers. Instrumentation like ASAN and valgrind will still detect buffer overrun issues.
  2. Double-free isn't really an issue here, either. In general you either want all of the sub-tree or none of it. We could add an ast_expr_release() if it becomes useful for AST rewriting, but as of right now, the rewriting rules are fairly limited in scope, and I didn't see an upside to adding release and free list.
  3. Uninitialized memory is always an issue, but the existing code tended to use calloc for allocation, which has the upside of initializing the values to a known quantity and the downside of defeating checks for uninitialized memory.

Custom allocators are also quite common, which is why ASAN and other instrumentation tools provide hooks that otherwise reduce or eliminate the downsides you mention. It would certainly be good for us to use these hooks in the places we have custom allocation when building with ASAN enabled.
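To make the hooks concrete, here is a hedged sketch of manual poisoning for a tiny fixed-size slab. The `ASAN_POISON_MEMORY_REGION` and `ASAN_UNPOISON_MEMORY_REGION` macros come from `<sanitizer/asan_interface.h>`; everything else (`struct slot`, `struct slab`, the `slab_*` functions, the `POOL_*` wrappers) is hypothetical and not taken from libfsm. Without ASAN the wrappers compile to no-ops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Detect ASAN under GCC (__SANITIZE_ADDRESS__) or clang (__has_feature). */
#if defined(__SANITIZE_ADDRESS__)
#  define POOL_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define POOL_ASAN 1
#  endif
#endif

#ifdef POOL_ASAN
#  include <sanitizer/asan_interface.h>
#  define POOL_POISON(p, n)   ASAN_POISON_MEMORY_REGION((p), (n))
#  define POOL_UNPOISON(p, n) ASAN_UNPOISON_MEMORY_REGION((p), (n))
#else
#  define POOL_POISON(p, n)   ((void) (p), (void) (n))
#  define POOL_UNPOISON(p, n) ((void) (p), (void) (n))
#endif

#define SLOTS 8

/* Hypothetical pooled object. */
struct slot {
    int type;
    void *child;
};

struct slab {
    struct slot slots[SLOTS];
    unsigned char in_use[SLOTS];
};

static struct slab *
slab_new(void)
{
    struct slab *s = calloc(1, sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    /* Nothing is handed out yet, so any access to a slot is a bug. */
    POOL_POISON(s->slots, sizeof s->slots);
    return s;
}

static struct slot *
slab_alloc(struct slab *s)
{
    size_t i;

    for (i = 0; i < SLOTS; i++) {
        if (!s->in_use[i]) {
            s->in_use[i] = 1;
            POOL_UNPOISON(&s->slots[i], sizeof s->slots[i]);
            return &s->slots[i];
        }
    }
    return NULL;    /* slab is full */
}

static void
slab_release(struct slab *s, struct slot *p)
{
    size_t i = (size_t) (p - s->slots);

    assert(i < SLOTS && s->in_use[i]);
    s->in_use[i] = 0;
    /* Under ASAN, later reads or writes through p are now reported. */
    POOL_POISON(p, sizeof *p);
}

static void
slab_free(struct slab *s)
{
    /* Conservatively unpoison before returning the memory to free(). */
    POOL_UNPOISON(s->slots, sizeof s->slots);
    free(s);
}
```

With poisoning in place, a use-after-release inside the pool is reported by ASAN much like a use-after-free on a malloc'd block would be.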

The only real upshot for a pool allocator that I see would be the reduction of memory management calls, which could mean a large win on performance for large expressions.

I would consider these to be the upsides:

  1. The pool allocator PR (#252) required very few changes to actually fix the leaks. See the PR for details. With it, LSAN no longer complains.
  2. It's simple to understand and adding additional functionality to the parser is less likely to produce additional leaks. The pooled allocator approach reduces the overall maintenance burden for the current parsers and future parsers with additional features (eg: correctly handling /[foo/ in the PCRE dialect).
  3. Pooled allocation works pretty well for the way this library is typically used. It's a series of transformations: regexp->AST, AST->NFA, NFA->DFA, DFA->IR, IR->code OR dfavm IR, dfavm IR->code. Once one transformation is finished, the input is thrown away.
  4. The alternatives are quite intrusive. One alternative requires littering the sid code (both act and sid files) with lots of ## error handling cases. This strikes me as a maintenance nightmare. I haven't looked closely at @hvdijk's alternative, which sounds promising but does seem to invert the natural flow of allocation. Both of these require quite a few changes to the structure of all existing parsers.

On the other side: Keeping the allocator as is and writing a small tool to construct ASTs from some input string (and freeing it again) is an easy task which can be combined with a fuzzer like AFL+ to find memory leak issues in the AST construction code easily.

We're actively fuzzing the code for various things. LSAN checking would be another good thing to add to the list.

However, you don't have to fuzz the code to find obvious leaks. I'm still chasing down leaks that LSAN finds in the current build+tests outside of parsing.

Thus going forward I'd lean more towards trying to get the generated parser code right (and have tooling to check this), than to risk hiding those bugs.

As always, please feel free to file a PR that addresses this bug in a way that you would prefer. I find it a lot more helpful to have technical arguments about concrete code.

katef commented 4 years ago

@hvdijk thanks for the POC there, that's super helpful.

It seems like there are two things going on in your diff:

  1. moving the responsibility for where free() happens to inside each action;
  2. changing the interface for actions to take references:
    -   <ast-make-group>: (e :ast_expr) -> (node :ast_expr) = @{
    +   <ast-make-group>: (node :ast_expr &, e :ast_expr) -> () = @{

    Since node there is already a pointer type, I'm not sure I understand (2). What does introducing the reference give us? Unless I misunderstand, I think we could do (1) (and destructively free arguments) without switching to references.

hvdijk commented 4 years ago

The idea is that after node = <ast-expr>();, node is a null pointer. No expression is constructed yet at that point, this merely declares that node is a local variable in which an expression can be held. It's when <ast-make-xxx>(&node); is invoked that node gets assigned a non-null value. If you pass node by value, that does not work: an expression would be created, but after <ast-make-xxx>(node) would be done, node would still be a null pointer.
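The difference can be shown in plain C, where a sid reference parameter roughly corresponds to passing the address of the caller's pointer. This is a sketch of the semantics only; `struct node`, `make_group_by_value` and `make_group_by_ref` are hypothetical names, not the actions in parser.act.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type, standing in for struct ast_expr. */
struct node {
    int type;
};

/* By value: only the callee's copy of the pointer is updated. */
static void
make_group_by_value(struct node *node)
{
    node = malloc(sizeof *node);
    if (node != NULL) {
        node->type = 1;
    }
    free(node);    /* freed here only so this sketch does not itself leak */
}

/* By reference: the caller's variable receives the new node. */
static int
make_group_by_ref(struct node **node)
{
    *node = malloc(sizeof **node);
    if (*node == NULL) {
        return 0;
    }
    (*node)->type = 1;
    return 1;
}
```

After `make_group_by_value(node)` the caller's `node` is still a null pointer, whereas after `make_group_by_ref(&node)` it points at the new node, so the caller is in a position to free it if a later step fails.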

katef commented 4 years ago

I'm not sure I follow. If we're returning -> (node :ast_expr) wouldn't we bind that to a new variable in the .sid file?

a = <ast-expr>; // NULL
b = <ast-make-group>(a); // would free a on error, per the idea for (1)

I think I see why you're doing this - do you then continue with the parse, after an allocation failure? Just passing around the reference to NULL?

Excuse me thinking aloud here, two alternate ways to get the idea of failing exposed:

The exceptions bubble up, so we don't need an ## alt for every production. But we would need to be sure we hadn't allocated something else in the same alt:

x = <f>;
y = <g>; // when <g> raises @!, we need to free `x`

We could handle just those cases explicitly perhaps, by wrapping them locally. But that's super messy.

I don't like the idea of a grammar file having to know about memory allocation at all. Its purpose is to describe a grammar, and this gets in the way.

hvdijk commented 4 years ago

I think I see why you're doing this - do you then continue with the parse, after an allocation failure?

After an allocation failure I still end the parse as before. I'm doing this to significantly reduce the number of places we need cleanup actions, and to make the locations where we need cleanup actions easily identifiable with a simple search, to address @sfstewman's concern that it would otherwise be too easy to miss a cleanup action and still end up with a leak on certain hard to find inputs (and it's good to be concerned about that!).

Returning an error status:

That could be a good idea too. The exceptions discard the local variables without performing any cleanup actions, so if we can avoid exceptions entirely, we can avoid using cleanup actions to handle the local variables being discarded.

I don't like the idea of a grammar file having to know about memory allocation at all. Its purpose is to describe a grammar, and this gets in the way.

I agree! Ideally this would be purely part of parser.act, where somehow we would specify what happens when an :ast_expr is discarded. Unfortunately sid does not support this, so we have to manually write code to track the variables. We can either do that inside the parser or outside of it. The pooling PR shows how this can be done outside of it, I am trying to show how it can be done inside of it.

sfstewman commented 4 years ago

I agree! Ideally this would be purely part of parser.act, where somehow we would specify what happens when an :ast_expr is discarded. Unfortunately sid does not support this, so we have to manually write code to track the variables. We can either do that inside the parser or outside of it. The pooling PR shows how this can be done outside of it, I am trying to show how it can be done inside of it.

It's good to see how sid can be adapted to this! The problem does make me wish (yet again...) that C had something like RAII or defer for things like this. I'm not sure which approach I prefer. I think the pooling PR (with some revisions) may be simpler, but this approach is certainly more in keeping with the rest of libfsm.

hvdijk commented 4 years ago

Oh! This is a bit nasty, but we do actually have a way to run cleanup code without modifying sid: all sid-generated exceptions are guaranteed to be preceded by the RESTORE_LEXER macro, which we can hook into! Then it's just a matter of ensuring the same cleanup code is also run anywhere we explicitly invoke @!. This would allow us to avoid the need for explicit <ast-free> in the parser.sid files. I will attempt to update my POC to implement this!

katef commented 4 years ago

Oh dear :)

katef commented 4 years ago

where somehow we would specify what happens when an :ast_expr is discarded.

Do you think we could add this to sid somehow? There are similar %sections% for implementing copying, %assignments%, etc. Maybe we could have one for destructors? Maybe that's just too much work.

katef commented 4 years ago

I do like the references idea, I want to be clear about that.

hvdijk commented 4 years ago

Here's the RESTORE_LEXER hack: https://github.com/katef/libfsm/commit/8dfe6124892aa18b826c9193f8c265f3aa6d26e9 (comment edited to point to an updated version)! This builds on the refcounted expression trees I had already implemented on another branch. Beware, it's still a work in progress, it's not fully correct yet.

Do you think we could add this to sid somehow? There are similar %sections% for implementing copying, %assignments%, etc. Maybe we could have one for destructors? Maybe that's just too much work.

This should be relatively easy to add to sid for someone already familiar with its internals!

katef commented 4 years ago

This is really quite an incredible diff. I'm amazed that it works.

hvdijk commented 4 years ago

Yeah, it is quite nasty. It manages to keep a relatively clean parser.sid though, so it does have that going for it.

katef commented 4 years ago

Okay, well thank you everybody for all the input here.

I could imagine adding a %destructors% style section to sid, to provide action for freeing each type. But honestly I don't want to put the work into that, when somehow we haven't managed to need it so far. I did a lot of work on sid some time ago, and going back to that feels like such a digression from what I want to be doing now.

The pool allocation system gets my vote, so far. Mostly because I think it's the simplest to understand, and perhaps the most maintainable idea without raising the barrier for concepts that're strange and difficult for new people. If we ever want to replace it in the future, it's easy to see where it is, what it does, and what its surface area touches.

I do need it to be thread safe though, so let's pass in a handle for a pool, rather than the global_pool at file scope.

Thanks for trying out the proofs of concept here, I'm finding it difficult to think clearly at the moment, and seeing those laid out really does help.

sfstewman commented 4 years ago

@katef sounds good. I do like the idea of a %destructors% section, but I also have no desire to learn enough of the insides of sid to design or implement it.

I'm still hunting other leaks, but as soon as I reach a good stopping place, I'll clean up the pooled allocation scheme:

  1. Fix up the thread safety issue.
  2. Add ASAN poisoning/unpoisoning and some "red zone" buffers between pool items when -DASAN is used. This isn't a perfect solution, but should help with many of the issues that @BenBE has raised.

hvdijk commented 4 years ago

I could imagine adding a %destructors% style section to sid, to provide action for freeing each type. But honestly I don't want to put the work into that, when somehow we haven't managed to need it so far. I did a lot of work on sid some time ago, and going back to that feels like such a digression from what I want to be doing now.

Yeah, that makes sense. The main reason it has not really been a problem is that sid-generated parsers are almost always part of a utility where if parsing fails, the utility will abort fairly quickly and free any lingering resources implicitly, but here it is part of a library, so that logic does not hold.

Just a note on %destructors% if that is implemented at some point in the future: it may be easiest to do if %constructors% are added as well. This would simplify the implementation: sid could run a constructor action whenever a block is entered and a destructor action whenever a block is exited, without keeping track of whether the variable has been assigned to yet.

hvdijk commented 4 years ago

I just had a realisation that there is another way to keep track of constructed nodes, without a pool, without sid extensions, and without resorting to hacks: just keep a list of nodes that have been constructed but not yet attached to any other node, and update that list when attaching to other nodes, like so: https://github.com/hvdijk/libfsm/commit/87aaad11221f0a049b08432fad54f821b8e714b7. This only modifies ast.h to add a single new field, and parser.act; everything else is left intact, tests pass, and valgrind reports no leaks for any of the syntax error tests in the test suite. (As before, I'm not saying this is better than the pool, I wanted to see that it could be done.)
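A rough sketch of how such an unattached-node list could work, written independently of the linked commit and not a reading of it. All names here (`struct node`, `next_pending`, `struct parse_state`, `node_attach`, `pending_free_all`) are hypothetical. Each new node is pushed onto a per-parse list, attaching a node to a parent removes it from that list, and on error only the list has to be walked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node; next_pending is the single extra field. */
struct node {
    struct node *left, *right;     /* children, owned by this node */
    struct node *next_pending;     /* link in the unattached list */
};

/* Per-parse state holding the nodes not yet attached to any parent. */
struct parse_state {
    struct node *pending;
};

/* Counts frees so the behaviour can be observed; illustration only. */
static int freed_count;

static struct node *
node_new(struct parse_state *st)
{
    struct node *n = calloc(1, sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->next_pending = st->pending;  /* every new node starts unattached */
    st->pending = n;
    return n;
}

static void
pending_remove(struct parse_state *st, struct node *n)
{
    struct node **p;

    for (p = &st->pending; *p != NULL; p = &(*p)->next_pending) {
        if (*p == n) {
            *p = n->next_pending;
            n->next_pending = NULL;
            return;
        }
    }
}

/* Attaching transfers ownership to the parent, so the children leave the list. */
static void
node_attach(struct parse_state *st, struct node *parent,
    struct node *left, struct node *right)
{
    parent->left = left;
    parent->right = right;
    if (left != NULL) {
        pending_remove(st, left);
    }
    if (right != NULL) {
        pending_remove(st, right);
    }
}

static void
node_free(struct node *n)
{
    if (n == NULL) {
        return;
    }
    node_free(n->left);
    node_free(n->right);
    freed_count++;
    free(n);
}

/* On a parse error: free each unattached node together with its subtree. */
static void
pending_free_all(struct parse_state *st)
{
    while (st->pending != NULL) {
        struct node *n = st->pending;
        st->pending = n->next_pending;
        node_free(n);
    }
}
```

On a successful parse the root would be the only node left on the list, and the caller would remove it before returning the tree.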

katef commented 9 months ago

@hvdijk I really like that idea, more than the pool