42-Ikole-Systems / TMK-SH

An awesome POSIX compliant shell.
MIT License
0 stars 0 forks source link

Add recursive rule referencing #26

Closed Tishj closed 1 year ago

Tishj commented 1 year ago

This might not be the right way to describe it, but essentially a rule can contain itself as a recipe. The way yacc/bison deals with this is that rules are only matched when the stack of produced rules matches the recipe.

It has a "Reduce" pass, where it reduces the recipe rules into the matched production rule. This is how it can match itself, at first the self-referencing rule does not get matched because the stack does not contain the rule.

Only when another recipe for the rule gets matched, this gets "reduced" to the production rule, at which point the self-referencing rule matches.

To illustrate this, let's take this example:

cmd_prefix       :            io_redirect
                 | cmd_prefix io_redirect
                 |            ASSIGNMENT_WORD
                 | cmd_prefix ASSIGNMENT_WORD

When an io_redirect rule gets matched, this produces a cmd_prefix node at the top of the stack. So the [cmd_prefix, io_redirect] recipe can get matched.

Tishj commented 1 year ago

I think the way we can implement this same behavior is by using the options function pointer to determine if a recipe (non-terminal) for a rule contains that same rule.

If this is the case, an earlier production for this rule (only for the direct parent) should be provided to the parsing of this non-terminal so this production can be fed to the recipe (instead of endlessly recursing)

When a rule gets matched, we call the parse for this rule again, with the produced production, so it can be fed to recursive rules.

Tishj commented 1 year ago

This feels kind of hacky though, and needs to be done with a lot of care to make sure that rollbacks work and that a failed recipe does not consume the previously made production so the next recipe could still match etc..

Tishj commented 1 year ago

What we instead might want to do is (slightly) redesign the flow to be more similar to yacc/bison behavior

Where a stack of produced rules is maintained, every matched rule contains an associated Node. To determine if a recipe matches, rules are popped off the stack.

Once a recipe matches, its rule and the produced Node are pushed onto the stack so it can be matched by other recipes.

This does introduce an extra bit of state that needs to be saved/reset, but since we already have a mechanism for saving and resetting state, this shouldn't be difficult to implement

Tishj commented 1 year ago

My approach has been: Let the rules guide which Nodes to produce With this approach we start from a base rule, this has multiple recipes Once a recipe matches, produce that Node, and in this way it folds back into the base case. This means that we need to evaluate the rule first to determine if it matches, and that clashes with the recursive rules for two reasons:

  1. A rule can get matched 1 or more times, instead of the one time that the recipe describes, so we need special logic to determine if it can match a second time, and if not return the last time it matched.
  2. If a rule contains itself, this causes a stack overflow because we will keep evaluating this condition, constantly going deeper into recursion until we hit the maximum depth.

What might be the better approach is to produce nodes, and match these with rules I believe the stack based approach that yacc/bison use does this, they work from the terminals symbols down to bigger productions.

I think every symbol keeps track of a list of rules that they are part of, and matching happens from this list. That also explains how the recursive rules work.

It's not even really "recursive" in this case. Tokens get pushed on to the stack as Terminal Symbols, from the bottom of the stack, we start matching rules, based on the type of this Symbol. When a rule gets matched, it consumes all the symbols that make up the recipe for this rule, and push back a new Symbol. Then, just like before, based on the the of this symbol rules get matched.

Since there is a recipe that starts with the produced symbol (the "recursive" one), we attempt to match it.

Tishj commented 1 year ago

This wont be fixed, we have accepted that we just need to set up the rules in the shell grammar in such a way that enable recursion

i.e the option containing the recursive rule should be checked first the recursive rule itself should be the last rule of the option