Open Tishj opened 1 year ago
Some expansions can create multiple fields
It is only field splitting or pathname expansion that can create multiple fields from a single word.
The single exception to this rule is the expansion of the special parameter '@' within double-quotes, as described in Special Parameters.
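To make the `'@'` exception concrete, here is a minimal sketch (function names are made up, not from the project): quoted `"$@"` yields one field per positional parameter, while quoted `"$*"` joins them into a single field.

```cpp
#include <string>
#include <vector>

// expandQuotedAt: the fields produced by a double-quoted "$@" --
// each positional parameter stays its own field, even inside quotes.
std::vector<std::string> expandQuotedAt(const std::vector<std::string>& positional) {
    return positional;
}

// expandQuotedStar: by contrast, a double-quoted "$*" joins all
// positional parameters into one field, separated by the first IFS char.
std::string expandQuotedStar(const std::vector<std::string>& positional,
                             char ifsFirst = ' ') {
    std::string joined;
    for (size_t i = 0; i < positional.size(); ++i) {
        if (i > 0) joined += ifsFirst;
        joined += positional[i];
    }
    return joined;
}
```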
It actually seems like expansions are applied as part of execution, not as part of parsing, though some grammar rules do make use of expansions.
That's actually the exception rather than the norm; I had it backwards for a long time.
I think this line in the linked section clears up this confusion:
The expansions described in this section shall occur in the same shell environment as that in which the command is executed.
Since we can add command-specific environment variables, those have to be included in the environment the expansions run in. To achieve that we have to fork for the command first, and we only do that during execution of the command.
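A minimal sketch of that ordering (all names hypothetical, simplified to whole-word `$NAME` lookups only): the command's own assignments are applied to a copy of the environment, as the forked child would see it, and only then is the word expanded.

```cpp
#include <map>
#include <string>

using Env = std::map<std::string, std::string>;

// Simplified expansion: only handles a word that is exactly "$NAME".
std::string expandWord(const std::string& word, const Env& env) {
    if (!word.empty() && word[0] == '$') {
        auto it = env.find(word.substr(1));
        return it != env.end() ? it->second : "";
    }
    return word;
}

// The ordering argued above: apply the command-specific assignments
// (as the forked child would), then expand in that environment.
std::string expandInCommandEnv(const std::string& word, Env env,
                               const Env& assignments) {
    for (const auto& a : assignments)
        env[a.first] = a.second;
    return expandWord(word, env);
}
```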
Expansions happen after forking
@mraasvel I saw a comment in the executor with a suggestion as to where to implement the expansions, I'm not sure that works
What might work is creating a virtual method `expand` to implement on every node type.
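A minimal sketch of that idea, assuming made-up class names: every AST node type implements `expand` itself, and it returns a vector because (per the spec quoted above) one word can expand to multiple fields.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Node {
    virtual ~Node() = default;
    // One node can expand to zero or more fields, hence a vector.
    virtual std::vector<std::string> expand() const = 0;
};

// A literal word expands to itself as a single field.
struct Literal : Node {
    std::string text;
    explicit Literal(std::string t) : text(std::move(t)) {}
    std::vector<std::string> expand() const override { return {text}; }
};
```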
It looks like some expansions happen directly in the parser while others are delayed until execution.
I think both can work. I was thinking of the expander as stateless, but configurable, since the behavior differs depending on the context of the expansion (e.g. heredoc string expansion is only quote removal, and not variable/command/parameter expansion).
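A sketch of that stateless-but-configurable shape (API invented here, and the quote removal is deliberately naive): the caller picks which expansions apply, so a heredoc context can request a different subset than a regular word.

```cpp
#include <string>

// Which expansions the caller wants; contexts pick their own subset.
struct ExpandOptions {
    bool quoteRemoval   = true;
    bool variableExpand = true;
};

// expandStub: simplified stand-in for a real expander. Quote removal
// here just strips all quote characters, which is not the real rule.
std::string expandStub(const std::string& word, const ExpandOptions& opts) {
    std::string out = word;
    if (opts.quoteRemoval) {
        std::string stripped;
        for (char c : out)
            if (c != '"' && c != '\'')
                stripped += c;
        out = stripped;
    }
    // variable/command/parameter expansion would go here when enabled
    return out;
}
```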
I think the case where expansions can sometimes yield multiple tokens is related to field splitting, right? I'm wondering whether this is a distinct step after expansion (perhaps a separate function that yields a vector of tokens from a given input token, or an iterator over tokens...).
The executor could depend on some Expander object and configure it to only expand what it needs. The node type could also handle the expansion, but then does the node type also handle things like field splitting? I personally prefer having a separate class that manipulates and depends on the nodes.
I haven't looked into it too much yet, though.
https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06
It sounds like this should be handled in the logic of consuming a Token from the Lexer in the Parser (the terminal-symbol grammar rules).