idank / bashlex

Python parser for bash
GNU General Public License v3.0

bashlex.split() - Strange quoting behaviour with variable assignments #26

Open joerick opened 7 years ago

joerick commented 7 years ago

I'm seeing a strange bug with variable assignments:

>>> list(bashlex.split("PATH=\"$PATH:/usr/local/bin/\""))
['PATH="$PATH:/usr/local/bin/"']
       ^                     ^
#      note the quote marks /

>>> list(bashlex.split("PATH2=\"$PATH:/usr/local/bin/\""))
['PATH2=$PATH:/usr/local/bin/']

#     the quote marks are gone!

In the above example, it seems to be the number in the env var name that triggers the removal of quotes.

The following example shows that a preceding variable assignment with a number in its name also triggers the different quoting behaviour.

>>> list(bashlex.split("VAR_ABC=1 PATH=\"$PATH:/usr/local/bin/\""))
['VAR_ABC=1', 'PATH="$PATH:/usr/local/bin/"']
                    ^                     ^
#                   note the quote marks /

>>> list(bashlex.split("VAR_123=1 PATH=\"$PATH:/usr/local/bin/\""))
['VAR_123=1', 'PATH=$PATH:/usr/local/bin/']

#     the quote marks are gone!

Retaining the quotes is desirable for my use case. I can work around it, so I'm just wondering whether this is a bug in bashlex or some strange bash behaviour.
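For reference on the "bashlex or bash" question: bash documents a variable name as a word made of letters, digits and underscores that does not start with a digit, so `PATH2` and `VAR_123` are valid names. The following is only a minimal sketch of that naming rule using the standard library. It is not how bash or bashlex actually implement the check, the names `ASSIGNMENT_RE` and `looks_like_assignment` are hypothetical, and it ignores forms such as `name+=value` and array subscripts.

```python
import re

# Simplified version of bash's naming rule: a letter or underscore,
# followed by letters, digits or underscores, then "=".
# ASSIGNMENT_RE and looks_like_assignment are hypothetical names for this sketch.
ASSIGNMENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*=")


def looks_like_assignment(word):
    """Return True if the word starts with NAME= under the simplified rule."""
    return ASSIGNMENT_RE.match(word) is not None


words = [
    'PATH="$PATH:/usr/local/bin/"',
    'PATH2="$PATH:/usr/local/bin/"',
    "VAR_123=1",
    "2PATH=1",  # starts with a digit, so not a valid name
]
for word in words:
    print(word, looks_like_assignment(word))
```

Under this rule a digit after the first character does not stop a word from being an assignment, which suggests the difference reported here comes from bashlex rather than from bash itself.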

joerick commented 7 years ago

This might be a clue...

>>> print(bashlex.parsesingle("PATH=\"$PATH:/usr/local/bin/\"").dump())
CommandNode(pos=(0, 28), parts=[
  AssignmentNode(pos=(0, 28), word='PATH=$PATH:/usr/local/bin/', parts=[
    ParameterNode(pos=(6, 11), value='PATH'),
  ]),
])

>>> print(bashlex.parsesingle("PATH2=\"$PATH:/usr/local/bin/\"").dump())
CommandNode(pos=(0, 29), parts=[
  WordNode(pos=(0, 29), word='PATH2=$PATH:/usr/local/bin/', parts=[
    ParameterNode(pos=(7, 12), value='PATH'),
  ]),
])

Without a number in the name, the input parses as an AssignmentNode, but with a number it parses as a WordNode 🤔
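One possible workaround for keeping the quotes: in both dumps the `pos` values look like half-open `(start, end)` offsets into the input string, so slicing the input by `pos` recovers the raw text, quotes included, whichever node type was produced. The sketch below assumes that reading of `pos` and hard-codes the spans from the second dump instead of calling bashlex; `raw_text` is a hypothetical helper name.

```python
# Input string and spans copied from the PATH2 dump above.
source = 'PATH2="$PATH:/usr/local/bin/"'
word_pos = (0, 29)   # WordNode(pos=(0, 29), ...)
param_pos = (7, 12)  # ParameterNode(pos=(7, 12), value='PATH')


def raw_text(source, pos):
    """Slice the original input by a (start, end) span (assumed half-open)."""
    start, end = pos
    return source[start:end]


print(raw_text(source, word_pos))   # full word, quotes preserved
print(raw_text(source, param_pos))  # the parameter expansion as written
```

With a real parse, the same slice could be applied to each node's `pos` in place of the node's `word` attribute, which has the quotes removed in both dumps.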

idank commented 7 years ago

Huh, that's weird. Usually what I've done in cases like this is step through bash's source code and see how it parses it compared to bashlex. Then fixing it becomes easier.