idank / bashlex

Python parser for bash
GNU General Public License v3.0

ANSI-C quoted strings $'' aren't supported #60

Open verhovsky opened 3 years ago

verhovsky commented 3 years ago

Bash's ANSI-C quoting ($'...') is documented here: https://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html

Expected result:

>>> list(bashlex.split("echo $'hello'"))
['echo', 'hello']
>>> list(bashlex.split("echo $'hello\\nworld'"))
['echo', 'hello\nworld']  # notice \\n becomes a real newline character \n
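
For reference, the escape decoding behind this expected result could be sketched roughly as follows. This is not bashlex code: `decode_ansi_c` is a hypothetical helper name, and the sketch covers only the single-character, octal, and `\xHH` escapes from the manual page above. It leaves out `\cx`, `\u` and `\U`, and it treats octal values as code points, which is an assumption rather than verified bash behavior.

```python
import re

# Single-character escapes from the bash manual's ANSI-C quoting section.
_SIMPLE = {
    "a": "\a", "b": "\b", "e": "\x1b", "E": "\x1b", "f": "\f",
    "n": "\n", "r": "\r", "t": "\t", "v": "\v",
    "\\": "\\", "'": "'", '"': '"', "?": "?",
}

# A backslash followed by 1-3 octal digits, x plus 1-2 hex digits,
# or any other single character.
_ESCAPE = re.compile(
    r"\\(?:(?P<oct>[0-7]{1,3})|x(?P<hex>[0-9A-Fa-f]{1,2})|(?P<simple>.))",
    re.DOTALL,
)


def decode_ansi_c(body):
    """Decode the text between $' and ' (hypothetical helper, subset only)."""

    def replace(match):
        if match.group("oct") is not None:
            # Assumption: the octal value is used directly as a code point.
            return chr(int(match.group("oct"), 8))
        if match.group("hex") is not None:
            return chr(int(match.group("hex"), 16))
        char = match.group("simple")
        # Escapes this sketch does not know are left untouched.
        return _SIMPLE.get(char, "\\" + char)

    return _ESCAPE.sub(replace, body)


print(repr(decode_ansi_c("hello\\nworld")))  # 'hello\nworld'
```

A lexer would still need to recognize the `$'` opener and find the matching closing quote before calling something like this on the body.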

Actual result (bashlex 0.15):

>>> list(bashlex.split("echo $'hello'"))
['echo', '$hello']
>>> list(bashlex.split("echo $'hello\\nworld'"))
['echo', '$hellonworld']

d4g33z commented 3 years ago

This may be a similar issue, or my own misunderstanding of bashlex:

>>> bashlex.parse('DOCKER_BUILDTAGS+=" $tag"')
[CommandNode(parts=[AssignmentNode(parts=[ParameterNode(pos=(20, 24) value='tag')] pos=(0, 25) word='DOCKER_BUILDTAGS+= $tag')] pos=(0, 25))]

I expected the following output instead:

[CommandNode(parts=[AssignmentNode(parts=[ParameterNode(pos=(20, 24) value='tag')] pos=(0, 25) word='DOCKER_BUILDTAGS+=" $tag"')] pos=(0, 25))]

verhovsky commented 3 years ago

@d4g33z that is unrelated to this issue, and bashlex lexes your example correctly. This is how bash interprets it:

$ tag="world"
$ echo hello="   $tag"
hello=   world
$ 

In other words, echo gets a single argument, with $tag replaced by the value of the tag variable and the quotes removed. The " quotes are needed so that one argument can contain whitespace. Without them, echo hello= $tag would pass two arguments to echo, hello= and the value of $tag, and echo would print them separated by a single space instead of however many I typed, simply because echo joins its arguments with one space. Bash strips the quotes before passing the arguments to the command being called.
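
The same word splitting and quote removal can be seen from Python with the standard library's `shlex.split`, which follows POSIX-style quoting rules. Unlike bash, it does not expand `$tag`, so the variable name stays in the output. A small sketch:

```python
import shlex

# Quoted: the quotes keep the spaces inside one word and are then removed.
quoted = shlex.split('echo hello="   $tag"')

# Unquoted: the space ends the word, so echo would receive two arguments.
unquoted = shlex.split("echo hello= $tag")

print(quoted)    # ['echo', 'hello=   $tag']
print(unquoted)  # ['echo', 'hello=', '$tag']
```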

d4g33z commented 3 years ago

Sorry to hijack the issue. I'll have to revisit how I parse += assignments that concatenate strings containing whitespace.

Thanks, it's a great library.