jeff-hykin / better-shell-syntax

💾 📦 ♻️ An improvement to the shell syntax for VS Code
MIT License
50 stars 4 forks source link

Command substitution in quotes doesn't recognize closing quote as closing #92

Closed alexr00 closed 3 days ago

alexr00 commented 1 month ago

The code with a problem is:

#!/bin/bash

echo "Works 1"
( TEST=$(echo {1..10}); echo "$TEST" )

echo "Works 2"
( TEST="$(echo {1..10})"; echo "$TEST" )

echo "Works 3"
( TEST=$(for i in {1..10}; do echo $i; done); echo "$TEST" )

echo "Works 4"
( TEST="$(for i in $(echo {1..10}); do echo $i; done)"; echo "$TEST" )

echo "Works 5"
( TEST="$(echo {1..10} | while read line; do echo $line; done)"; echo "$TEST" )

echo "Syntax Highlighting Broken"
( TEST="$(for i in {1..10}; do echo $i; done)"; echo "$TEST" )

exit 0

It looks like:

image

It should look like:

It looks like the closing quotes of the string is instead recognized to be opening quotes.

Originally from @stephanwehr in https://github.com/microsoft/vscode/issues/214104

wdeshazer commented 3 weeks ago

@alexr00 and @jeff-hykin The issue is that the '{' is disrupting the string continuation. The TextMate parser is appying the context identifications in the wrong order. Either the double-quote and bracket have been given equal precedence to identify contexts (@jeff-hykin I seem to recall a whole string of operators and punctuation that were between or brackets '[ ]') or the '{' has greater precedence. I suspect equal, which is clearly wrong. Below are examples that illustrate the behavior.

VSCode Syntax Highlighting

Two takeways

  1. Syntax Highlighting hack I lerned long ago
    • Put a #" at the end of the offending line
    • Will reset syntax highlighting
    • Won't interfere with bash Line Editor
  2. The '{' is the culprit
    • Given precedence equal to or over quotes and double quotes
    • Not contextually aware because it doesn't look for closing }
    • Parser disrupted with "{
Code Block with script ```shell #!/bin/bash # original example ( TEST="$(for i in {1..10}; do echo $i; done)"; echo "$TEST" ) # Inserting a quote the furthest left it can be moved # grammatically meaningless # Illustrates that the '{' is given precedence # Not even looking for a closing bracket # This implies that the algoritm is not context aware ( TEST="$(for i in {"1..10}; do echo $i; done)"; echo "$TEST" ) # Notice the #" hack to trick the syntax parser # without disrupting bash ( TEST="$(for i in {1..10}; do echo $i; done)"; echo "$TEST" ) #" # Inserting a quote the furthest left it can be moved # grammatically meaningless # Illustrates that the '{' is given precedence # Not even looking for a closing bracket # This implies that the algoritm is not context aware ( TEST="$(for i in {"1..10}; do echo $i; done)"; echo "$TEST" ) # this illustrates that the bracket is the disrupter # The '"{' confused the Parsing algorithm # Notice though it confused the parser # for both the double-quotes and the bracket expansion # I actually find that to be a thought provoking hack. ( TEST="$(for i in "{1..10}; do echo $i; done)"; echo "$TEST" ) # " VSCode & Github # closing quote for original # This does not produce the correct bracket expansion # It just shows where the parser jumped contexts ( TEST="$(for i in "{1..10}"; do echo $i; done)"; echo "$TEST" ) # Neither balanced quotes ```

image

Recommendations

Broadly speaking, I believe the context identification scheme needs to be refactored so that it is more contextually aware. The two obvious ones from this issue are, the Quote context and the bracket context.

The reason the current algorithm is failing is that the strategy clearly applies an approach where it triggers a context, applies some token rules until another context is identified, switches over to the new contex and then repeats. That single sweep approach is not robust. To be clear, I don't have detailed knowledge of the approach Better-Syntax / TextMate is employing, so forgive me if some of my assertions need to be corrected to prevent confusion.

The following regex is not robust, but I just want to illustrate that one can first find all quoted contexts before moving on to bracket contexts {} [] () etc. etc.

"[^"]+"

image

jeff-hykin commented 3 days ago

I believe the context identification scheme needs to be refactored so that it is more contextually aware

Yep, you're preaching to the choir lol. PR's are welcome, I do this in my free time.

the double-quote and bracket have been given equal precedence

Textmate doesn't really know what precedence is. The pattern that matches the most characters wins.

one can first find all quoted contexts

Well, now that I've mentioned textmate doesn't have presedence, you can see why that actually might not be possible. This isn't taking into account the differences between single-line patterns and multi-line patterns.

Anyways, this one instance of the issue should be fixed for now on v1.8.8.

jeff-hykin commented 3 days ago

@alexr00 Just wanted to let you know v1.9.0 is published from the gplv3 branch of this repo with the GPLv3 license.

But, the master branch (including v1.8.8), is still under MIT, and will remain that way for the forseeable future. I do plan to still update master with minor fixes. I may even end up mirroring all changes onto master. But I'll be putting features/enhancements on gplv3 first.

alexr00 commented 3 days ago

@jeff-hykin thank you for the heads up and for continuing to maintain the master branch with minor fixes!