pulsar-edit / pulsar

A Community-led Hyper-Hackable Text Editor
https://pulsar-edit.dev
Other
3.3k stars 138 forks source link

language-php #787

Open syrsly opened 1 year ago

syrsly commented 1 year ago

Thanks in advance for your bug report!

What happened?

When the "language-php" package is enabled and I'm in a PHP script, if I create a "(" character but don't close it, this screws up the formatting for everything after it for the rest of the script, whether it's 10 lines or 7000 lines and regardless of whether or not I later close the parenthesis.

This seems like it's an issue that's been around and mentioned before in the Atom issues, but since it hasn't been addressed in years and is now an issue in Pulsar, I figured I should mention it.

I propose that we add some kind of logic to the language-php package that says, if we can't find a closing parenthesis character ")" by the end of the line, we reset the formatting to normal. Could also add some kind of toggle to turn off the string formatting in case it annoys anyone.

Pulsar version

1.110.0

Which OS does this happen on?

🪟 Windows

OS details

Windows 10 Pro

Which CPU architecture are you running this on?

x86_64/AMD64

What steps are needed to reproduce this?

  1. Enable language-php package with default settings,
  2. open a php file,
  3. insert a string variable with the start of parenthesis "(" character in the string,
  4. create more lines of code after this string and notice the later lines are all appearing to be part of the string.

Additional Information:

Old issue: https://github.com/atom/language-php/issues/442 firefox_bwiaLfRVHw

Spiker985 commented 1 year ago

This is presumably due to the fact that this language is a TextMate grammar, and not a Tree-Sitter grammar

What the ends up meaning is that all of the syntax highlighting is performed via regex.

So, once I get to line 7, I ran editor:log-cursor-syntax-tree-scope and it spits out

text.html.php
meta.embedded.block.php
source.php
string.quoted.double.sql.php
source.sql.embedded.php
string.quoted.double.sql

When read top-to-bottom, shows that there's actually a hand-off to another language, language-sql on line 5, source.sql.embedded.php.

source.sql in the scope name for language-sql, and is affixed with embedded.php since that is the scope of where it is. embedded instead of source, php because that is the parent-scope the language.

So, the problem isn't that language-php is wrong per se, after all, it has handled off syntax highlighting to language-sql. But the problem is that language-sql doesn't know what to do with a open parenthesis pair.

The first double quote is considered to be php, but then the AND verb changes it to sql. Afterwards without a closing parenthesis - the syntax gets highlighted as being an unfinished SQL string because there are three double quotes technically within scope (while there are 4 double quotes, 1 belongs to php and the other 3 belong to SQL because the scope of the parenthesis hasn't been closed).

Which is why the highlighting works here just fine, and doesn't in your example - because the linked comment is using backtick for its embedded SQL instead of additional double quotes (so there's are two double quotes, scoped to php and the rest of the SQL strings are wrapped with backtick instead).

For succinctness, this is the code:

<?php 
if (isset($value)) {

    foreach($value as $valuex) {
        if ($ii === 0) {
            $sql = $sql . " AND ($key LIKE '" . $valuex . "'";
            #$sql = $sql . " AND ($key LIKE " . $valuex ;
        }
        if ($ii > 0) {
            $sql = $sql . " OR $key LIKE '" . $valuex . "'";
        }
        $ii++;
    }
} else {
    $sql = $sql . " AND ($key='0'";
}

$sql = $sql . ")";
?>
Spiker985 commented 1 year ago

Amendment: The reason the GitHub syntax highlighting works, is due to the fact that GitHub utilizes Tree-Sitter grammars

savetheclocktower commented 11 months ago

I've got a PHP Tree-sitter grammar that's more or less ready, and I keep forgetting to put a PR together. I'll try to get that done sometime this week. I can confirm that this issue is fixed in said grammar.

The main problem in the TextMate grammar is that it tries to be clever and inject the SQL grammar into any string that looks like it could be SQL. I'm not certain that I want to replicate that strategy for the Tree-sitter grammar (for which I haven't yet implemented any special highlighting of SQL queries), but it would be safer to attempt that strategy because injections wouldn't be able to bleed out into the parent grammar like we're seeing in this bug. Tree-sitter would say “SQL grammar, you may inject into this specific range of the buffer” and it would do the best it could at highlighting that area.

confused-Techie commented 6 months ago

@savetheclocktower Considering a php Tree-sitter grammar was added in #852 are we good to close this one?