ScintillaOrg / lexilla

A library of language lexers for use with Scintilla
https://www.scintilla.org/Lexilla.html
Other
179 stars 64 forks source link

R escape sequences don't terminate properly #206

Closed kugel- closed 11 months ago

kugel- commented 1 year ago

I upgraded Lexilla for Geany to version 5.2.6 and found a problem with the new espace sequence support for the R programming language. They don't seem to terminate properly, or rather the enclosing string doesn't terminate.

See 1

If I add a char after the espace sequence the colors it looks better (but of course the code has different semantics), see

2

nyamatongwe commented 1 year ago

Issues should include examples as text so that they can be more easily reproduced.

rdipardo commented 1 year ago

The end of the escape sequence is determined only when the closing quote is passed to EscapeSequence::atEscapeEnd. The position is then incremented beyond the closing quote, so the state remains SCE_R_STRING until the next string delimiter is encountered. Possibly some kind of look-behind condition would suffice, like this:

--- a/lexers/LexR.cxx
+++ b/lexers/LexR.cxx
@@ -187,6 +187,8 @@ void ColouriseRDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, W
                || (sc.state == SCE_R_STRING2 && sc.ch == '\'')
                || (sc.state == SCE_R_BACKTICKS && sc.ch == '`')) {
                sc.ForwardSetState(SCE_R_DEFAULT);
+           } else if (sc.MatchLineEnd()
+               && escapeSeq.atEscapeEnd(sc.chPrev)
+               && escapeSeq.outerState == sc.state) {
+               sc.SetState(SCE_R_DEFAULT);
            }
            break;
rdipardo commented 1 year ago

Here's a testable patch:

GH-206-Terminate-Esc-Seq-at-EOL.zip

The *.properties config was modified to avoid breaking an existing test.

kugel- commented 1 year ago

Seems to work, thanks

grafik

@nyamatongwe How do you normally reproduce/validate these things? I found scite doesn't support R unless I did something wrong?

nyamatongwe commented 1 year ago

I found scite doesn't support R

Remove r from imports.exclude.

kugel- commented 1 year ago

Alright, thanks. So just for completeness, the issue is also reproducible with scite

grafik

Here's the text for easier copy&paste:

m <- "foo"
m <- "fo\n"
m <- "fo\\"
m <- r"(bar\\x\x)"
m <- `echo hi\{foo\}`
zufuliu commented 1 year ago

The failure is caused by sc.Forward() after continue: https://github.com/ScintillaOrg/lexilla/blob/ecacef4ccae75bdc0c123f8996a8ad4dbe30dd9e/lexers/LexR.cxx#L193-L201

Change the for loop to while loop (as Bash lexer did) would be the simplest fix, test case:

"\n"
"\r\n"
diff --git a/lexers/LexR.cxx b/lexers/LexR.cxx
index 7d8638d5..b3319760 100644
--- a/lexers/LexR.cxx
+++ b/lexers/LexR.cxx
@@ -124,7 +124,7 @@ void ColouriseRDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, W
        dashCount = lineState >> 8;
    }

-   for (; sc.More(); sc.Forward()) {
+   while (sc.More()) {
        // Determine if the current state should terminate.
        switch (sc.state) {
        case SCE_R_OPERATOR:
@@ -272,6 +272,7 @@ void ColouriseRDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, W
            const int lineState = matchingDelimiter | (dashCount << 8);
            styler.SetLineState(sc.currentLine, lineState);
        }
+       sc.Forward();
    }
    sc.Complete();
 }

LexR-0923.patch

rdipardo commented 1 year ago

Change the for loop to while loop (as Bash lexer did) . . .

Placing the Bash lexer in rather good company.

$ git grep -lE "while\s*\(sc.More\(\)\)" -- lexers
lexers/LexAda.cxx
lexers/LexAsciidoc.cxx
lexers/LexBash.cxx
lexers/LexCaml.cxx
lexers/LexFSharp.cxx
lexers/LexGui4Cli.cxx
lexers/LexHaskell.cxx
lexers/LexHex.cxx
lexers/LexMarkdown.cxx
lexers/LexR.cxx
lexers/LexRaku.cxx
lexers/LexSML.cxx
lexers/LexSmalltalk.cxx
lexers/LexSpice.cxx
lexers/LexTADS3.cxx
lexers/LexTxt2tags.cxx
lexers/LexVHDL.cxx
kugel- commented 1 year ago

@zufuliu can you make PR with the fix?

zufuliu commented 1 year ago

@zufuliu can you make PR with the fix?

not sure how it make sense (as most PR won't be merged), but the draft PR is #207.

nyamatongwe commented 1 year ago

can you make PR with the fix?

I haven't managed to work out how to integrate PR application into sensible workflows so prefer .patch files although PRs can be converted into .patch files. Its almost always necessary to add an item to doc/LexillaHistory.html over the patch and update the commit message.

kugel- commented 1 year ago

FTR, I have this in my .gitconfig:

[alias]
    get-pr = !bash -c 'git fetch origin pull/$1/head && git checkout -B pr-$1 FETCH_HEAD' -`

This checks out the PR to a new branch with e.g. git get-pr 206.

Essentially, github maps all PRs automatically in the pull namespace, so you can always get it locally. You could also merge a PR for testing in your current local master by doing git fetch origin pull/206/head && git merge FETCH_HEAD (perhaps merging with --squash if you prefer).

zufuliu commented 1 year ago

This checks out the PR to a new branch

here is better tool for this https://cli.github.com/ https://cli.github.com/manual/gh_pr_checkout