ScintillaOrg / lexilla

A library of language lexers for use with Scintilla
https://www.scintilla.org/Lexilla.html
Other
163 stars 59 forks source link

prevent negative code folding level in lexer #224

Open zufuliu opened 5 months ago

zufuliu commented 5 months ago

See https://github.com/zufuliu/notepad2/issues/745, when braces or (code folding word pairs) become unbalanced, code folding may totally messed up after the line, which looks ugly.

to fix this (as some kind of error recover), most lexer requires only a single line change before setting folding level (e.g. after atEOL):

levelNext = std::max(levelNext, SC_FOLDLEVELBASE);
// or
levelCurrent = std::max(levelCurrent, SC_FOLDLEVELBASE);
zufuliu commented 5 months ago

Complex fix may like Ruby lexer that guards each decrement.

rdipardo commented 5 months ago

Complex fix may like Ruby lexer that guards each decrement.

Folded groups of line comments may be a challenge, too (cf. https://github.com/ScintillaOrg/lexilla/issues/56#issuecomment-1023767431):

lexbash-comm-appen
zufuliu commented 5 months ago

Folded groups of line comments may be a challenge,

it works in Notepad2 (where code folding code is much simpler):

Animation

zufuliu commented 5 months ago

it works in Notepad2

Due to Notepad2 always move back one line before call fold method, https://github.com/zufuliu/notepad2/blob/main/scintilla/lexlib/LexerBase.cxx#L83

add following block make SciTE works too:

void SCI_METHOD LexerBash::Fold(Sci_PositionU startPos_, Sci_Position length, int initStyle, IDocument *pAccess) {
    if(!options.fold)
        return;

{
    Sci_Position lineCurrent = pAccess->LineFromPosition(startPos_);
    // Move back one line in case deletion wrecked current line fold state
    if (lineCurrent != 0) {
        lineCurrent--;
        const Sci_Position newStartPos = pAccess->LineStart(lineCurrent);
        length += startPos_ - newStartPos;
        startPos_ = newStartPos;
        initStyle = 0;
        if (startPos_ != 0) {
            initStyle = pAccess->StyleAt(startPos_ - 1);
        }
    }

}

    LexAccessor styler(pAccess);
zufuliu commented 5 months ago

add following block make SciTE works too

code come from LexerModule::Fold() method, https://github.com/ScintillaOrg/lexilla/blob/master/lexlib/LexerModule.cxx#L107

nyamatongwe commented 5 months ago

when braces or (code folding word pairs) become unbalanced, code folding may totally messed up after the line, which looks ugly

Unbalanced braces are generally incorrect code and should be fixed. While the current 'messed up ugly' display doesn't strongly indicate the problem, hiding that anything is wrong seems counterproductive to improving the code.

zufuliu commented 5 months ago

hiding that anything is wrong seems counterproductive to improving the code.

It does not hide bug in the code, just make code after some portion (e.g. after error recovered) looks good. image

zufuliu commented 5 months ago

add following block make SciTE works too:

patch added "Move back one line" block (don't known better fix), and convert if-else to switch.

Bash-Fold-0112.patch

consecutive line comment folding block:

if (!IsCommentLine(lineCurrent - 1, styler)
    && IsCommentLine(lineCurrent + 1, styler))
    levelCurrent++;
else if (IsCommentLine(lineCurrent - 1, styler)
         && !IsCommentLine(lineCurrent + 1, styler))
    levelCurrent--;

can be simplified as (not in the patch)

levelCurrent += IsCommentLine(lineCurrent + 1, styler) - IsCommentLine(lineCurrent - 1, styler);
zufuliu commented 5 months ago

Simplified backtrack based on LexerPerl::Fold(). Bash-Fold-0113.patch

LexerBaan::Fold() may contains a bug, initStyle is not updated after backtrack.

zufuliu commented 5 months ago

Correct initStyle after backtrack. Bash-Fold-0113-2.patch

nyamatongwe commented 5 months ago

It does not hide bug in the code, just make code after some portion (e.g. after error recovered) looks good.

It would be better to strongly indicate where folding goes negative, with a new visual symbol in the folding margin, to guide the user to fixing the error.

It's not obvious what language zufuliu/notepad2#745 is in.

zufuliu commented 5 months ago

It's not obvious what language zufuliu/notepad2#745 is in.

Most likely JavaScript, but it applies to other lexer with brace or word based code folding. Here is common scenario that will make braces unbalanced on typing: add } else if () { to existing attached style of if-else chain:

function f1() {
    if () {
        // code
    } else {
        // code
    }
}

function f2() {
}
nyamatongwe commented 5 months ago

Committed the change to the bash lexer. In two parts as the conversion from if to switch does not change behaviour and it was a bigger change. This separation of behavioural changes from 'tidyings' appears to be gaining popularity. https://henrikwarne.com/2024/01/10/tidy-first/ With Kent Beck behind it, I expect this to become a fad like 'extreme programming' with conferences and automated tools.

Most likely JavaScript

It doesn't look like the cpp lexer as it folds over a group of line comments which is a feature sensitive to changes in the group with the resulting need to move folding back. The cpp lexer supports folding stream comments and explicit marker comments but not groups of line comments.

Issues should contain enough information to be reproducible.

rdipardo commented 5 months ago

This separation of behavioural changes from 'tidyings' appears to be gaining popularity.

From the book review it sounds more like the self-help I've seen dispensed to sufferers of writer's block; namely, always leave some trivial task unfinished so you start feeling productive as soon as you return to complete it.

It doesn't look like the cpp lexer

I think it really is "[m]ost likely" Notepad2's dedicated JavaScript lexer, which folds line comments forward and back: https://github.com/zufuliu/notepad2/blob/b71af2f18599562dc4767324c5c8e030c3039d8b/scintilla/lexers/LexJavaScript.cxx#L680-L681

zufuliu commented 5 months ago

Issues should contain enough information to be reproducible.

SciTE with fold.symbols=3, adding } else if:

void f1() {
    if () {
        // code 
    } else if
    } else {
        // code
    }
}

void f2() {
}

void f2() {
}

image

zufuliu commented 5 months ago

The ugly display bug in Notepad2 is my failure (after change GetFoldParent() to treat line with FoldLevel::Base has no parent, https://sourceforge.net/p/scintilla/feature-requests/1444/).

Bug version Notepad2 (negative folding level with GetFoldParent() change): Notepad2-Bug

Old version Notepad2 (negative folding level without GetFoldParent() change): Notepad2-Old

SciTE current (negative folding level), same as old Notepad2: SciTE-Current

Notepad2 current (non-negative folding level with GetFoldParent() change), looks better than SciTE current: Notepad2-Current

nyamatongwe commented 5 months ago

There are two paths here: keeping negative levels and discarding negative levels. I see advantage to keeping and you see advantage to discarding. The currently proposed discarding approach requires changes to each lexer then understanding and cooperation from lexer authors and maintainers.

An alternative implementation would move the discard choice to lexer-independent code as a mode that could be selected.

With discard-negative-fold mode, for negative lines:

The delta would be used for some calls but there would also be access to the raw fold levels. The delta could be recomputed during operations, such as the margin drawing code if it finds negative levels. It could also be stored as a simple sorted [(line start, delta)] vector that is updated when fold levels are set. The vector is likely to be very short with often just a single, quickly fixed, unbalanced structure.

zufuliu commented 5 months ago

I'm sorry, but I don't get how the delta approach works, just feel change lexer would reduce document changes caused by Document::SetLevel().

Also TCL lexer has strange SetLevel(): https://github.com/ScintillaOrg/lexilla/blob/16f8011e58aca6dcff0a4579d63017e4217d2823/lexers/LexTCL.cxx#L205-L210 image

nyamatongwe commented 5 months ago

TCL lexer has strange SetLevel()

Its tracking a boolean comment 'level' as bit mask 0b1 separably from fold level bit mask 0b111111111110.

nyamatongwe commented 5 months ago

It may be simpler to implement the deltas-avoiding-negative-levels approach by storing each relatively negative line rather than when there is an increase from a negative level as that requires no access to later lines. So for

}
}
{
   a;

There would be accumulated deltas [0:1, 1:2] so the{ line (line 2) would have a stored level of -2 and an interpreted level of 0. The a; line would have a stored level of -1 and interpreted level of 1. This shows in the margin with a + and | for the last two lines and an unbalanced indicator (maybe a }) on the first two lines.

Possible negative fold indicator: NegativeFoldIndicator

An issue with any special handling of negative fold levels is that many languages have large structures over almost all code boosting the base fold level: C++ has namespaces and header guards; HTML has html and body. Other languages put code within module declarations or nest methods within classes. These mean that actual negative levels are less common and the approach proposed by this issue less helpful.

zufuliu commented 5 months ago

the approach proposed by this issue less helpful.

Yes, I just see this. it only helps languages with top level structure (e.g. C functions).

Small change to fix potential truly negative number for SetLevel() for Ruby () and TCL lexers:

diff --git a/lexers/LexRuby.cxx b/lexers/LexRuby.cxx
index 8ac792d1..cf08e221 100644
--- a/lexers/LexRuby.cxx
+++ b/lexers/LexRuby.cxx
@@ -1985,12 +1985,12 @@ void FoldRbDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, WordL
             }
         }
         if (atEOL || (i == endPos - 1)) {
-            int lev = levelPrev;
+            int lev = levelPrev + SC_FOLDLEVELBASE;
             if (visibleChars == 0 && foldCompact)
                 lev |= SC_FOLDLEVELWHITEFLAG;
             if ((levelCurrent > levelPrev) && (visibleChars > 0))
                 lev |= SC_FOLDLEVELHEADERFLAG;
-            styler.SetLevel(lineCurrent, lev|SC_FOLDLEVELBASE);
+            styler.SetLevel(lineCurrent, lev);
             lineCurrent++;
             levelPrev = levelCurrent;
             visibleChars = 0;
diff --git a/lexers/LexTCL.cxx b/lexers/LexTCL.cxx
index 5e34aea3..da64df57 100644
--- a/lexers/LexTCL.cxx
+++ b/lexers/LexTCL.cxx
@@ -315,7 +315,9 @@ next:
                case '}':
                    sc.SetState(SCE_TCL_OPERATOR);
                    expected = true;
-                   --currentLevel;
+                   if (currentLevel > 0) {
+                       --currentLevel;
+                   }
                    break;
                case '[':
                    expected = true;

TCL-Ruby-Fold-0121.patch

image

zufuliu commented 5 months ago

full remove power-of-two assumption for SC_FOLDLEVELBASE from Ruby lexer:

@@ -1841,9 +1841,10 @@ void FoldRbDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, WordL
     const Sci_PositionU endPos = startPos + length;
     int visibleChars = 0;
     Sci_Position lineCurrent = styler.GetLine(startPos);
-    int levelPrev = startPos == 0 ? 0 : (styler.LevelAt(lineCurrent)
-                                         & SC_FOLDLEVELNUMBERMASK
-                                         & ~SC_FOLDLEVELBASE);
+    int levelPrev = 0;
+    if (startPos > 0) {
+        levelPrev = (styler.LevelAt(lineCurrent) & SC_FOLDLEVELNUMBERMASK) - SC_FOLDLEVELBASE;
+    }
     int levelCurrent = levelPrev;
     char chPrev = '\0';
     char chNext = styler[startPos];

TCL-Ruby-Fold-0121-2.patch

nyamatongwe commented 5 months ago

A Scintilla patch to show unbalanced folds. The patch hard-codes the appearance but a complete implementation would allow choosing different markers.

diff -r 20064c7d69d1 src/MarginView.cxx
--- a/src/MarginView.cxx    Thu Jan 18 12:41:55 2024 +1100
+++ b/src/MarginView.cxx    Mon Jan 22 09:51:33 2024 +1100
@@ -323,6 +323,7 @@

        bool headWithTail = false;
        bool isExpanded = false;
+       bool isUnbalanced = false;

        if (marginStyle.ShowsFolding()) {
            // Decide which fold indicator should be displayed
@@ -359,6 +360,7 @@
                    needWhiteClosure = LevelIsWhitespace(levelNext);
                }
            }
+           isUnbalanced = (levelNum > levelNextNum) && (levelNextNum < FoldLevel::Base);
        }

        const PRectangle rcMarker(
@@ -456,6 +458,15 @@
            }
        }

+       if (isUnbalanced) {
+           const XYPOSITION widthBrace = surface->WidthText(vs.styles[StyleBraceBad].font.get(), "}");
+           surface->AlphaRectangle(rcMarker, 3.0, FillStroke(ColourRGBA(0xFF, 0x00, 0x00, 0x20), ColourRGBA(0xFF, 0x00, 0x00, 0xFF)));
+           PRectangle rcBrace = rcMarker;
+           rcBrace.Move(rcMarker.Width()/2.0 - widthBrace/2.0, 0);
+           surface->DrawTextTransparent(rcBrace, vs.styles[StyleBraceBad].font.get(),
+               rcBrace.top + vs.maxAscent-1, "}", ColourRGBA(0xFF000000));
+       }
+
        visibleLine++;
        yposScreen += vs.lineHeight;
    }

IndicateUnbalancedFold.patch

zufuliu commented 5 months ago

A new flag (e.g. val SC_FOLDLEVELNEGATIVEFLAG=0x4000) can be added, when lexer discarding negative levels, it set it.

isUnbalanced = FlagSet(leve, FoldLevel::Negative) || ((levelNum > levelNextNum) && (levelNextNum < FoldLevel::Base));
nyamatongwe commented 5 months ago

Reporting the negative levels (or deepening of negative level) through a flag would help both the margin view code and applications but I do not think this should be coded into lexers. It's a generic feature that can be implemented in Scintilla.

It was a mistake that SC_FOLDLEVELHEADERFLAG is set by each lexer instead of being set automatically based on levels. Unfortunately there may be issues with changing this, there are some special cases, and it requires knowing the level of the next line before the flag can be set.

zufuliu commented 5 months ago

Sorry, I still not get how your delta approach works. For Notepad2, as most lexers are written by myself, change lexer is the quickest fix.