Closed rdipardo closed 1 year ago
SetLineState
in Scintilla allows setting lines past the end to make it simpler to implement lexers. TestLexers
is a bit stricter to help uncover problems but its probably reasonable to allow writing just one past the maximum line. Possibly
@@ -196,6 +196,9 @@ int SCI_METHOD TestDocument::GetLineState(Sci_Position line) const {
}
int SCI_METHOD TestDocument::SetLineState(Sci_Position line, int state) {
+ if (line == static_cast<Sci_Position>(lineLevels.size())) {
+ return 0;
+ }
return lineStates.at(line) = state;
}
or
@@ -123,7 +123,7 @@ void TestDocument::Set(std::string_view sv) {
if (lineStarts.back() != Length()) {
lineStarts.push_back(Length());
}
- lineStates.resize(lineStarts.size());
+ lineStates.resize(lineStarts.size() + 1);
lineLevels.resize(lineStarts.size(), 0x400);
}
Amending the TestDocument module does the trick.
After applying the second option (since they're equivalent and single exit points are ideal), we're left with the following:
Lexing verilog/AllStyles.vh
~/lexilla/test/examples/verilog/AllStyles.vh:66: different styles between \r and \n at 870: 11, 0
~/lexilla/test/examples/verilog/AllStyles.vh:67: different styles between \r and \n at 879: 11, 0
~/lexilla/test/examples/verilog/AllStyles.vh:68: different styles between \r and \n at 888: 11, 0
~/lexilla/test/examples/verilog/AllStyles.vh:69: different styles between \r and \n at 896: 11, 0
~/lexilla/test/examples/verilog/AllStyles.vh:70: different styles between \r and \n at 905: 11, 0
~/lexilla/test/examples/verilog/AllStyles.vh:1: has different styles with \n versus \r\n line ends
Modernizing all cases of StyleContext::atLineEnd
to ::MatchLineEnd()
resolves most of these, leaving a single issue of unclean style switching in the middle of DOS EOLs in the case of SCE_V_STRINGEOL
. Because StyleContext::ForwardSetState()
will only advance the width of a single character, a conditional call to ::Forward()
is needed to span the entire CRLF:
The diagnostic is not very revealing:
Lexing verilog/AllStyles.vh
~/lexilla/test/examples/verilog/AllStyles.vh:73: different styles between \r and \n at 937: 12, 0
~/lexilla/test/examples/verilog/AllStyles.vh:1: has different styles with \n versus \r\n line ends
This was one case where visual inspection was more productive. In LF mode, SCE_V_STRINGEOL
would fill the line; in CRLF mode, it was trimmed to the length of the string + 1 for the severed CR
.
0002-LexerVerilog-Don-t-split-CRLFs-when-terminating-string-EOL-state.patch.txt
After looking into #204, I found that
LexerVerilog::Lex
increments the line counter beyond the end of the document before passing it toLexAccessor::SetLineState
:https://github.com/ScintillaOrg/lexilla/blob/12348921861f02130d77beea9b1666531d29081d/lexers/LexVerilog.cxx#L475-L482
TestLexers
will abort when the lexer tries to access the last line + 1, even if the input is a zero-length buffer:This patch enables testing by adding a range check and modernizing one occurrence ofStyleContext::atLineEnd
to::MatchLineEnd()
:0001-LexerVerilog-Avoid-incrementing-the-current-line-past-the-end.patch.txtThese tests currently fail because of DOS EOL splitting (which global usage of
::MatchLineEnd()
does not resolve), but they should be good enough to probe what's happening:LexerVerilog-prelim-tests.zip