Closed nyamatongwe closed 3 months ago
I'm using https://github.com/zufuliu/notepad4/blob/main/scintilla/lexlib/WordList.h#L17
enum KeywordAttr {
	KeywordAttr_Default = 0,
	KeywordAttr_MakeLower = 1,
	KeywordAttr_PreSorted = 2,
};
implemented at https://github.com/zufuliu/notepad4/blob/main/scintilla/lexlib/WordList.cxx#L127 and https://github.com/zufuliu/notepad4/blob/main/scintilla/lexlib/WordList.cxx#L139
Message::SetKeyWords can be changed to use the lower 8 bits (or 16 bits) of wParam to store the index and the remaining bits to store attributes.
https://github.com/zufuliu/notepad4/blob/main/scintilla/src/ScintillaBase.cxx#L1120
case Message::SetKeyWords:
	DocumentLexState()->SetWordList(wParam & 0xff,
		static_cast<int>(wParam >> 8), ConstCharPtrFromSPtr(lParam));
	break;
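For illustration, the bit layout described here could be sketched from the application side as follows. PackKeywordParam, KeywordIndex and KeywordAttrs are hypothetical helper names invented for this sketch and are not part of Scintilla or notepad4; only the 8-bit split and the enum values come from the snippets above.

```cpp
#include <cassert>
#include <cstdint>

// Attribute flags, mirroring the enum quoted earlier in this thread.
enum KeywordAttr {
	KeywordAttr_Default = 0,
	KeywordAttr_MakeLower = 1,
	KeywordAttr_PreSorted = 2,
};

// Hypothetical helper (an assumption, not a Scintilla API): combine a
// keyword-set index and attribute flags into a single wParam-style value.
inline std::uintptr_t PackKeywordParam(unsigned index, unsigned attrs) noexcept {
	assert(index <= 0xffU);  // the index must fit in the low 8 bits
	return (static_cast<std::uintptr_t>(attrs) << 8) | index;
}

// Decode the two halves the same way the handler above does.
inline unsigned KeywordIndex(std::uintptr_t wParam) noexcept {
	return static_cast<unsigned>(wParam & 0xff);
}

inline int KeywordAttrs(std::uintptr_t wParam) noexcept {
	return static_cast<int>(wParam >> 8);
}
```

Under this layout, a caller that passes a plain index below 256 decodes to KeywordAttr_Default, so existing callers would keep their current behaviour.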
By specifying case correction in the API, applications need to know which lexers and keyword lists are case-insensitive. It's better to have lexers fix the case since they know when that is needed.
It's better to have lexers fix the case since they know when that is needed.
This only works for newer lexers (those directly inherited from DefaultLexer or ILexer5).
@zufuliu This only works for newer lexers
Older lexers could be provided with a WordList::ConvertToLowerCase to convert an existing WordList to lower case (and avoid doing this for each lex with a member variable).
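A rough sketch of that idea, assuming a simplified stand-in class rather than Scintilla's actual WordList: SimpleWordList and its members are invented here for illustration, and only the ConvertToLowerCase name and the "remember it in a member variable" idea come from the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a keyword list; NOT Scintilla's WordList.
class SimpleWordList {
	std::vector<std::string> words;  // kept sorted for binary search
	bool lowered = false;            // remembers that conversion was already done
public:
	void Set(std::vector<std::string> list) {
		words = std::move(list);
		std::sort(words.begin(), words.end());
		lowered = false;  // a new list has not been converted yet
	}

	// Lower-cases every word once; returns true only when work was done,
	// so calling it at the start of each lex is cheap after the first call.
	bool ConvertToLowerCase() {
		if (lowered) {
			return false;
		}
		for (std::string &word : words) {
			for (char &ch : word) {
				ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
			}
		}
		std::sort(words.begin(), words.end());  // lower-casing can change the order
		lowered = true;
		return true;
	}

	bool InList(const std::string &word) const {
		assert(std::is_sorted(words.begin(), words.end()));
		return std::binary_search(words.begin(), words.end(), word);
	}
};
```

The flag is reset in Set so that a list replaced by the application is converted again on the next lex.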
However, the most commonly used lexers are object lexers. Case standardization can be implemented incrementally in each lexer over time.
OK, most old-style lexers have not been maintained or updated in recent years.
Library support for this has been committed, along with its use in the HTML lexer. The commit also includes test cases for HTML.
Some languages are case-insensitive, treating keywords like 'if' and 'IF' as equivalent. This is implemented by lower-casing words found in the document and checking if they are in a keyword list. This is faster than performing a case-insensitive search in a keyword list.
Keyword lists are often defined or modified by a user and it is easy to incorrectly add an upper-case keyword. These will never match a lower-cased word so will not highlight correctly.
This problem could be avoided by lower-casing case-insensitive keyword lists. This would be implemented in variants of WordList::Set and SubStyles::SetIdentifiers or with an optional argument to the existing methods.
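The failure and the proposed fix can be sketched with a simplified stand-in for a keyword list. KeywordSet, IsKeyword and the lowerCase parameter are hypothetical names used only for this sketch; the real signatures of WordList::Set and SubStyles::SetIdentifiers are not reproduced here.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Simplified stand-in for a keyword list; NOT Scintilla's WordList.
class KeywordSet {
	std::set<std::string> words;
public:
	static std::string ToLower(std::string s) {
		for (char &ch : s) {
			ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
		}
		return s;
	}

	// lowerCase plays the role of the optional argument suggested above.
	void Set(const std::string &list, bool lowerCase = false) {
		words.clear();
		std::istringstream in(list);
		std::string word;
		while (in >> word) {
			words.insert(lowerCase ? ToLower(word) : word);
		}
	}

	bool InList(const std::string &word) const {
		return words.count(word) != 0;
	}
};

// A case-insensitive lexer lower-cases the document word, then does an
// exact (case-sensitive) lookup.
inline bool IsKeyword(const KeywordSet &keywords, const std::string &documentWord) {
	return keywords.InList(KeywordSet::ToLower(documentWord));
}

inline void Example() {
	KeywordSet asTyped;
	asTyped.Set("if THEN else");             // user added an upper-case keyword
	assert(IsKeyword(asTyped, "IF"));        // "if" is in the list
	assert(!IsKeyword(asTyped, "then"));     // "THEN" can never match "then"

	KeywordSet lowered;
	lowered.Set("if THEN else", true);       // list is lower-cased when set
	assert(IsKeyword(lowered, "Then"));      // now highlights as expected
}
```

Lower-casing once when the list is set keeps the per-word lookup an exact match, which is the speed advantage described above.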