ScintillaOrg / lexilla

A library of language lexers for use with Scintilla
https://www.scintilla.org/Lexilla.html

Convert the Dart lexer to object lexer #270

Closed techee closed 1 month ago

techee commented 2 months ago

This PR converts the Dart lexer to an object lexer. I used the VB lexer as the base for this conversion. I'm aware of two things that might need improvement:

  1. I wasn't sure what to put inside the lexer class and what to leave outside. I more or less followed the VB lexer: small auxiliary functions sit outside the class in the unnamed namespace, and the 2 bigger functions sit inside the class.
  2. In the lexicalClasses[] definitions I wasn't sure exactly what the descriptions in the 3rd and 4th columns should look like. I looked at other lexers, but I'm not completely sure about the exact rules for the names, so this may need some improvement.

Please let me know what needs to be changed.

nyamatongwe commented 2 months ago

The cpp lexer is the best starting point for LexicalClass tags. Using the same tags as other lexers allows applications to choose basic styles for those classes or to search within styles that include comment, for example. The tags are ordered to produce a rough hierarchy.

Commonly, languages have only a single list of genuine keywords (which may be treated specially), and the other word lists are sets of identifiers to display differently, such as APIs. Therefore, additional word list styles use the "identifier" tag.

Escape sequences are parts of literal strings so should follow cpp:

    27, "SCE_C_ESCAPESEQUENCE", "literal string escapesequence", "Escape sequence",

SCE_DART_NUMBER appears similar to numbers in cpp:

    4, "SCE_C_NUMBER", "literal numeric", "Number",
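As an illustration only, a table following this advice might look like the sketch below. The `LexicalClass` struct here is a local stand-in mirroring the four columns (value, name, tags, description), and the Dart style names and numbers are hypothetical, not the ones in the PR. `HasTag` and `StylesWithTag` are made-up helpers showing how an application could search for every style tagged "comment".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Local stand-in for the four-column lexical class entry:
// style value, style name, space-separated tags, description.
struct LexicalClass {
	int value;
	const char *name;
	const char *tags;
	const char *description;
};

// Hypothetical style numbers and names, for illustration only.
const LexicalClass lexicalClasses[] = {
	{0, "SCE_DART_DEFAULT", "default", "White space"},
	{1, "SCE_DART_COMMENTLINE", "comment line", "Line comment"},
	{2, "SCE_DART_COMMENTBLOCK", "comment", "Block comment"},
	{3, "SCE_DART_NUMBER", "literal numeric", "Number"},
	{4, "SCE_DART_STRING", "literal string", "String"},
	{5, "SCE_DART_ESCAPECHAR", "literal string escapesequence", "Escape sequence"},
	{6, "SCE_DART_KEYWORD", "keyword", "Keyword"},
	// Extra word lists are identifiers shown differently, so they use "identifier".
	{7, "SCE_DART_WORD2", "identifier", "Secondary word list"},
};

// True when the space-separated tag list contains `tag` as a whole word.
bool HasTag(const char *tags, const char *tag) {
	assert(tags && tag);
	const std::string list = std::string(" ") + tags + " ";
	const std::string needle = std::string(" ") + tag + " ";
	return list.find(needle) != std::string::npos;
}

// Collect the style values whose tags include `tag`.
std::vector<int> StylesWithTag(const char *tag) {
	std::vector<int> styles;
	for (const LexicalClass &lc : lexicalClasses) {
		if (HasTag(lc.tags, tag)) {
			styles.push_back(lc.value);
		}
	}
	return styles;
}
```

With shared tags, `StylesWithTag("comment")` picks up both comment styles without the application knowing anything Dart-specific.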

Interpolated states haven't received much attention, with the perl lexer probably the most developed. While interpolated elements inside strings may be seen as part of the string, they can also be seen as starting a new nested mode.

zufuliu commented 2 months ago

BacktrackToStart() can be replaced with LexCPP's interpolatingAtEol to avoid backtracking.

techee commented 2 months ago

@nyamatongwe I finally got back to this patch. I've updated the tag names as suggested and also had a look at CPP, Perl, and Python lexers and modified the names to be similar. Please let me know if more changes are needed.

techee commented 2 months ago

> BacktrackToStart() can be replaced with LexCPP's interpolatingAtEol to avoid backtracking.

I didn't want to spend much time understanding what's going on so I just left it as it is (if it's not a problem).

I think the backtracking won't be a big problem. It's only needed to get the initial state, so it happens just once at the start, and I don't expect many real-world giant interpolated strings where the code would have to backtrack too far.
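For comparison, the alternative to backtracking is to remember the interpolation nesting at the end of each line, so lexing that starts on a later line can restore it directly. The sketch below shows only that general idea under stated assumptions: it is not LexCPP's actual code, and `InterpolationLevel`, `EolStates`, and their members are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical record of one open interpolation: the string style to return to
// and how many '{' are still unclosed inside "${ ... }".
struct InterpolationLevel {
	int stringStyle;
	int braceCount;
};

using InterpolationStack = std::vector<InterpolationLevel>;

// Nesting stacks keyed by line number, stored only for lines
// that end inside at least one interpolation.
class EolStates {
	std::map<int, InterpolationStack> atEol;
public:
	// Record the nesting in effect at the end of `line`.
	void Set(int line, const InterpolationStack &stack) {
		if (stack.empty()) {
			atEol.erase(line);
		} else {
			atEol[line] = stack;
		}
	}
	// Nesting in effect at the start of `line`, i.e. at the end of the previous line.
	InterpolationStack AtStartOf(int line) const {
		const auto it = atEol.find(line - 1);
		return it == atEol.end() ? InterpolationStack{} : it->second;
	}
	// Drop entries for `line` and later before re-lexing from `line`.
	void Truncate(int line) {
		atEol.erase(atEol.lower_bound(line), atEol.end());
	}
};
```

The trade-off is extra per-document state in exchange for a constant-cost lookup at the start of lexing; backtracking keeps the lexer stateless but has to rescan to the start of the enclosing string.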

nyamatongwe commented 2 months ago

The tag changes are likely OK.

I can't retrieve 3d91fb2b161f66e2f783479a8b6634851b4dbe8c with gh pr checkout 270 which makes it difficult to merge.

I was just confused.

nyamatongwe commented 2 months ago

Committed with 75ae9070e93c6476e17bf461855753b6559ca544.