Closed GoogleCodeExporter closed 8 years ago
Language tags should be easy to recognize and remember.
Since we use class="prettyprint" to identify regions to prettyprint, I suggest the following convention:
class="prettyprint" -- make a best guess as to language
class="prettyprint lang-java" -- do java prettyprinting
The "lang-" prefix is followed by the filename extension commonly used for source files in that language. Using the extension avoids problems such as "C#" not being a valid HTML identifier.
We will use cc for C++ since it is a valid identifier and is more commonly used than cpp or cxx.
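As a rough illustration of the convention, here is a minimal sketch of reading the language tag out of a class attribute. The function name langFromClass is hypothetical and not part of prettify; it only assumes that the extension after "lang-" matches \w+.

```javascript
// Hypothetical helper (not part of prettify): returns the extension named by
// a "lang-" class, or null when no such class is present and the caller
// should fall back to guessing the language.
function langFromClass(className) {
  const classes = className.split(/\s+/).filter(Boolean);
  for (const cls of classes) {
    const match = /^lang-(\w+)$/.exec(cls);
    if (match) {
      return match[1];
    }
  }
  return null;
}

console.log(langFromClass('prettyprint lang-java'));
console.log(langFromClass('prettyprint'));
```

Under this sketch a tag such as "lang-c#" would not match, which is the reason for using filename extensions instead of language names.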
Original comment by mikesamuel@gmail.com
on 15 Aug 2007 at 6:57
To flesh out the high-level design, the prettify loop will be changed to:
(1) Extract tags and store [tag, position-in-string]
(2) Use a regex-based lexer to lex the string sans tags
(3) Run a classifier over tokens
(4) Merge tags back into the token list and join tokens to produce HTML
from the current:
(1) Split into chunks of tags | text
(2) Split text chunks into tokens using a state machine over a character iterator that unescapes entities lazily
(3) Join token list to produce HTML
This will cut out the hand-coded state machines that iterate over characters, replacing them with the regex-based lexers from step (2) of the new loop.
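The four steps of the proposed loop can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual prettify code: all function names, the token regex, the keyword list, and the class names are made up for the example, and entity unescaping is left out (escaped text is passed through unchanged).

```javascript
// Step 1: strip tags, recording [tag, position] where position is the
// offset into the tag-free text at which the tag originally appeared.
function extractTags(html) {
  const tags = [];
  const tagPattern = /<[^>]*>/g;
  let text = '';
  let last = 0;
  let m;
  while ((m = tagPattern.exec(html)) !== null) {
    text += html.slice(last, m.index);
    tags.push([m[0], text.length]);
    last = m.index + m[0].length;
  }
  text += html.slice(last);
  return { text, tags };
}

// Step 2: regex-based lexer. Every character lands in exactly one token,
// so joining the tokens reproduces the input text.
function lex(text) {
  return text.match(/"(?:[^"\\]|\\.)*"|\/\/[^\n]*|\w+|\s+|[^\w\s]/g) || [];
}

// Step 3: classifier mapping a token to a (hypothetical) CSS class.
const KEYWORDS = new Set(['if', 'else', 'for', 'while', 'return']);
function classify(token) {
  if (token.length > 1 && token[0] === '"') return 'str';
  if (token.startsWith('//')) return 'com';
  if (KEYWORDS.has(token)) return 'kwd';
  if (/^[\w\s]/.test(token)) return 'pln';
  return 'pun';
}

function span(cls, s) {
  return '<span class="' + cls + '">' + s + '</span>';
}

// Step 4: merge the tags back in by position and join into html.
// A tag that falls inside a token splits that token's span in two.
function prettifySketch(html) {
  const { text, tags } = extractTags(html);
  let out = '';
  let pos = 0; // offset of the current token in the tag-free text
  let tagIndex = 0;
  for (const token of lex(text)) {
    const cls = classify(token);
    let start = 0;
    while (tagIndex < tags.length && tags[tagIndex][1] < pos + token.length) {
      const cut = tags[tagIndex][1] - pos;
      if (cut > start) out += span(cls, token.slice(start, cut));
      out += tags[tagIndex][0];
      start = cut;
      tagIndex++;
    }
    out += span(cls, token.slice(start));
    pos += token.length;
  }
  // Tags positioned at the very end of the text.
  while (tagIndex < tags.length) out += tags[tagIndex++][0];
  return out;
}

console.log(prettifySketch('if (x) <b>return</b> "a"; // hi'));
```

Because the tags are kept out of the lexer's input, the lexer never has to know about markup, which is what lets a plain regex replace the character-level state machine.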
We can then define a language handler as a { lexer, classifier } pair.
Define a language handler for C-style langs and one for markup langs to keep us backwards compatible.
Modify the main prettify function to look for a lang-\w+ class, and, if present, choose the appropriate lexer.
Implement a lisp/scheme lexer to demonstrate that new handlers can be added, and document it.
Implement other lexers as demanded.
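One possible shape for the handler lookup is sketched below. Everything here is an assumption made for illustration: the registry, the function names, the extension lists, and the heavily simplified lexers are not taken from prettify, and falling back to the C-style handler merely stands in for real language guessing.

```javascript
// Hypothetical registry: "lang-" extension -> { lexer, classifier } pair.
const handlers = new Map();

function registerHandler(extensions, handler) {
  for (const ext of extensions) {
    handlers.set(ext, handler);
  }
}

// Simplified handler for C-style languages.
const cStyleHandler = {
  lexer: (text) =>
    text.match(/\/\/[^\n]*|"(?:[^"\\]|\\.)*"|\w+|\s+|[^\w\s]/g) || [],
  classifier: (token) => {
    if (token.startsWith('//')) return 'com';
    if (token.length > 1 && token[0] === '"') return 'str';
    return /^[\w\s]/.test(token) ? 'pln' : 'pun';
  },
};

// Simplified handler for lisp/scheme: ";" starts a comment and
// parentheses are the only punctuation.
const lispHandler = {
  lexer: (text) =>
    text.match(/;[^\n]*|"(?:[^"\\]|\\.)*"|[()]|\s+|[^\s()";]+|"/g) || [],
  classifier: (token) => {
    if (token[0] === ';') return 'com';
    if (token.length > 1 && token[0] === '"') return 'str';
    return token === '(' || token === ')' ? 'pun' : 'pln';
  },
};

registerHandler(['c', 'cc', 'java', 'js'], cStyleHandler);
registerHandler(['lisp', 'scm'], lispHandler);

// Assumption: unknown or missing languages fall back to the C-style handler.
function handlerFor(lang) {
  return handlers.get(lang) || cStyleHandler;
}

// Runs the chosen lexer and classifier, returning [class, token] pairs.
function classifyTokens(text, lang) {
  const { lexer, classifier } = handlerFor(lang);
  return lexer(text).map((token) => [classifier(token), token]);
}

console.log(classifyTokens('(define x 1) ; note', 'scm'));
```

With this arrangement, adding a language means registering one more pair; the main loop does not change.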
Original comment by mikesamuel@gmail.com
on 15 Aug 2007 at 8:43
Finished rewriting the existing lexers to use PR_createSimpleLexer, which is regexp-based.
Original comment by mikesamuel@gmail.com
on 31 Aug 2007 at 8:49
I realize this would be an entirely different thing, but what about taking advantage of a library of pre-written syntax highlighting rules, like VIM's? The syntax-defining commands aren't that complicated. (Well, they don't seem to be, what do I know?)
Original comment by partda...@gmail.com
on 7 Feb 2008 at 11:34
@r38
Original comment by mikesamuel@gmail.com
on 5 Jul 2008 at 4:04
Original issue reported on code.google.com by
mikesamuel@gmail.com
on 15 Aug 2007 at 6:54