GeSHi / geshi-1.1

Next generation of Generic Syntax Highlighter for PHP
http://qbnz.com/highlighter/

Token-based highlighting #14

Closed ghost closed 4 years ago

ghost commented 4 years ago

Token IDs in element titles could actually match TextMate's scope names. Then Sublime Text and VS Code themes could easily be ported over.

CSS classes are implemented by splitting each slash-delimited token title (e.g. php/php/oodynamic), filtering out any token IDs equal to the language name (php), then adding the remaining IDs as classes.
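The splitting step described above can be sketched roughly as follows. This is only an illustration: `tokenTitleToClasses` is a hypothetical helper name, not an existing GeSHi function, and it assumes the title is a plain slash-delimited string such as php/php/oodynamic.

```php
<?php
// Hypothetical helper, not part of GeSHi: turns a slash-delimited token
// title into a space-separated CSS class list.
function tokenTitleToClasses(string $title, string $language): string
{
    $ids = explode('/', $title);
    // Drop empty segments and any token ID equal to the language name.
    $ids = array_filter($ids, function (string $id) use ($language): bool {
        return $id !== '' && $id !== $language;
    });
    // The remaining IDs become the classes, in their original order.
    return implode(' ', $ids);
}

echo tokenTitleToClasses('php/php/oodynamic', 'php'), "\n"; // oodynamic
```

The result could then be written into the class attribute of the generated span, while the title attribute stays available for debugging.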

Styling would work >90% of the time with an overall stylesheet.

PS: If you are using GeSHi 1.0 and would like modern dark syntax highlighting, I made a nord-geshi stylesheet.

ghost commented 4 years ago

@BenBE @cweiske @reedy CC into this issue

BenBE commented 4 years ago

Regarding the CSS: The overall theming is done via themes (see the geshi/themes folder). Building a dark theme based on that should be straightforward.

The title attribute in the generated HTML is mostly a debugging aid.

Overall: It would be good if we could leverage the existing language files from GeSHi 1.0. Since those follow a fairly simple format, most of them could probably be converted without much effort.

ghost commented 4 years ago

As GeSHi 1.1 gets more languages, people will be obliged to update their custom syntax coloring.

TextMate's theming targets token names and wouldn't require different files for each and every language. It's a quicker way to implement syntax colouring across languages.

The current approach is a very legacy way to implement this kind of theming, IMHO.

Moving towards GeSHi 1.0's theming implementation would also be an unfortunate choice, IMO.

BenBE commented 4 years ago

I'm not quite happy with the current implementation either: too many parts are still written in code instead of (static) configuration data, which would handle things in a portable and future-proof way where only minimal overrides are needed per language.

The language files in GeSHi 1.1 already somewhat encourage common naming of groups, with the option to build some hierarchy out of them.

If I understand your request right, you are suggesting changing the class names in the generated HTML from lang_theme_token to lang theme token, so that CSS could match against them with just .theme.token instead of the full triplet?

ghost commented 4 years ago

Yes. Also combining theme files for all languages into a single PHP file:

Start of themes/default.php, based on C file:

// normal code context

// could format the next 2 differently
$this->setStyle('keyword.control'                      , 'color:#080;'                  );
$this->setStyle('punctuation'                          , 'color:#080;'                  );
$this->setStyle('punctuation.line.continuation'        , 'font-weight:bold;'            );
// declarator-keyword and typeorqualifier tokens appear in the same context so
// identical formatting makes sense
$this->setStyle('storage.type'                         , 'color:#a1a100;'               );
$this->setStyle('support.function'                     , 'color:#b06cc8;'               );
$this->setStyle('support.function.macro'               , 'color:purple;'                );
$this->setStyle('comment.multi'                        , 'color:#888;font-style:italic;');
$this->setStyle('comment.single'                       , 'color:#888;font-style:italic;');
$this->setStyle('constant.numeric.double'              , 'color:#cc66cc;'               );
// a character constant is an int, so identical formatting makes sense
$this->setStyle('constant.numeric.integer'             , 'color:#33f;'                  );
$this->setStyle('character.constant'                   , 'color:#33f;'                  );
// there's probably no advantage in colouring escapes differently here...
$this->setStyle('character.constant.escape'            , 'color:#33f;'                  );
$this->setStyle('character.constant.wide'              , 'color:#21c5f7;'               );
// ...or here...
$this->setStyle('character.constant.escape.wide'       , 'color:#21c5f7;'               );
$this->setStyle('string.quoted'                        , 'color:#cd853f;'               );
// ...but in string literals it's handy
$this->setStyle('character.constant.escape.string'     , 'color:#754b24;'               );
$this->setStyle('string.wide'                          , 'color:#cd661d;'               );
$this->setStyle('character.constant.escape.string.wide', 'color:#754b24;'               );
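
One way dotted token names like these could be turned into stylesheet rules is sketched below. `SketchTheme` and its `toCss()` method are hypothetical names invented for this illustration, and GeSHi's real `setStyle()` may behave differently. The sketch relies on the fact that prefixing a dotted name with a dot gives a compound class selector, so 'keyword.control' becomes .keyword.control, which matches any element carrying both classes.

```php
<?php
// Hypothetical stand-in for a theme object; GeSHi's real setStyle() may
// behave differently.
final class SketchTheme
{
    /** @var array<string, string> dotted token name => CSS declarations */
    private array $styles = [];

    public function setStyle(string $token, string $css): void
    {
        $this->styles[$token] = $css;
    }

    // Emit one rule per token: 'keyword.control' becomes '.keyword.control'.
    public function toCss(string $scope = 'pre'): string
    {
        $rules = [];
        foreach ($this->styles as $token => $css) {
            $rules[] = sprintf('%s .%s { %s }', $scope, $token, $css);
        }
        return implode("\n", $rules);
    }
}

$theme = new SketchTheme();
$theme->setStyle('keyword.control', 'color:#080;');
$theme->setStyle('comment.single', 'color:#888;font-style:italic;');
echo $theme->toCss(), "\n";
```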

Example HTML output, which can easily be styled from one stylesheet with overrides layered on top:

<div id="code c" name="code"><pre>
<span class="keyword control directive">#</span><span class="keyword control directive">include</span> <span class="punctuation definition string lt-gt begin">&lt;</span><span class="string quoted lt-gt include">stdio.h</span><span class="punctuation definition string lt-gt end">&gt;</span>

<span class="storage type">int</span> <span class="entity name function">main</span><span class="punctuation">(</span><span class="punctuation">)</span>
<span class="punctuation">{</span>
    <span class="entity name function">printf</span><span class="punctuation">(</span><span class="punctuation definition string begin">"</span><span class="string quoted">Hello World</span><span class="character constant escape string">\n</span><span class="punctuation definition string end">"</span><span class="punctuation">)</span><span class="punctuation">;</span>
    <span class="keyword control return">return</span> <span class="constant numeric integer">0</span><span class="punctuation">;</span>
<span class="punctuation">}</span>
</pre></div>

Rules don't need to inherit from each other. They override each other purely through CSS specificity and source order.
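
To make the override behaviour concrete, here is a simplified model of how a browser would pick between class-only rules. It is a sketch under stated assumptions: `resolveStyle` is a hypothetical helper, it only handles compound class selectors such as .character.constant, and it treats each rule's declarations as a single unit, whereas real CSS resolves every property separately.

```php
<?php
// Simplified model of the cascade for class-only compound selectors
// (e.g. ".character.constant"). Hypothetical helper for illustration.
function resolveStyle(array $rules, string $classAttr): ?string
{
    $classes = preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY);
    $winner = null;
    $best = 0;
    foreach ($rules as [$selector, $css]) {
        $needed = explode('.', ltrim($selector, '.'));
        // The rule matches only if the element carries every class it names.
        if (array_diff($needed, $classes) !== []) {
            continue;
        }
        // More classes means higher specificity; ">=" lets a later rule
        // win a tie, mirroring source order.
        if (count($needed) >= $best) {
            $best = count($needed);
            $winner = $css;
        }
    }
    return $winner;
}

$rules = [
    ['.character.constant', 'color:#33f;'],
    ['.character.constant.escape.string', 'color:#754b24;'],
    ['.punctuation', 'color:#080;'],
];
echo resolveStyle($rules, 'character constant escape string'), "\n"; // color:#754b24;
echo resolveStyle($rules, 'character constant'), "\n";               // color:#33f;
```

The four-class rule wins for the string escape because it names more classes, while a plain character constant falls back to the two-class rule, with no explicit inheritance between the two.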

zm-cttae-archive commented 1 year ago

FTR, this was closed because shikijs/shiki exists.
But maybe this token splitting -> CSS class technique could be used to create themes in an easier way.