rouge-ruby / rouge

A pure Ruby code highlighter that is compatible with Pygments
https://rouge.jneen.net/
Other
3.34k stars 738 forks source link

lexer: c: Highlight #includes as Comment::PreprocFile #1770

Closed cdown closed 2 years ago

cdown commented 2 years ago

When changing https://chrisdown.name over from Pygments to Rouge, one quite noticeable change was that {% highlight c %} blocks lost the delineation between #include statements and the actual include, with them all now just being stylised as a comment.

This patch brings back Comment::PreprocFile support for C-like languages, as it was with Pygments.

Tested locally by rebuilding chrisdown.name with a gem from this commit, and seeing that the Comment::PreprocFile highlighting reappeared.

cdown commented 2 years ago

Example change as seen on https://chrisdown.name/2020/01/13/1195725856-and-friends-the-origins-of-mysterious-numbers.html

Pygments:

2021-12-01_135426_599100737

Rouge HEAD:

2021-12-01_135541_105729703

Rouge after this PR:

2021-12-01_135605_112637722

cdown commented 2 years ago

Hey folks, just checking if you have any thoughts on this? Right now I'm running a patched build of Rouge, but especially since you folks just did a release it would be great to get back on mainline if possible :-)

Tagging some folks from PRs I see reviewed recently in the hope that someone has an opinion, sorry to bother. @tancnle @dblessing

tancnle commented 2 years ago

@cdown Thank you for your contribution ❤️ The change looks good overall. I have a look at the current support from Pygments and wonder if we could reuse the vetted pattern. Pygments seems to support the following patterns.

#include <string.h> /* this is a comment */
#include <string.h> // this is a comment

# include <string.h> 
#/* this is a comment */ include <string.h> 
#include  /* this is a comment */  <string.h> 

For this PR, I would suggest at least supporting the top 2 with trailing comments. The change in this PR will display comments as errors so we might need to adjust it slighly.

Screen Shot 2021-12-27 at 3 12 41 pm

Implementation wise, I'd suggest

  1. Use include mixin in the macro state.
diff --git lib/rouge/lexers/c.rb lib/rouge/lexers/c.rb
index 4df87a4f..ad6624a2 100644
--- lib/rouge/lexers/c.rb
+++ lib/rouge/lexers/c.rb
@@ -66,7 +66,6 @@ module Rouge
         mixin :inline_whitespace

         rule %r/#if\s0/, Comment, :if_0
-        rule %r/#include\s*/, Comment::Preproc, :include
         rule %r/#/, Comment::Preproc, :macro

         rule(//) { pop! }
@@ -171,18 +170,22 @@ module Rouge
       end

       state :macro do
-        # NB: pop! goes back to :bol
-        rule %r/\n/, Comment::Preproc, :pop!
+        mixin :include
         rule %r([^/\n\\]+), Comment::Preproc
         rule %r/\\./m, Comment::Preproc
         mixin :inline_whitespace
         rule %r(/), Comment::Preproc
+        # NB: pop! goes back to :bol
+        rule %r/\n/, Comment::Preproc, :pop!
       end

       state :include do
-        rule %r/\n/, Comment::PreprocFile, :pop!
-        rule %r/<[^\n>]+>/, Comment::PreprocFile
-        rule %r/"[^\n"]+"/, Comment::PreprocFile
+        rule %r/(include)(\s*)(<[^>]+>)([^\n]*)/ do
+          groups Comment::Preproc, Text, Comment::PreprocFile, Comment::Single
+        end
+        rule %r/(include)(\s*)("[^"]+")([^\n]*)/ do
+          groups Comment::Preproc, Text, Comment::PreprocFile, Comment::Single
+        end
       end

       state :if_0 do
  1. Add examples with trailing comments in spec/visual/samples/c
    #include <string.h> /* this is a comment */
    #include <string.h> // this is a comment

I am not a C programmer and would appreciate any feedback 🙏🏼

cdown commented 2 years ago

@tancnle Thank you for the kind review and detailed suggestions! I've made the requested changes, and have also added and visually verified a couple of other valid parsing cases to spec/visual/samples/c.

tancnle commented 2 years ago

Thank you for your prompt response @cdown 👍🏼

cdown commented 2 years ago

You too!