rouge-ruby / rouge

A pure Ruby code highlighter that is compatible with Pygments
https://rouge.jneen.net/

Invalid doctests cause wrong highlighting in complete code fragment #1991

Open niknetniko opened 9 months ago

niknetniko commented 9 months ago

Name of the lexer: python

Code sample: A sample of the code that produces the bug (note the missing closing ' in the return value of the doctest)

def test():
    """
    >>> test()
    'hallo
    """
    return str(5)

Online sample

Version 4.1.3: [image]
Version 4.1.0: [image]

Additional context: Since v4.1.1 (#1932), the Python lexer supports highlighting doctests. However, if a doctest is invalid, the broken highlighting extends beyond the scope of the doctest.

While I would normally think we shouldn't expect great highlighting if the syntax is invalid, I feel like doctests are an exception: the code in question is still valid Python code, as the doctests are comments.
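As a quick check of that claim, here is a minimal sketch using Python's standard `ast` module. It parses the sample from this report and reads the docstring back. The name `SAMPLE` is only an illustration. The point is that the unterminated `'hallo` sits inside a docstring (a string literal), so the Python parser never sees it as code.

```python
import ast

# The code sample from this report, stored as a string.
SAMPLE = '''def test():
    """
    >>> test()
    'hallo
    """
    return str(5)
'''

# Parsing succeeds: the broken doctest is just text inside the docstring.
tree = ast.parse(SAMPLE)
func = tree.body[0]

# The cleaned docstring still contains the unterminated quote.
docstring = ast.get_docstring(func)
print(repr(docstring))
```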

Perhaps we could disable doctest highlighting if an error is found? (No clue if possible in Rouge).

jneen commented 9 months ago

Thanks for the report! I think this was simply implemented incorrectly. This is a use case for recursive highlighting - scan for the docstring first and delegate. This is the proper way to do embedding when the parent language completely controls the extent of the embed.

Here's an example of <script> tags in HTML, where the parent language's </script> ending tag takes precedence over any content of the script itself:

https://github.com/rouge-ruby/rouge/blob/4e475297bb8f094106a4e740eee65b7e51e389b6/lib/rouge/lexers/html.rb#L121
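
To illustrate the idea outside of Rouge, here is a rough sketch in plain Python. It is not Rouge's API and not the actual fix. The token names, `lex`, and `lex_doctest` are all made up for this example, and it only handles `"""` docstrings. The outer scanner decides where the docstring ends first, and the toy doctest sub-lexer only ever sees that slice, so an unterminated quote inside the doctest cannot affect anything after the closing `"""`.

```python
import re

# The parent scanner owns the extent of the docstring: everything
# between a pair of triple double quotes, matched non-greedily.
DOCSTRING_RE = re.compile(r'"""(.*?)"""', re.DOTALL)

def lex_doctest(body):
    """Toy sub-lexer: tag each line of an already-delimited docstring body."""
    tokens = []
    for line in body.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith(">>>"):
            tokens.append(("Doctest.Prompt", line))
        elif stripped:
            tokens.append(("Doctest.Output", line))
        else:
            tokens.append(("Text", line))
    return tokens

def lex(source):
    """Scan for docstrings first, then delegate only their bodies."""
    tokens = []
    pos = 0
    for match in DOCSTRING_RE.finditer(source):
        if match.start() > pos:
            tokens.append(("Code", source[pos:match.start()]))
        tokens.append(("String.Delimiter", '"""'))
        tokens.extend(lex_doctest(match.group(1)))
        tokens.append(("String.Delimiter", '"""'))
        pos = match.end()
    if pos < len(source):
        tokens.append(("Code", source[pos:]))
    return tokens

SAMPLE = '''def test():
    """
    >>> test()
    'hallo
    """
    return str(5)
'''

tokens = lex(SAMPLE)
for kind, text in tokens:
    print(kind, repr(text))
```

With the sample from the report, the `return str(5)` line comes out as ordinary code, because the sub-lexer was never handed anything past the closing delimiter.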