asciidoctor / asciidoctor-pdf

Asciidoctor PDF: A native PDF converter for AsciiDoc based on Asciidoctor and Prawn, written entirely in Ruby.
https://docs.asciidoctor.org/pdf-converter/latest/
MIT License

Rouge Source Highlighter Ignores subs="quotes" in PDFs #1615

Closed chuck-confluent closed 2 years ago

chuck-confluent commented 4 years ago

Version info:

Asciidoctor PDF 1.5.3 using Asciidoctor 2.0.10 [https://asciidoctor.org]
Runtime Environment (ruby 2.6.4p104 (2019-08-28 revision 67798) [x86_64-linux-musl]) (lc:UTF-8 fs:UTF-8 in:UTF-8 ex:UTF-8)

Rouge version: 3.17.0

Problem

With the rouge syntax highlighter enabled, the quotes substitution no longer takes effect in asciidoctor-pdf version 1.5.3: text marked as bold shows up wrapped in literal <strong></strong> tags instead of being rendered bold. In previous versions of asciidoctor-pdf, this behavior didn't occur.

Evidence

Here is a test document without the rouge syntax highlighter:

= Test

[source,subs="quotes"]
----
$ echo *"Hello World"*
----

. Clone the source code repository to the folder `confluent-dev` in your *home* directory:
+
[source,subs="verbatim,quotes,attributes"]
----
$ *cd ~*
$ *git clone --depth 1 --branch {course-tag} \
    https://github.com/confluentinc/training-developer-src.git \
    confluent-dev*  
----

Here is the PDF result: bold-test-no-rouge.pdf

Now here is another test document with :source-highlighter: rouge:

:source-highlighter: rouge

= Test

[source,subs="quotes"]
----
$ echo *"Hello World"*
----

. Clone the source code repository to the folder `confluent-dev` in your *home* directory:
+
[source,subs="verbatim,quotes,attributes"]
----
$ *cd ~*
$ *git clone --depth 1 --branch {course-tag} \
    https://github.com/confluentinc/training-developer-src.git \
    confluent-dev*  
----

Here is the resulting PDF: bold-test-with-rouge.pdf

Literally the only difference is :source-highlighter: rouge.

HTML

The HTML produced with the asciidoctor command, as opposed to asciidoctor-pdf, is identical between the two versions of the test document. Both correctly bold the source code.

mojavelinux commented 4 years ago

Your observation is correct, but unfortunately these two options are mutually exclusive. The way that Rouge is integrated into PDF, and really the only way it works, is that you can either have source highlighting or custom subs. So if you are using custom subs, then you need to disable the source style on the block.
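
As a rough illustration of that workaround, here is the first block from the test document with the source style removed. This is only a sketch: it assumes that declaring the block as a plain listing is an acceptable way to "disable the source style", and it uses subs="verbatim,quotes" rather than subs="quotes" so that special characters are still escaped. The block is no longer syntax highlighted, but the bold markup should be applied.

```asciidoc
:source-highlighter: rouge

// Assumption: a listing block without the source style is not passed to Rouge,
// so the quotes substitution is free to handle the *bold* markup.
[listing,subs="verbatim,quotes"]
----
$ echo *"Hello World"*
----
```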

mojavelinux commented 4 years ago

My recommendation for doing what you are trying to do is to create a custom lexer for Rouge that handles this additional markup. It's a more robust strategy anyway because then you are using the source highlighter as it was intended.

chuck-confluent commented 4 years ago

@mojavelinux Thanks for your response, Dan! I had no idea these were mutually exclusive.

I'm still a little confused about why this didn't use to be the behavior. I suppose what was happening was that when I wrote [source,subs="quotes"], I was actually using the incorrect syntax, so subs was applied instead of source? And now the default behavior is the other way around, with source being respected and subs being disregarded?

chuck-confluent commented 4 years ago

This also explains an old issue I didn't understand before. With [source,bash,subs="verbatim,quotes"] on a block, I got: image (1)

Now I know why this happened. The bash source highlighting disabled the subs. Cool! I'm gonna close this now.

chuck-confluent commented 4 years ago

@mojavelinux Hey Dan, it occurs to me that this should be documented in the asciidoctor-pdf docs. The regular asciidoctor docs use examples with both source and subs, like here, because the two are not mutually exclusive when producing HTML with asciidoctor.

For a user like me who produces both PDFs and HTML, it's unfortunate that this behavior isn't consistent, and the difference should probably be clearly indicated in the docs.

mojavelinux commented 4 years ago

If you'd like to take a shot at updating the README, please feel free to do so. It should probably be mentioned in https://github.com/asciidoctor/asciidoctor-pdf#known-limitations and perhaps a section before https://github.com/asciidoctor/asciidoctor-pdf#autofitting-text.

The reality is that not all output formats have the same capabilities, so we need to reinforce the expectation that there will be differences. As a writer using advanced features, you need to understand the capabilities of the publishing system because that's what ultimately publishes the content. That's true whether you're using AsciiDoc or any other writing system.

I continue to advise against using subs for adding formatting to source blocks because no matter how you look at it, it introduces a conflict with the source highlighter. Sometimes, you get lucky and it works, but it's asking for trouble. The correct solution is a custom highlighting language that intelligently adds formatting during the highlighting process.

mojavelinux commented 4 years ago

Btw, subs=+attributes (or subs=attributes+ depending on the use case) is perfectly acceptable.
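
For example, the clone command from the test document could keep its attribute reference while leaving the bold markup out. This is a sketch, not a tested recipe: the value given to course-tag is a hypothetical placeholder, and bash is assumed as the source language. Using subs=+attributes adds the attributes substitution to the block's default substitutions instead of replacing them, so no extra formatting markup ends up in the text handed to the highlighter.

```asciidoc
:source-highlighter: rouge
// Hypothetical value, for illustration only.
:course-tag: v1.0.0

[source,bash,subs=+attributes]
----
$ cd ~
$ git clone --depth 1 --branch {course-tag} \
    https://github.com/confluentinc/training-developer-src.git \
    confluent-dev
----
```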

mojavelinux commented 4 years ago

This example right here shows exactly why it should be a custom highlighting lexer or theme:

$ *cd ~*

You're changing the formatting choice that the highlighter makes. But the highlighter already knows where the command is versus the prompt. So the change should be to the highlighter (perhaps as simple as modifying the theme, but could also be a custom highlighting lexer).

mojavelinux commented 2 years ago

This issue came up again in #2086.

mojavelinux commented 2 years ago

I have added docs to warn against using subs that introduce additional HTML into the output of source blocks. See https://docs.asciidoctor.org/asciidoctor/latest/syntax-highlighting/#custom-subs-on-source-blocks

In #2086, we're exploring a way to automatically suppress this HTML in PDF output when it would otherwise pass through unprocessed. That will allow you to use subs without having to worry about messing up the PDF output. And there's a chance certain syntax highlighters might even be able to support it. We'll have to see. Until then, follow that issue.