asciidoctor / asciidoctorj

Java bindings for Asciidoctor. Asciidoctor on the JVM!
http://asciidoctor.org
Apache License 2.0

HTML characters are replaced before InlineMacroProcessors are executed. #529

Open mmews-n4 opened 7 years ago

mmews-n4 commented 7 years ago

I am implementing an inline macro and hence extending the InlineMacroProcessor. When the process method is called, the strings in the variables target and attributes differ from the original adoc file. That is, special characters are replaced by their corresponding html equivalents.

Example adoc: mymacro:tgt<string>[1<2]

The greater-than and less-than characters are replaced by '&gt;' and '&lt;', resulting in the following two values: target = tgt&lt;string&gt; and attributes = text=1&lt;2.

The Annotations I use are:

Q1: Why are the html replacements performed before the InlineMacroProcessor is executed?

When I use the pass macro like this: macro:++tgt<string>++[++1<2++], the 'target' and 'attributes' variables only contain '0' and '1' but no text.

Q2: How can I access the original target and attributes values?

mojavelinux commented 5 years ago

You're accurately describing how the AsciiDoc / Asciidoctor parser works. Substitutions are applied to phrasing content one after the other. So by the time the custom inline macro is processed, the following substitutions have been applied: specialcharacters, quotes, attributes, replacements, and some of the macros.

So the macro has to look for &lt; and &gt; instead of < and >. That's as original as it's going to get.
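
A minimal sketch of what that could look like inside a processor: a plain-Java helper that reverses the specialcharacters substitution on the target or attribute text. The class and method names are hypothetical and not part of AsciidoctorJ, and the result is only an approximation of the source, since the other substitutions listed above (quotes, attributes, replacements) have also run by then.

```java
// Hypothetical helper, not part of AsciidoctorJ.
final class SpecialCharacters {
    private SpecialCharacters() {}

    /** Turns the entities produced for <, > and & back into the literal characters. */
    static String unescape(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   // "&amp;" goes last so that an escaped "&amp;lt;" ends up as "&lt;", not "<".
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(unescape("tgt&lt;string&gt;")); // prints tgt<string>
        System.out.println(unescape("1&lt;2"));            // prints 1<2
    }
}
```

The process method of the macro could call such a helper on the target and on each attribute value before interpreting them.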

"When I use the pass macro like this: macro:++tgt<string>++[++1<2++], the 'target' and 'attributes' variables only contain '0' and '1' but no text."

This is complex because at this time in the parsing, the passthrough content is still stored in placeholders.
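
To illustrate why only '0' and '1' show up, here is a toy model of placeholder extraction in plain Java. It is not Asciidoctor's actual code: the class name, the regular expression, and the marker characters are assumptions made for the sketch. Each ++...++ span is stashed in a list and replaced by its index wrapped in two marker characters, so a macro that runs before the restore step sees the index rather than the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only; names and marker characters are assumptions, not Asciidoctor internals.
final class PassthroughSketch {
    static final char START = '\u0096';
    static final char END = '\u0097';
    private static final Pattern PASS = Pattern.compile("\\+\\+(.*?)\\+\\+");

    private PassthroughSketch() {}

    /** Replaces each ++...++ span with START + index + END and stores the raw text in stash. */
    static String extract(String text, List<String> stash) {
        Matcher m = PASS.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());                   // text before the span
            out.append(START).append(stash.size()).append(END);  // placeholder with index
            stash.add(m.group(1));                                // raw content, untouched
            last = m.end();
        }
        out.append(text, last, text.length());                    // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> stash = new ArrayList<>();
        extract("macro:++tgt<string>++[++1<2++]", stash);
        System.out.println(stash); // prints [tgt<string>, 1<2]
    }
}
```

In this model the text handed to the macro is macro:, the first placeholder, and the second placeholder in brackets, which matches the '0' and '1' reported above.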

If you really want the raw source, it would be necessary to wrap it as follows:

pass:m[macro:tgt<string>[1<2\]]

That way, the macro is shielded from the normal substitutions, and only the macros substitution (the m in pass:m) is applied to it before the passthrough is restored.