jgm / pandoc

Universal markup converter
https://pandoc.org

Markdown to Mediawiki: escaped asterisk is not escaped #9700

Open vadcx opened 4 months ago

vadcx commented 4 months ago

Explain the problem.

When converting Markdown -> MediaWiki, an escaped asterisk is not escaped in the MediaWiki output, so in some cases it is interpreted as a formatting character, for example when it starts a line.

I don't know whether the problem is on the Markdown input side or the MediaWiki output side.

Command: pandoc -f markdown -t mediawiki markdown-to-mediawiki-asterisk-bug.md -o -

Input (markdown-to-mediawiki-asterisk-bug.md):

This is a normal sentence with a manual text footnote\*

\* The footnote explains why it couldn't just be inside parentheses

And the text continues. Although the "proper" footnote in Markdown has a different syntax.[^1]

[^1]: There's a reason I can never remember what all these different characters stand for.

End of text.

Output:

This is a normal sentence with a manual text footnote*

* The footnote explains why it couldn’t just be inside parentheses

And the text continues. Although the “proper” footnote in Markdown has a different syntax.<ref>There’s a reason I can never remember what all these different characters stand for.</ref>

End of text.

<references />

MediaWiki displays this as follows (the footnote line is not supposed to start a list!):


This is a normal sentence with a manual text footnote*

  • The footnote explains why it couldn’t just be inside parentheses

And the text continues. Although the “proper” footnote in Markdown has a different syntax.[1]

End of text.

[1]: There’s a reason I can never remember what all these different characters stand for.


I suppose it should've been enclosed in <nowiki> or something?

Pandoc version?

pandoc 3.1.8
Features: +server +lua
Scripting engine: Lua 5.4

jgm commented 4 months ago

It's an issue on the mediawiki writer side. The Markdown reader correctly interprets the escaped asterisk (\*) as a plain string, but the writer needs to add escaping in cases where the output might otherwise be interpreted as a list.
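
To make the kind of escaping described above concrete, here is a minimal sketch in Python. It is not pandoc's actual fix (pandoc is written in Haskell, and its writer may handle this differently). The function name `escape_line_start_markers` is hypothetical, and the sketch assumes that wrapping the marker in `<nowiki>`, as suggested in the report, is an acceptable way to make it literal. It only handles the characters MediaWiki treats as list or indentation markers at the start of a line (`*`, `#`, `:`, `;`).

```python
import re

# Characters MediaWiki treats as list/indentation markup when they begin
# a line: bullet (*), numbered (#), indent (:), definition term (;).
_LINE_START_MARKERS = re.compile(r"^([*#:;])", flags=re.MULTILINE)


def escape_line_start_markers(plain_text: str) -> str:
    """Hypothetical helper: wrap a line-initial marker in <nowiki>.

    Intended for plain-text content only. Applying it to wikitext that
    contains intentional lists would escape those lists as well.
    """
    return _LINE_START_MARKERS.sub(r"<nowiki>\1</nowiki>", plain_text)


if __name__ == "__main__":
    sample = "manual text footnote*\n\n* The footnote explains"
    print(escape_line_start_markers(sample))
```

In this sketch the trailing asterisk on the first line is left alone because it is not at the start of a line, while the asterisk that opens the footnote line is wrapped. A real writer would apply such escaping only to string content, never to the list markup it generates itself.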