jgm / pandoc

Universal markup converter
https://pandoc.org

texorpdfstring dropping `\mbox{}` #5909

Open rnuske opened 4 years ago

rnuske commented 4 years ago

Would it be possible for the LaTeX writer to keep the contents of \mbox{} in the plain-text variant of \texorpdfstring{orig tex}{plain text}?

Using pandoc 2.7.3, I get the following:

$ pandoc -t latex
# Lorem \mbox{ipsum} dolor sit amet {#Lorem-ipsum}
^D
\hypertarget{Lorem-ipsum}{%
\section{\texorpdfstring{Lorem \mbox{ipsum} dolor sit
amet}{Lorem  dolor sit amet}}\label{Lorem-ipsum}}

but would love to get

\hypertarget{Lorem-ipsum}{%
\section{\texorpdfstring{Lorem \mbox{ipsum} dolor sit
amet}{Lorem ipsum dolor sit amet}}\label{Lorem-ipsum}}

Is there anything I could do differently to get the desired result?

jgm commented 4 years ago

It's parsed from markdown as raw tex, so all the writer knows about it is that it's a bit of raw tex. Given that, it's kind of hard to extract the ipsum. I suppose we could run the LaTeX reader on it first, though I don't usually like to couple the reader and writer in this way.
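
To make the "extract the ipsum" step concrete, here is a minimal Python sketch of what unwrapping \mbox{} from a raw TeX string could look like. It is not pandoc's code and not how the LaTeX writer works; the function name unwrap_mbox is hypothetical, and it assumes braces are balanced and ignores escaped braces such as \{ and comments.

```python
MARKER = "\\mbox{"


def unwrap_mbox(raw: str) -> str:
    # Hypothetical helper: replace every \mbox{...} with its contents.
    # Assumes balanced braces; does not handle \{, \} or % comments.
    out = []
    i = 0
    while i < len(raw):
        if raw.startswith(MARKER, i):
            start = i + len(MARKER)
            j = start
            depth = 1
            # Scan forward to the brace that closes this \mbox{.
            while j < len(raw) and depth > 0:
                if raw[j] == "{":
                    depth += 1
                elif raw[j] == "}":
                    depth -= 1
                j += 1
            if depth == 0:
                # Recurse so nested \mbox{} groups are unwrapped too.
                out.append(unwrap_mbox(raw[start:j - 1]))
                i = j
                continue
            # Unbalanced group: fall through and copy the text unchanged.
        out.append(raw[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(unwrap_mbox(r"Lorem \mbox{ipsum} dolor sit amet"))
```

A string scan like this only covers the one macro it is written for. Running the LaTeX reader over the raw TeX, as suggested above, would handle arbitrary commands, at the cost of coupling reader and writer.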