Open phlummox opened 1 year ago
I suspect this has to do with
+ Use `soul` instead of `ulem` for strikeout, underline (#8411).
This handles things like hyphenation, line breaks, and nonbreaking
spaces better.
Cf #8411 and https://tex.stackexchange.com/questions/160220/french-accents-in-hl-from-soul-package
Note! I don't get the error on my system, using texlive 2023. I think the reason is that the new version of soul incorporates the old soulutf8.
https://ctan.math.utah.edu/ctan/tex-archive/macros/generic/soul/soul.pdf
So this problem should go away if you upgrade the soul package in your latex setup.
Hi, thanks for that! For the moment, I actually just rolled back my version of Pandoc to 2.19.2, since that was a quicker fix than upgrading LaTeX. I'm on Ubuntu 20.04, which is supposed to be maintained and updated til 2025, but 2019 is the latest TeX Live version in the Ubuntu 20.04 repositories.
The https://pandoc.org/installing.html page says "We recommend installing TeX Live via your package manager. (On Debian/Ubuntu, apt-get install texlive.)", which is how I installed TeX Live originally. But from what I can tell, looking at https://tug.org/texlive/pkginstall.html, if I want an upgraded version of "soul", I need to install "Native TeX Live" in addition to (or instead of) the version from the Ubuntu repos. I'll see if I can do so, and whether that fixes the problem.
You could probably just put the updated soul.sty in your working directory or local texmf tree.
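A minimal sketch of the local-texmf-tree option, assuming you have already obtained an updated soul.sty from a newer TeX distribution or from CTAN. The helper name `install_local_sty` and the `tex/latex/<package>` layout are illustrative choices, not something Pandoc or TeX Live prescribes.

```shell
# Hypothetical helper: copy a .sty file into a personal texmf tree.
#   $1: path to the .sty file (e.g. an updated soul.sty)
#   $2: root of the texmf tree (e.g. the output of `kpsewhich -var-value=TEXMFHOME`)
install_local_sty() {
  # Place the file under tex/latex/<package-name>/ inside the given tree.
  dest="$2/tex/latex/$(basename "$1" .sty)"
  mkdir -p "$dest" && cp "$1" "$dest/"
}

# Example usage (requires a TeX installation, so left commented out):
#   install_local_sty soul.sty "$(kpsewhich -var-value=TEXMFHOME)"
#   kpsewhich soul.sty   # shows which copy of soul.sty TeX will actually load
```

If `kpsewhich soul.sty` still reports the distribution's copy afterwards, placing soul.sty in the working directory next to the document is the simpler fallback.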
Thanks for the suggestion - I'll see if that works, too. But as a longer-term solution, if texlive 2023 is the minimum version of texlive that Pandoc requires, I'd like to ensure I have a reliable way of installing it on Linux distributions which don't have it.
Out of interest, if there are automated tests run on the LaTeX writer, what version of TeX Live do they use? It might be worth updating the documentation to mention them, if Pandoc is only expected to work with those versions.
An alternative to installing a newer soul would be to use a custom template that imports soulutf8 instead of soul.
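One way the custom-template route might look, as a sketch: dump Pandoc's default LaTeX template, swap the package name, and pass the result back with `--template`. The helper name `use_soulutf8` and the file names are hypothetical, and the substitution assumes the template loads the package with a literal `\usepackage{soul}` line, which is worth checking against the template your Pandoc version prints.

```shell
# Hypothetical helper: print a template with soul replaced by soulutf8.
#   $1: path to a pandoc LaTeX template
use_soulutf8() {
  # Only the exact text \usepackage{soul} is rewritten; other lines pass through.
  sed 's/\\usepackage{soul}/\\usepackage{soulutf8}/' "$1"
}

# Example usage (requires pandoc and LaTeX, so left commented out):
#   pandoc -D latex > default.latex            # print the default LaTeX template
#   use_soulutf8 default.latex > custom.latex
#   pandoc --template=custom.latex -o test.pdf test.md
```

The trade-off is that a custom template is frozen at the Pandoc version it was dumped from, so it needs regenerating after a Pandoc upgrade.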
There are automated tests for the writer, but they just check the LaTeX code it emits; no attempt is made to compile the code using tex. (There are reasons.)
Summary
Attempting to convert the following Markdown to PDF (via LaTeX) results in an error:
Likewise for documents containing any of the following Markdown:
[this --- that]{.underline}
[this -- that]{.underline}
["things"]{.underline}
Steps to reproduce
Create test.md:
Convert it to PDF, via LaTeX:
pandoc -t latex -o test.pdf test.md
Expected behaviour
A PDF document should be generated.
Actual behaviour
The following output is produced:
Result of --verbose
If the Pandoc commands above are run with --verbose, it can be seen that Pandoc is generating quite different LaTeX to what it would produce if not asked to create a PDF.
Given the Markdown [john's shoes]{.underline}, the command will produce a correct .tex document, containing the code
(And similarly for all the other examples reported above.) If asked to generate a PDF, however, then looking at the temporary .tex file shows that it contains
- "smart" single quotes (’) instead of the correct LaTeX single quote (')
- "smart" double quotes (“ and ”) instead of the correct LaTeX (`` and '')
- Unicode en-dashes and em-dashes instead of the correct LaTeX (-- and ---)

Given that this just isn't the correct way of writing LaTeX, it's not surprising that problems ensue.
It also makes the reason for the bug harder to spot, since the incorrect LaTeX code which Pandoc is actually using differs from the correct LaTeX code it generates when asked to output a .tex file.
Possible corrections to manual
Currently, the Pandoc manual, here --
https://github.com/jgm/pandoc/blob/6067e477acd933316ba23a1838aafffad872f627/MANUAL.txt#L132
states
But if Pandoc's behaviour when creating PDFs – creating a temporary LaTeX file with "smart" quotes, unicode en-dashes, etc. – is intentional, then this bit of the manual is not correct. It isn't actually useful to look at Pandoc's LaTeX output, because that's not what Pandoc will use internally, and it won't necessarily help you debug pdflatex compilation problems; you should just run with --verbose, instead. So this part of the manual might need to be amended.
Behaviour of version of Pandoc from last year (2022)
In case it's helpful – I happened to have a copy of Pandoc 2.19.2 on my system, downloaded from https://github.com/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-linux-amd64.tar.gz last year. That version doesn't exhibit the bug: it correctly generates a PDF.
So this seems to be a regression in behaviour.
Pandoc version
Latest release (3.1.6.1), installed from .deb file downloaded from the "Releases" page.
Operating system is Ubuntu 20.04.6 (Focal Fossa).
LaTeX version is TeX Live 2019.20200218-1: