Closed by joelnitta 3 years ago
This is the expected output. Compare with the sample.md file and scroll down in the created docx: there is a second page (and a third).
Please reopen if there really isn't more content, and in that case also mention the software used to open the file.
D'oh! You're totally correct, sorry for not actually scrolling through the output. My bad.
I realized the problem is not in converting Markdown to docx but in converting LaTeX to docx.
Here is an example LaTeX file, sample_latex.tex:
\documentclass{article}
\begin{document}
this is the first page
\newpage
and this is the second page
\end{document}
When I run
pandoc --to docx --lua-filter=pagebreak.lua -o sample_latex.docx sample_latex.tex
this is the output in MS Word:
MS Word v 16.44 on Mac OSX 10.15.7
pandoc v 2.11.3.2
Ah yes, that makes sense. Pandoc drops unknown TeX commands when reading LaTeX. You can still get the expected result by adding --from=latex+raw_tex.
Thanks! That fixed it.
If you don't mind another question, I'm still a little confused: what is the difference between --from=latex and --from=latex+raw_tex? I don't understand why one would need to "extend" LaTeX, since \newpage is already standard LaTeX.
Yes, it's standard LaTeX, but because \newpage doesn't correspond to anything in the pandoc AST, pandoc can either ignore it or pass it through as raw TeX. It only does the latter if you explicitly enable the raw_tex extension, which isn't on by default for LaTeX input.
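To make the difference concrete, here is a rough Python sketch of what a page-break filter could look for in pandoc's JSON AST. The two documents are hand-written stand-ins for `pandoc --to=json` output, trimmed for illustration, not captured from a real run. The function name has_newpage_raw_block is hypothetical, and the check only approximates what a filter like pagebreak.lua might do.

```python
def has_newpage_raw_block(doc):
    """Return True if a top-level block is a raw TeX block holding a newpage command."""
    for block in doc.get("blocks", []):
        if block.get("t") != "RawBlock":
            continue
        fmt, text = block["c"]  # RawBlock content is [format, text]
        if "tex" in fmt and "\\newpage" in text:
            return True
    return False


def para(word):
    """Build a one-word paragraph node (helper for the hand-written examples)."""
    return {"t": "Para", "c": [{"t": "Str", "c": word}]}


# Assumed shape for --from=latex: the \newpage command has been dropped.
without_raw_tex = {
    "pandoc-api-version": [1, 22],  # illustrative value
    "meta": {},
    "blocks": [para("first"), para("second")],
}

# Assumed shape for --from=latex+raw_tex: \newpage survives as a RawBlock.
with_raw_tex = {
    "pandoc-api-version": [1, 22],  # illustrative value
    "meta": {},
    "blocks": [
        para("first"),
        {"t": "RawBlock", "c": ["latex", "\\newpage"]},
        para("second"),
    ],
}

print(has_newpage_raw_block(without_raw_tex))  # False
print(has_newpage_raw_block(with_raw_tex))     # True
```

In other words, without raw_tex there is nothing left in the AST for the filter to turn into a page break. To check a real file, you can inspect the AST with pandoc --from=latex+raw_tex --to=native sample_latex.tex.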
I see, thanks for the explanation.
I tried converting the sample md file using the pagebreak filter.
The resulting docx file has no page break:
Tried with both pandoc 2.7.3 and 2.11.3.2.