Closed ijmitch closed 9 months ago
My original listing page was with `type: grid` but I simplified it for the steps to reproduce above. I just checked that adding `type: grid` to the boilerplate `index.qmd` results in the same problem.
I guess in most cases it's reasonable that listings as used, for example, for blogs are not valid for books, but then that makes the statement that books "support all of the same features as websites" questionable.
In my case, the `type: grid` was only on the homepage to provide a grander way to get to the three major starting points for different types of consumer of the website.
I wondered if I could suppress the listing by giving it an `id` and putting that inside some conditional content:
---
listing:
  id: contents-listing
  type: grid
---
# Preface {.unnumbered}
This is a Quarto book.
::: {.content-visible when-format="html"}
::: {#contents-listing}
:::
:::
To learn more about Quarto books visit <https://quarto.org/docs/books>.
but that still produces a broken PDF.
> I guess in most cases it's reasonable that listings as used, for example, for blogs are not valid for books, but then that makes the statement that books "support all of the same features as websites" questionable.
Websites are not $\LaTeX$. The features shared are obviously shared by the format, meaning HTML. Quarto never stated that listings work in non-HTML documents. Even you quoted the documentation stating "HTML book". Also, the listing feature is documented under "website" and nowhere else.
So, if you want to use that, use conditional content to avoid including an HTML feature in $\LaTeX$-based documents. See https://quarto.org/docs/authoring/conditional.html.
I am going ahead and closing this, as it is not a bug but rather a misuse of the feature depending on the format.
I'm new to Quarto. I was very happy with some content as a website (originally rendered with Docusaurus), but we identified that some users might benefit from the entire thing as a PDF or EPUB, so I was just hoping that converting to a book would be a reasonable means to achieve that.
I missed the point that 'HTML books' are not guaranteed to be PDF-able.
I perfectly understand the use case, but many HTML features will never come to PDF because the tools are too different. Hence the conditional content feature, which allows users to use $\LaTeX$- or HTML-specific features without compromising the compatibility of the documents with both formats.
Be that as it may, Quarto should strive to never generate malformed content.
A few options:
- At the very least, we should silently not emit listing output where it would break the document.
- Ideally, we'd support `listing`, but that's a hard thing to do in general.
- I'd be happy if we warned that `listing: default` is not supported in PDF.

Thank you @cscheid - I was intending to come back this morning and appeal that simply getting a bad PDF without warning is pretty unhelpful.
I also found #5782 touches on some of the same things.
You might also see that I did try to get the listing content within a `::: {.content-visible when-format="html"}` div, which I believe anticipated what @mcanouil was suggesting, but the way I approached it didn't solve the problem of the bad PDF - perhaps I was taking a wrong approach there?
> perhaps I was taking a wrong approach there?

What you did is completely reasonable, but it doesn't solve the bug: the `content-visible` feature doesn't work quite as well as you'd hope.
Two things happen when you add listings: the listing contents themselves, and all of the HTML dependencies of the listing contents that we have to add. The `content-visible` technique you tried only removes the former (and you have no way to know that the latter is there).
One can argue that the "HTML detritus" of the listings feature sticking around when `content-visible` removes the only listing in the PDF document is actually the bug. But that is honestly a very hard bug for us to fix in general with how our code is set up right now, and so the best we can do is to not allow the document to arrive at such a state to begin with.
In other SSGs (well, mainly Mkdocs) before I recently landed here (and was greatly impressed with the maturity of the approach), I'd resorted to simply adding a script with suitable Pandoc commands to turn whichever set of source files into a PDF or EPUB, so I'm not averse to reverting to that.
I'm trying to wean people off even thinking about needing such things for the use case I have, so was hoping not to put any/much effort into this - but I certainly appreciate the discussion here.
> Be that as it may, Quarto should strive to never generate malformed content.
> A few options:
> - At the very least, we should silently not emit listing output where it would break the document.
> - Ideally, we'd support `listing`, but that's a hard thing to do in general.
> - I'd be happy if we warned that `listing: default` is not supported in PDF.

Then the issue is way more general than PDF/EPUB as Typst, Word, etc are likely to be affected as well, and possibly other HTML-based features in Quarto behave like listings.
Being a novice here (at least with respect to Quarto, if not SSGs, structuring of source projects, YAML etc), might I just describe what might be my ideal solution?
I'd be very happy to stay with `type: website` but then have an ability to define a composite PDF including multiple `.qmd` files to be rendered in the output. My first delight with Quarto was PDF under the 'other formats' for a page, but some people seem to want a larger single document.
The way of specifying the `.qmd` files to include in the output could even be similar to how listings work for HTML. So rather than assume all the content is present in the PDF for a `type: book` project, just allow a 'pdf listing' to specify some pdf output. Obviously then I could exclude the file with the offending HTML listing content.
I have no idea if that's easier to contemplate within the constraints of the implementation... it's just a suggestion.
> I'd be very happy to stay with `type: website` but then have an ability to define a composite PDF including multiple .qmd files to be rendered in the output.

We'd like to support that as well, but "composite PDF" formats are, as of today, just "book through the LaTeX toolchain". That format is stricter than `type: website`, and so we can't simply make that "just work".
I'll note that if you use a "book" project, you still get a "website" and you can get a PDF composite. Your website, of course, will have some limitations in the format, but many books have been written and published this way: https://quarto.org/docs/books/index.html
@cscheid - that's fine, and it was the temptation of saying "ah, if people really want a pdf of the collection of material then that's really a book" which was the beginning of this, but I ran myself into the wall of the differences between the types. I will stick with `type: website` and work harder to tell the consumers "this is not the PDF you're looking for" ;-)
Sorry if I'm labouring a point here, but I just changed the `_quarto.yml` of the book boilerplate project to `type: website` and ran `quarto render index.qmd` where that file has:
---
listing:
  id: contents-listing
  type: grid
---
# Preface {.unnumbered}
This is a Quarto book.
To learn more about Quarto books visit <https://quarto.org/docs/books>.
this gives me an `index.pdf` where it's simply ignored the listing - which is more reasonable than the broken pdf with `type: book`. It's a shame book projects trying to put multiple source content into one pdf can't do the same.
Thank you. As you might not have noticed, the issue is open as bug and assigned, this means the team understood the issue and will resolve it in due time. Now you just have to wait and be patient.
Thanks for the detailed reporting - I'm sorry to say that when I try to reproduce the issue, I'm not able to (as I would expect, knowing how listings work). Listing process is tied directly to HTML output - when a PDF is being generated no code runs which processes listings, so they are ignored.
I confirmed this by creating a default book project and then modifying the `index.qmd` to contain a listing of the various formats that you suggested. In all cases, I was able to properly render and view a `pdf` file.
I've attached a zip of my attempt to reproduce the issue (including the outputs). If you are able to consistently reproduce this issue, it would be great if you could provide the complete reproducible case so I can test locally - it would be very unexpected if listing content is ending up in PDF output!
@dragonstyle ... well, this is interesting/embarrassing - I unzipped your archive and checked my Mac could open the generated PDF and all was fine.
However, if I rebuild it with `quarto render` then the new PDF is broken, as I experienced before, which provoked this issue.
So the problem is with my Mac's stack of software... what I don't know. Surely this would just be down to my version of Quarto and TinyTex?
Does:
Rendering PDF
running xelatex - 1
This is XeTeX, Version 3.141592653-2.6-0.999995 (TeX Live 2023) (preloaded format=xelatex)
restricted \write18 enabled.
entering extended mode
running xelatex - 2
This is XeTeX, Version 3.141592653-2.6-0.999995 (TeX Live 2023) (preloaded format=xelatex)
restricted \write18 enabled.
entering extended mode
tell you anything suspicious?
You could try to use Quarto to remove and reinstall TinyTex: `quarto remove tinytex` (in theory) and `quarto install tinytex`.
That log output seems completely fine - You might want to try a simple document as a PDF and see whether that works (e.g. standalone document like):
---
title: Hello World
format: pdf
---
## Hello World
This is a PDF
If that renders fine, then perhaps there is something about books triggering this that differs between our environments.
One other useful thing to do would be to include the option `keep-tex` under your `pdf` format, which will keep the generated LaTeX file around. This would let us see if something sketchy/unexpected is showing up in the LaTeX output, which might give us a clue.
Generally if we were making invalid LaTeX, I would expect an error while rendering the LaTeX to PDF (for example if we mixed HTML and LaTeX), so this is unusual for sure. Can you share one of these broken pdfs and we can see if I can open it?
I can confirm that on my machine the first is broken (though not in some obvious way, sadly), and the second is not...
Could you share the exact project that is reproducing this? Alternatively, can you try using `keep-tex` to keep the intermediary LaTeX and share that?
I've been doing all this this evening just modifying the copy of the project which you sent me in the book.zip earlier.
I'm perplexed. When I render that `tex` file to a pdf using `xelatex B.tex` I end up with a valid pdf file.
I can't explain how the presence of the `listing` key is causing this :( - listing processing is hidden behind a check of the output format (being HTML). The `tex` file doesn't really have any evidence of listings being processed either, so I am really at a loss.
We must be looking in the wrong place... here are `good-B.tex` and `broken-B.tex`, each from a `quarto render`, one producing a good PDF and the other a broken PDF. They are identical, I think.
Archive.zip
If the .tex files are actually the same, there must be something else downstream of that which breaks the PDF.
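Whether two rendered files really are the same can be checked mechanically rather than by eye. A minimal sketch in Python, assuming the two files are the `good-B.tex` and `broken-B.tex` mentioned above; the helper names `file_digest` and `files_identical` are made up for illustration and are not part of Quarto:

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_identical(a: Path, b: Path) -> bool:
    """True when both files have the same size and the same digest."""
    # Cheap check first: files of different sizes cannot be identical.
    if a.stat().st_size != b.stat().st_size:
        return False
    return file_digest(a) == file_digest(b)


# Example use (file names taken from the comment above):
#   files_identical(Path("good-B.tex"), Path("broken-B.tex"))
```

If this returns `True` for the two `.tex` files while the resulting PDFs still differ, the difference has to come from a step that runs after the LaTeX is generated.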
But I have also, belatedly, noticed that the broken `quarto render` emits these `WARNING`s:
❯ quarto render
[1/4] index.qmd
WARNING: File /Users/ijmitch/Downloads/b/intro.qmd in the listing 'contents-listing' contains no metadata.
WARNING: File /Users/ijmitch/Downloads/b/summary.qmd in the listing 'contents-listing' contains no metadata.
WARNING: File /Users/ijmitch/Downloads/b/references.qmd in the listing 'contents-listing' contains no metadata.
[2/4] intro.qmd
[3/4] summary.qmd
[4/4] references.qmd
whereas, perhaps obviously, the good case where there's no presentation of front-matter with `listing` does not.
And, yes - `xelatex broken-B.tex` gives me a good PDF too!
This ZIP has a `B.pdf`, which is the broken result from `quarto render`, and `broken-B.pdf`, which is actually GOOD since it came from `xelatex broken-B.tex`. You can see they are quite considerably different in size. I've tried a file compare in VSCode, but I can't tell where the variation in the PDF contents is significant. Certainly there's lots in common and all the variation is in binary data inside a subset of the `stream` objects.
May I suggest using a Git repository instead of many zip archives? At least you get diffs and possibly can use codespaces, etc.
I can reproduce it locally (woo hoo!) - the key is to use the command `quarto render` (with no `--to pdf`), which mixes HTML and PDF rendering, likely causing the issue somehow. Investigating now... Thanks for your persistence narrowing this down!
Progress!
~I just shoved the project into https://github.com/ijmitch/quarto-book-debug and made it public.~
Ok that was pretty straightforward - there are global postprocessors that handle last-minute tasks during a project render, and the listing postprocessors were not expecting that the PDF output would appear in the list of outputs to process (though that is actually expected). I added a check that will filter outputs to only HTML output and that should resolve it.
I'll start a fresh pre-release build and this should be testable within 10-15 minutes. Once again thx for the persistence, this was a good one that defied my expectations!
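The idea behind the fix, filtering the list of rendered outputs down to HTML before the listing postprocessor touches them, can be sketched roughly as follows. This is only an illustration in Python and not Quarto's actual code: `RenderedOutput`, `is_html_output` and `outputs_for_listing_postprocessor` are hypothetical names, and checking both the format name and the file extension is an assumption about what a strict check might look like.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RenderedOutput:
    """Hypothetical record of one file produced by a project render."""
    path: str    # e.g. "_book/index.html"
    format: str  # e.g. "html", "pdf", "epub"


def is_html_output(output: RenderedOutput) -> bool:
    """Strict check (assumed): HTML format *and* an .html file extension."""
    return output.format == "html" and output.path.lower().endswith(".html")


def outputs_for_listing_postprocessor(
    outputs: Iterable[RenderedOutput],
) -> List[RenderedOutput]:
    """Keep only the outputs a listing postprocessor may safely rewrite."""
    return [o for o in outputs if is_html_output(o)]
```

With a guard like this in place, a mixed render that produces HTML, PDF and EPUB files would hand only the HTML pages to the listing postprocessor and leave the binary outputs untouched.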
@dragonstyle - that's great - many thanks!
hmm... @dragonstyle - did you test this for epubs as well as pdfs?
I didn't test the mixed render case :( - will check now!
I added:
epub:
  title: "B"
to `_quarto.yml` and got an epub out but it didn't open with the macOS 'Books' app (whereas other epubs from Quarto without `listing` do).
Yeah, it was the same problem - I needed the check to be even more strict. A fresh build is on the way!
https://github.com/quarto-dev/quarto-cli/commit/6eedc68cca2e29a56052aa5b0eee946b706c6419
Many thanks!
Bug description
I was turning a website into a book and struggled to understand why Quarto (1.4 as it happens) was producing PDFs and EPUBs which macOS Preview refuses to open with the complaint that the file is broken or an invalid format.
I finally discovered that removing the `listing` options from `index.qmd` would give me a working PDF and EPUB.
Searching got me as far as the Discussion #4266 where it's stated that listings are only supported for websites. But with https://quarto.org/docs/books/ saying:
and the listing working for the html of the website of the book, I think there's at least a doc update needed - I couldn't find any qualification that listings don't work with PDF/EPUBs of books, but also this should perhaps be diagnosed during `quarto render` so people don't scratch their heads as much.
Apologies if I've missed a statement in the docs.
Steps to reproduce
I've reproduced by using the boilerplate book project and making the `index.qmd` a listing page thus:
Expected behavior
Preview gives me HTML as expected:
Actual behavior
but the PDF is unusable
Your environment
Quarto 1.4, macOS Sonoma 14.1.1
Quarto check output