rrthomas / hpmor

PDF, ePUB and Mobi versions of “Harry Potter and the Methods of Rationality”, from LaTeX source
http://hpmor.com/

Check LaTeX syntax upon PR creation #163

Closed entorb closed 4 months ago

entorb commented 5 months ago

Hi @rrthomas, I thought about adding a LaTeX syntax check to the GitHub Action that runs upon PR creation.

I tried xelatex -halt-on-error -no-pdf hpmor.tex, but this failed with: ! Package fontspec Error: The font "Alegreya-Regular" cannot be found.

I assume that is because of the magic in the latexmkrc file.

What do you think about this idea? Do you think this check could run much faster than a compilation of the full PDF (which requires multiple iterations)?

rrthomas commented 5 months ago

Because of TeX's macro-based processing, there is indeed no way to check the syntax in general without fully processing a file.

So, the only reason not to produce the PDF would be if some aspect of writing the PDF took a long time. I see in the man page for latexmk that it by default writes .xdv files from xelatex on intermediate runs, as it says that can be faster when there are large image files to go in the PDF. If that is true in our case, it would seem to be a better way to run xelatex every time.

I did some experiments, and I'm not entirely sure what's going on, I think because our latexmkrc is relatively complex. It probably also confuses matters that we use the pdflatex setting rather than the xelatex one. We should probably be running in xelatex mode, not pdflatex mode (my fault, it seems: we've been doing this since the start).

So I suggest:

  1. Use xelatex mode, not pdflatex mode. I have made a small PR for this.
  2. Try adding the -no-pdf -halt-on-error flags and seeing if that is significantly faster when there's no error. Then add a trivial error in the last chapter and try again. If that doesn't give a significant speed-up, then I suggest there's little point in checking without output.
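
The comparison in step 2 could be run with a small timing wrapper. This is only a sketch: the function name time_command and the script name time_builds.py are made up for illustration, and the script just times whatever command lines it is given rather than knowing anything about our latexmkrc.

```python
import shlex
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments); return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,   # stop TeX waiting for input at an error prompt
        stdout=subprocess.DEVNULL,  # the log is not needed for timing
        stderr=subprocess.DEVNULL,
    )
    return result.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Each argument is one complete command line, quoted as a single string.
    for command_line in sys.argv[1:]:
        code, seconds = time_command(shlex.split(command_line))
        print(f"{seconds:8.1f}s  exit={code}  {command_line}")
```

It would be called with the two variants to compare, for example python time_builds.py "latexmk -xelatex hpmor.tex" "xelatex -halt-on-error -no-pdf hpmor.tex". The latexmk invocation there is an assumption, since the real build may be driven differently. Intermediate files should be cleaned between runs (for example with latexmk -C), otherwise the second command benefits from the first one's output.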

rrthomas commented 5 months ago

(A less conservative approach might be to add a trivial error half-way through: do we get a significant speed-up on average?)

entorb commented 5 months ago

Oh, looks like performing a LaTeX syntax check is a non-trivial task.

The GitHub Action that builds all PDFs currently takes 13 min. If there were a kind of "dry-run" parameter that performed the build only once instead of looping 3 times (so ignoring references, TOC, etc.), I assume the build time could be reduced to 5 minutes. But I agree that this might not be worth the effort.

I had hoped for a <1min syntax check, similar to the check_chapters.py script.
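
To illustrate the kind of fast check meant here, below is a minimal sketch that looks only for unbalanced braces and mismatched \begin/\end pairs. It is not the contents of check_chapters.py, the name check_tex is hypothetical, and, as noted above, it cannot be a real syntax check, because macros can change what counts as valid input. It also ignores edge cases such as verbatim text, or a \\ line break directly followed by a brace or %.

```python
import re
import sys

ENV_RE = re.compile(r"\\(begin|end)\{([^}]*)\}")


def strip_comments(line):
    """Drop a TeX comment: everything from the first % not preceded by a backslash."""
    return re.sub(r"(?<!\\)%.*", "", line)


def check_tex(text):
    """Return a list of (line_number, message) for obvious structural problems."""
    problems = []
    depth = 0        # current brace nesting depth
    env_stack = []   # open environments as (name, line_number)
    lines = text.splitlines()
    for lineno, raw in enumerate(lines, start=1):
        line = strip_comments(raw)
        # Match each \end{...} against the most recent \begin{...}.
        for kind, name in ENV_RE.findall(line):
            if kind == "begin":
                env_stack.append((name, lineno))
            elif not env_stack:
                problems.append((lineno, f"\\end{{{name}}} without \\begin"))
            else:
                open_name, open_line = env_stack.pop()
                if open_name != name:
                    problems.append(
                        (lineno, f"\\end{{{name}}} closes \\begin{{{open_name}}} from line {open_line}")
                    )
        # Count braces, skipping escaped \{ and \}.
        for brace in re.findall(r"(?<!\\)[{}]", line):
            depth += 1 if brace == "{" else -1
            if depth < 0:
                problems.append((lineno, "unmatched }"))
                depth = 0
    if depth > 0:
        problems.append((len(lines), f"{depth} unclosed {{"))
    for name, open_line in env_stack:
        problems.append((open_line, f"\\begin{{{name}}} never closed"))
    return problems


if __name__ == "__main__":
    failed = False
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno, message in check_tex(handle.read()):
                failed = True
                print(f"{path}:{lineno}: {message}")
    if failed:
        sys.exit(1)
```

A check like this only reads the files as text, so it should finish in seconds. A clean result would not guarantee that the full build succeeds.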

Regarding the xelatex mode I do not have a preference.

rrthomas commented 5 months ago

Waiting 5 minutes versus 15 for a syntax check: doesn't the build process stop early anyway if there's an error the first time around?

entorb commented 5 months ago

I tried your PR #164. It works well, but I did not observe a performance increase compared to the main branch. Nevertheless, I suggest merging it, as it seems to be the proper way.

rrthomas commented 5 months ago

Will do, thanks. Not surprised that there's no performance increase.

entorb commented 5 months ago

As an alternative approach to the speedup idea, I added caching of the LaTeX temporary and output files. This reduces the re-compile time on release creation or script modification from 12.5 to 9 min; see da307ba7884241a51fee18b5a37308d34d8376d5 and 9200b7adf25d26113738f4cdc3dbb302fd94399c. Note that the GitHub Actions cache is deleted if it has not been accessed for 7 days.

rrthomas commented 5 months ago

Looks good! Not a huge improvement, but might as well have it now you've done the work.