IACR / latex-submit

Web server to receive uploaded LaTeX and execute it in a docker container.
GNU Affero General Public License v3.0

automate notifying author of errors #13

Closed kmccurley closed 11 months ago

kmccurley commented 1 year ago

Even when a paper compiles, there are still things that might require an author's attention. I flag some of these, such as missing references, but others might go unnoticed or not be worth flagging. We can create additional issues to handle subcases here.

missing references (bibliographic or otherwise): We flag these in tasks.py::run_latex_task by looking for "LaTeX Warning: There were undefined references" in the log.

duplicate labels: These are flagged by "LaTeX Warning: There were multiply-defined labels". We now catch these in tasks.py.

overfull \hbox: We now flag these by looking for "Overfull \hbox" in the log, but we might want to ignore these if the severity is low.

underfull \hbox: I'm not currently doing anything about these.

missing fonts: These are tricky and may justify an independent issue. I saw one instance of:

LaTeX Font Warning: Font shape `TU/lmr/bx/sc' undefined
(Font)              using `TU/lmr/bx/n' instead on input line 1011.

This is not really an error - it's just a warning. I saw another in eurocrypt2022/146 that said:

LaTeX Font Warning: Command \scriptsize invalid in math mode on input line 73.

This one is unclear to me. I am most concerned with lualatex just dropping characters if it can't find a font.

warnings from packages: Some packages like hyperref are just chatty with their warnings, saying for example that a token is not allowed in a PDF string (as happens when we use \LaTeX\ in a section heading). The silence package says that these follow a pattern:

Package <Name of the package> Warning: <message text>

but the presence of such a warning does not necessarily indicate that anything is wrong; it's informational output from the package.

Missing number, treated as zero. This comes from TeX itself.
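
The string checks described above for undefined references, duplicate labels, and overfull boxes could be sketched roughly as follows. This is not the code in tasks.py; the function name scan_log, the dictionary SUMMARY_WARNINGS, and the 5pt default threshold are assumptions made for illustration. It also shows one possible way to ignore low-severity overfull boxes by reading the reported width.

```python
import re

# Summary lines quoted in this issue; LaTeX prints them near the end of the log.
SUMMARY_WARNINGS = {
    "undefined_references": "LaTeX Warning: There were undefined references",
    "duplicate_labels": "LaTeX Warning: There were multiply-defined labels",
}

# Overfull boxes report their excess width in points, for example:
#   Overfull \hbox (12.5pt too wide) in paragraph at lines 10--12
OVERFULL_HBOX = re.compile(r"Overfull \\hbox \((\d+(?:\.\d+)?)pt too wide\)")


def scan_log(log_text, min_overfull_pt=5.0):
    """Return (flags, widths).

    flags:  names of the summary warnings present in the log.
    widths: overfull \\hbox widths (pt) at or above min_overfull_pt.
    The 5pt default is an arbitrary illustrative threshold.
    """
    flags = [name for name, needle in SUMMARY_WARNINGS.items() if needle in log_text]
    widths = [float(m.group(1)) for m in OVERFULL_HBOX.finditer(log_text)]
    return flags, [w for w in widths if w >= min_overfull_pt]


sample = (
    "Overfull \\hbox (1.2pt too wide) in paragraph at lines 3--4\n"
    "Overfull \\hbox (17.0pt too wide) in paragraph at lines 9--11\n"
    "LaTeX Warning: There were undefined references.\n"
)
print(scan_log(sample))  # (['undefined_references'], [17.0])
```

Filtering on the reported width keeps the 1.2pt box out of the author-facing list while still surfacing the 17pt one; where to put the cutoff is a policy choice.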

kmccurley commented 1 year ago

The generic LaTeX warnings are in latex.ltx and include things like

\@latex@warning@no@line{##2 has been converted to Blank ##3e}

Warnings emitted through \@latex@warning can be found by searching the log for "LaTeX Warning:". As mentioned in the thread, it's not clear which of these are important to us, but we'd like to automate catching as many of them as possible to reduce the burden on copy editing.
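
As a rough sketch of that kind of search, the following collects the generic "LaTeX Warning:" lines and counts warnings per package using the "Package <name> Warning:" pattern quoted earlier in the thread. The names collect_warnings, LATEX_WARNING, and PACKAGE_WARNING are made up for this example, and only the first line of each warning is captured (continuation lines are ignored).

```python
import re
from collections import Counter

# Generic warnings issued by the LaTeX kernel.
LATEX_WARNING = re.compile(r"^LaTeX Warning: (.*)$", re.MULTILINE)
# Pattern quoted above: Package <Name of the package> Warning: <message text>
PACKAGE_WARNING = re.compile(r"^Package (\S+) Warning: (.*)$", re.MULTILINE)


def collect_warnings(log_text):
    """Return (latex_messages, per_package_counts) for one log."""
    latex_messages = [m.group(1) for m in LATEX_WARNING.finditer(log_text)]
    per_package = Counter(m.group(1) for m in PACKAGE_WARNING.finditer(log_text))
    return latex_messages, per_package
```

Counting per package makes it easy to see which packages are merely chatty (many hits, usually harmless) before deciding which messages deserve the author's attention.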

kmccurley commented 1 year ago

Appendix B of The LaTeX Companion attempts to catalog all LaTeX and TeX warnings. One of the things I found there is

Missing character: There is no <char> in font <name>!

We should catch this, but it will require running a regular expression over the log (e.g., re.search(), since re.match() only matches at the start of the string) to find the missing character. I also found that Underfull \vbox is possible, but I've never seen one.
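
A minimal sketch of such a regular expression, assuming the message appears on a single log line (TeX may wrap long lines, which this does not handle). The names MISSING_CHAR and find_missing_characters are hypothetical. The match is case-insensitive so that either capitalization of the message is accepted.

```python
import re

# Message cataloged in The LaTeX Companion:
#   Missing character: There is no <char> in font <name>!
MISSING_CHAR = re.compile(
    r"Missing character: There is no (?P<char>.+?) in font (?P<font>.+?)!",
    re.IGNORECASE,
)


def find_missing_characters(log_text):
    """Return a list of (char, font) pairs for every dropped character."""
    return [(m.group("char"), m.group("font")) for m in MISSING_CHAR.finditer(log_text)]
```

Using finditer() rather than a single match reports every dropped character, which matters if lualatex silently drops several different glyphs in one run.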

kmccurley commented 11 months ago

There is now a LaTeX log parser in webapp/log_parser.py that generates metadata/db_models::CompileError objects that are stored in the database. These are shown to authors and can be used by copy editors to raise issues. Some are escalated to fatal errors that require correction even if the LaTeX compiles (e.g., missing bibtex entry), but some are simply warnings to the author (e.g., overfull \hbox).
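
To illustrate the idea of escalating some log messages and not others, here is a small sketch. It is not the code in webapp/log_parser.py: Severity, LogIssue, RULES, and classify are hypothetical names, and LogIssue is only a stand-in for the real CompileError model, whose fields are not described in this thread. The two rules mirror the two examples given above.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "warning"  # shown to the author, no correction required
    FATAL = "fatal"      # must be corrected even though the paper compiled


@dataclass
class LogIssue:
    """Hypothetical stand-in for a stored record; not the real CompileError."""
    severity: Severity
    text: str


# (pattern, severity) pairs; first matching rule wins for each log line.
RULES = [
    (re.compile(r"LaTeX Warning: Citation `[^']*'.* undefined"), Severity.FATAL),
    (re.compile(r"Overfull \\hbox"), Severity.WARNING),
]


def classify(log_text):
    """Return a LogIssue for every log line that matches a rule."""
    issues = []
    for line in log_text.splitlines():
        for pattern, severity in RULES:
            if pattern.search(line):
                issues.append(LogIssue(severity, line.strip()))
                break
    return issues
```

Keeping the rules in a table like this means a message can be promoted or demoted by editing one entry rather than the parsing logic.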