IACR / latex-submit

Web server to receive uploaded LaTeX and execute it in a docker container.
GNU Affero General Public License v3.0

need bibtex parser like log_parser.py to improve error reporting #61

Closed kmccurley closed 6 months ago

kmccurley commented 8 months ago

When an author has a malformed bibtex file, latexmk will fail with an error code, but we don't parse the output to tell the author what the problem is. I think it would help to have a bibtex log parser much like the latex log_parser.py so we can show the authors where their problem is.

kmccurley commented 8 months ago

I have added some capability for this. The code for the BibTexLogParser is in log_parser.py (with tests in log_parser_test.py and testdata/biblogs). I found a JavaScript parser for bibtex logs, but it only parses a few of the errors from the logs.

There is no defined format for the .blg file, so I had to base the parser on patterns learned from observing log files and on the original source for the bibtex binary, which is written in the WEB language. The build process for bibtex apparently goes from WEB to C and then gets compiled to a binary for the platform. The strings emitted by the bibtex binary are located in the .web file.

I have not attempted to capture all error messages from bibtex, and it's not clear what constitutes an error and what constitutes a warning. For example, a \cite that has no corresponding bibtex entry does not generate an "error" from bibtex, but a bad author field with "Too many commas" is counted as an error. There is sometimes a line at the end that says (There were 4 error messages) or (There were 3 warnings), but these are suspect because I sometimes get different counts. It's theoretically possible for errors to be generated by something in the .bst file, but we use a very stable alphaurl style that should not generate errors.

There are corroborating error messages from both parsing the .log file of latex and parsing the bibtex itself in meta_parse.py. These can find things like missing references, missing required fields, etc.
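For readers who want a feel for the pattern-based approach described above, here is a minimal sketch. It is not the BibTexLogParser from log_parser.py; the names `BibMessage` and `parse_blg` are hypothetical, and the three message shapes it recognizes (a `Warning--` prefix, a `---line N of file F` suffix, and a `while executing---line N of file F` trailer that follows the message it locates) are assumptions based on commonly seen .blg output, not a complete or authoritative list. It skips the trailing summary line rather than trusting its counts.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed message shapes; the .blg file has no defined format.
WARNING_RE = re.compile(r"^Warning--(?P<text>.*)$")
LOCATION_RE = re.compile(
    r"^(?P<text>.*?)---line (?P<line>\d+) of file (?P<file>\S+)$"
)
SUMMARY_RE = re.compile(
    r"^\(There (?:was|were) \d+ (?:error message|warning)s?\)$"
)


@dataclass
class BibMessage:
    """Hypothetical record for one message found in a .blg file."""
    severity: str                   # "warning" or "error"
    text: str
    filename: Optional[str] = None
    line: Optional[int] = None


def parse_blg(log_text: str) -> List[BibMessage]:
    """Collect warnings and located errors from bibtex log text."""
    messages: List[BibMessage] = []
    prev = ""               # previous non-empty line
    prev_recorded = False   # whether that line already produced a message
    for raw in log_text.splitlines():
        line = raw.strip()
        # Ignore blank lines and the summary line (its counts are not trusted).
        if not line or SUMMARY_RE.match(line):
            continue
        recorded = False
        warning = WARNING_RE.match(line)
        located = LOCATION_RE.match(line)
        if warning:
            messages.append(BibMessage("warning", warning.group("text")))
            recorded = True
        elif located:
            where = (located.group("file"), int(located.group("line")))
            text = located.group("text")
            if text == "while executing":
                # The trailer locates the message on the previous line.
                if prev_recorded:
                    messages[-1].filename, messages[-1].line = where
                else:
                    messages.append(BibMessage("error", prev, *where))
            else:
                messages.append(BibMessage("error", text, *where))
            recorded = True
        prev, prev_recorded = line, recorded
    return messages
```

Lines that match none of the assumed shapes are silently ignored here, which mirrors the point above that not every bibtex message is captured. Treating every located, non-warning message as an "error" is a simplification for the sketch.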

kmccurley commented 6 months ago

Closing this as mostly working. Most things should be warnings rather than errors, to follow bibtex behavior.