dginev / ar5iv

A web service offering HTML5 articles from arXiv.org as converted with latexml
https://ar5iv.org
MIT License

Improve article 1001.1015 #348

Open jfine2358 opened 1 year ago

jfine2358 commented 1 year ago

Exact location of issue

https://arxiv.org/abs/1001.1015
https://ar5iv.labs.arxiv.org/html/1001.1015
https://ar5iv.labs.arxiv.org/log/1001.1015

This arises from #347. I used https://ar5iv.labs.arxiv.org/feeling_lucky to retrieve this article.

I report three issues. For clarity, this is for information only. I'm not requesting that these issues be fixed.

  1. Date mismatch.

    • HTML gives (December 15, 2022) as the article date.
    • PDF gives (Dated: April 19, 2022).
    • PDF side banner and abstract page give Mon, 8 Mar 2010.
    • TeX source gives \date{\today}.
  2. HTML shows Figure 1 as a black box.

  3. Use of $$ ... $$, repeated about 20 times in the LaTeX source, gives rise to a single MathParser error.

    Warning:not_parsed:UNKNOWN.UNKNOWN.CLOSE>OPEN MathParser failed to match rule 'Anything'
     at PREfinal.tex; line 129 col 0 - line 131 col 2
     In "$$(\rho u)[(e+(P_{xx}/\rho)+(u^{2}/2)]_{x}+Q_{x}=$$"
     ([[OPEN]] ρ[[UNKNOWN]] u[[UNKNOWN]] )[[CLOSE]]
      > [[[OPEN]] ([[OPEN]] e[[UNKNOWN]] +[[ADDOP]] ([[OPEN]] P[[UNKNOWN]] _{xx}[[POSTSUBSCRIPT]] /[[MULOP]] ρ[[UNKNOWN]] )[[CLOSE]] +[[ADDOP]] ([[OPEN]] u[[UNKNOWN]] ^{2}[[POSTSUPERSCRIPT]] /[[MULOP]] 2[[NUMBER]] )[[CLOSE]] ][[CLOSE]] _{x}[[POSTSUBSCRIPT]] +[[ADDOP]] Q[[UNKNOWN]] _{x}[[POSTSUBSCRIPT]] =[[RELOP]]

Further information

The values of \year, \month, and \day can be set on TeX's command line, overriding the system-provided values. This is one way to resolve (1) above.
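
As a rough illustration of the command-line override, the sketch below builds an argument list that assigns \year, \month and \day before the source is read. The helper name `tex_argv`, the choice of `pdflatex`, and the use of the abstract-page date (8 Mar 2010) are assumptions made for the example; only the file name PREfinal.tex comes from the log above. It relies on the standard TeX behaviour that a first argument starting with a backslash is treated as input text.

```python
import datetime


def tex_argv(tex_file, date, engine="pdflatex"):
    """Build an argv that pins TeX's date registers before loading tex_file.

    Hypothetical helper: the engine name and calling convention are
    assumptions, not something ar5iv is known to use.
    """
    # TeX reads a first argument beginning with a backslash as input text,
    # so the integer registers are assigned before \input runs.
    first_line = r"\year=%d \month=%d \day=%d \input{%s}" % (
        date.year, date.month, date.day, tex_file)
    return [engine, first_line]


if __name__ == "__main__":
    # Date taken from the abstract page quoted in item 1 of the report.
    print(tex_argv("PREfinal.tex", datetime.date(2010, 3, 8)))
```

Passing the result to `subprocess.run` as a list avoids shell quoting of the backslashes. With these registers set, \date{\today} would typeset the submission date rather than the conversion date.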

dginev commented 1 year ago

Thanks for the report!

  1. Indeed, the \date{\today} issue is a general problem with re-typesetting arXiv articles. I have considered disabling \today in ar5iv entirely to avoid this mishap; that would be the easiest improvement.

    • It would be a lot more involved to cross-reference the date of submission for a given source and re-bind \today for each individual article. It is technically possible, though, as an add-on metadata service that runs before the main conversion pass (for both PDF and HTML).
  2. Figure 1 is a PostScript file which appears to have failed to convert with our use of ImageMagick.

  3. The quoted expression is indeed not parsed by the current grammar. Zooming in on the details, one spots:

    [(e+(...)+(...)]

    where the leftmost paren before the e is never closed, making the full expression unbalanced and ill-formed.
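
The imbalance can be confirmed mechanically. Below is a minimal sketch of a stack-based delimiter check run on the expression quoted in the log. The function name `first_unbalanced` is made up for this example, and the check only looks at ( ) [ ]; it is not how the LaTeXML MathParser works.

```python
# Expression copied from the MathParser warning, without the $$ delimiters.
EXPR = r"(\rho u)[(e+(P_{xx}/\rho)+(u^{2}/2)]_{x}+Q_{x}="


def first_unbalanced(expr):
    """Return (index, char) of the first offending delimiter, or None.

    Only round and square brackets are checked; TeX braces are ignored.
    """
    pairs = {")": "(", "]": "["}
    stack = []  # (index, opener) for every opener still waiting to close
    for i, ch in enumerate(expr):
        if ch in "([":
            stack.append((i, ch))
        elif ch in pairs:
            if not stack:
                return (i, ch)  # closer with nothing open
            if stack[-1][1] != pairs[ch]:
                return stack[-1]  # most recent opener was never closed
            stack.pop()
    return stack[-1] if stack else None


if __name__ == "__main__":
    print(first_unbalanced(EXPR))
```

For the quoted expression this points at the ( immediately before the e. Adding one ) before the ] makes the check return None.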

I will keep the issue open at least until we can resolve the \today question in some way.
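
For reference, the per-article re-binding could look roughly like the sketch below, which turns a submission date into a LaTeX line that redefines \today. The helper name `today_override`, the English date format, and the idea of prepending the line to the source are all assumptions; the thread does not say how ar5iv would look up the date or inject the override.

```python
import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def today_override(submitted):
    """Return a LaTeX line that fixes \\today to the given date.

    Hypothetical helper; assumes \\today is already defined when the
    line is processed, as it is in the standard LaTeX classes.
    """
    return r"\renewcommand{\today}{%s %d, %d}" % (
        MONTHS[submitted.month - 1], submitted.day, submitted.year)


if __name__ == "__main__":
    # Submission date from the arXiv abstract page cited in the report.
    print(today_override(datetime.date(2010, 3, 8)))
```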