andras-simonyi / citeproc-el

A CSL 1.0.2 Citation Processor for Emacs.
GNU General Public License v3.0

False html rendering of bibliography with embedded latex #108

Open hatlafax opened 2 years ago

hatlafax commented 2 years ago

Hi,

I am using org-ref for citation handling and export my Org files to HTML. The following BibTeX entry does not render as expected:

    @article{PhysRevB.57.1292,
      title = {Determination of the phase difference between the Raman tensor elements of the \({A}_{1g}\)-like phonons in \({\mathrm{SmBa}}_{2}{\mathrm{Cu}}_{3}{\mathrm{O}}_{7-\delta}\)},
      author = {Strach, T. and Brunen, J. and Lederle, B. and Zegenhagen, J. and Cardona, M.},
      journal = {Phys. Rev. B},
      volume = {57},
      issue = {2},
      pages = {1292--1297},
      numpages = {0},
      year = {1998},
      month = {Jan},
      publisher = {American Physical Society},
      doi = {10.1103/PhysRevB.57.1292},
      url = {https://link.aps.org/doi/10.1103/PhysRevB.57.1292}
    }

The embedded LaTeX part is not processed correctly.

I use the following minimal org file

    #+HTML_MATHJAX: align: left indent: 5em tagside: left font: Neo-Euler
    #+HTML_MATHJAX: cancel.js noErrors.js
    #+csl-style: apa-5th-edition.csl
    #+csl-locale: en-US

    [[cite:&PhysRevB.57.1292]]

    [[bibliography:c:/Users/Joe/Dropbox/emacs/psimacs/emacs/content/bibliography/bibliography.bib]]

The problem is that the curly braces in '\({A}_{1g}\)-like phonons in \({\mathrm{SmBa}}_{2}{\mathrm{Cu}}_{3}{\mathrm{O}}_{7-\delta}\)' are somehow filtered out, and the resulting string is no longer recognized by MathJax.

If I export to org as an intermediate step I can see that the braces are gone:

<>Strach, T., Brunen, J., Lederle, B., Zegenhagen, J., & Cardona, M. (1998). Determination of the phase difference between the raman tensor elements of the (A_1g)-like phonons in (SmBa_2Cu_3O_7-). /Phys. rev. b/, /57/, 1292–1297. American Physical Society. Retrieved from https://link.aps.org/doi/10.1103/PhysRevB.57.1292

Is this a problem in citeproc-el, or am I stretching the framework too much? What can I do to export this kind of bibliography entry correctly to HTML?

Any help or comment is appreciated. Best hatlafax

hatlafax commented 2 years ago

Hi, the following is a hack that I have installed in my configuration to work around the described problem. I advise the function citeproc-bt--process-brackets so that it recognizes the embedded LaTeX math, replaces each occurrence with a dummy string, and, after the original citeproc-bt--process-brackets function has run, restores the original math strings. I can now export directly to LaTeX, and I can export to an Org buffer and subsequently export that buffer to HTML, with correct math rendering in the resulting documents.

The code:

    (use-package org-ref
      ;;:straight nil
      :after org
      :init
      ...
      (require 'queue)

      (defun psimacs/config/citeproc-bt--process-brackets (fn &rest args)
        "Around advice for `citeproc-bt--process-brackets' that preserves embedded LaTeX math."
        (let* ((result (car args))
               (lhb    (nth 1 args))
               (rhb    (nth 2 args))
               ;; Matches inline math delimited by \( ... \), non-greedily.
               (formula-rx (rx "\\(" (group (*? anything)) "\\)"))
               (value nil)
               (q (make-queue)))

          ;; Replace each math fragment with a placeholder and remember it.
          (while (string-match formula-rx result)
            (queue-enqueue q (match-string 0 result))
            (setq result (replace-match "<<QUEUED-MATCH>>" t t result)))

          ;; Let the original function process the brackets outside the math.
          (setq result (funcall fn result lhb rhb))

          ;; Restore the math fragments in their original order.
          (while (string-match "<<QUEUED-MATCH>>" result)
            (setq value (queue-dequeue q)
                  result (replace-match (concat lhb value rhb) t t result)))
          result))

      (advice-add 'citeproc-bt--process-brackets :around #'psimacs/config/citeproc-bt--process-brackets)

Hope this is of help to someone. Best, hatlafax
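The protect-and-restore idea behind the advice can be tried in isolation. The following is only a minimal sketch, not citeproc-el code: `my/strip-braces` is a hypothetical stand-in for a brace-removing step, and `my/call-protecting-math` is an illustrative helper that uses numbered placeholders and a vector instead of the `queue` package.

```elisp
;;; -*- lexical-binding: t -*-

(defun my/strip-braces (s)
  "Hypothetical stand-in for a brace-removing step: delete every { and } in S."
  (replace-regexp-in-string "[{}]" "" s))

(defun my/call-protecting-math (fn s)
  "Call FN on S while shielding inline math fragments from FN.
Each fragment delimited by backslash-paren pairs is swapped for a
numbered placeholder, FN runs on the result, and the fragments are
then put back unchanged."
  (let ((fragments '())
        (count 0))
    ;; Pass 1: swap each math fragment for a unique placeholder.
    (setq s (replace-regexp-in-string
             "\\\\(\\(?:.\\|\n\\)*?\\\\)"
             (lambda (m)
               (push m fragments)
               (prog1 (format "<<MATH-%d>>" count)
                 (setq count (1+ count))))
             s t t))
    (setq fragments (vconcat (nreverse fragments)))
    ;; Pass 2: run the wrapped function on the math-free string.
    (setq s (funcall fn s))
    ;; Pass 3: restore each fragment by its index.  LITERAL is t so
    ;; that backslashes in the math are inserted verbatim.
    (replace-regexp-in-string
     "<<MATH-\\([0-9]+\\)>>"
     (lambda (m)
       (aref fragments (string-to-number (match-string 1 m))))
     s t t)))

;; Usage: braces outside the math are removed, braces inside survive.
(my/call-protecting-math #'my/strip-braces
                         "the \\({A}_{1g}\\)-like {Raman} modes")
;; => "the \\({A}_{1g}\\)-like Raman modes"
```

Numbered placeholders make the restore step independent of the order in which matches are found, at the cost of assuming that the wrapped function leaves text of the form `<<MATH-n>>` untouched.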

hatlafax commented 2 years ago

One final remark: I had to adapt the embedded LaTeX math expression in the bibliography a little to make it work. Originally I got the following BibTeX title entry:

    title = {Determination of the phase difference between the Raman tensor elements of the ${A}_{1g}$-like phonons in ${\mathrm{SmBa}}_{2}{\mathrm{Cu}}_{3}{\mathrm{O}}_{7\ensuremath{-}\ensuremath{\delta}}$},

Three modifications are necessary:

This results in the following working BibTeX title entry:

    title = {Determination of the phase difference between the {Raman} tensor elements of the \({{A}_{1g}}\)-like phonons in \({{\mathrm{SmBa}}_{2}{\mathrm{Cu}}_{3}{\mathrm{O}}_{7-\delta}}\)},

This is not perfect, but it works for my cases :-)

andras-simonyi commented 2 years ago

Hello, thanks for reporting, and sorry for responding only now! I'm not sure it's possible to come up with a significantly less hackish solution, since the CSL standard itself doesn't support embedded LaTeX math formatting. As far as I can see, the most that could be achieved would be supporting superscripts and subscripts (which are included in CSL), which might be enough for a large part of the use cases, though. Of course, one could try to extend CSL to allow embedded LaTeX math, but that would require support in the formatter backends as well, which is far from trivial, to say the least. However, @bdarcus and @denismaier may be able to add something more substantial: has there been a discussion on supporting embedded (LaTeX) math in CSL?

bdarcus commented 2 years ago

has there been a discussion on supporting embedded (LaTeX) math in CSL?

Yes, but no resolution. It's complicated, given the diverse contexts CSL is used in, including word processors.

See here for discussion, and linked PR.

https://discourse.citationstyles.org/t/rfc-rich-text-for-csl-json-input-format/1672

The essence of the idea is expressed in this schema.

https://github.com/citation-style-language/schema/blob/v1.1/schemas/input/csl-rich-text.yaml

Please weigh in on this proposal, @andras-simonyi; we need input from implementers.