andras-simonyi / citeproc-el

A CSL 1.0.2 Citation Processor for Emacs.
GNU General Public License v3.0

Fix for latex bibliographies that have urls with '#' in them #166

Closed: Risto-Stevcev closed this 5 months ago

Risto-Stevcev commented 5 months ago

The LaTeX output wasn't escaping # in URLs, which caused pdflatex (the LaTeX-to-PDF processor) to error out. This only seems to arise with \hypertarget{...}{...\url{...}} formatting, where the URL inside \url contains a #. I couldn't work around it by URL-encoding the URL for this resource because the ISO site doesn't accept that, so I had to dig a little more to find out how to fix it properly.

Example test.org file:

#+title: Example org file
#+cite_export: csl ./chicago-fullnote-bibliography.csl
#+bibliography: ./references.bib

- Prolog ISO Standard [cite:@iso_prolog]

* Bibliography

#+print_bibliography: t

Example references.bib file:

@techreport{iso_prolog,
  title = {Prolog — Part 1: General core},
  shorttitle = {{ISO}/{IEC} 13211-1:1995},
  url = {https://www.iso.org/obp/ui/#iso:std:iso-iec:13211:-1:ed-1:v1:en},
  language = {en},
  number = {ISO/IEC 13211-1:1995},
  institution = {International Organization for Standardization},
  author = {{ISO Information Technology}},
  year = {1995}
}

Risto-Stevcev commented 5 months ago

I'm not sure this is a proper fix, though, because I'm now getting errors with this earlier URL that I had URL-encoded: https://docs.github.com/en/rest/commits/commits?apiVersion%3D2022-11-28%23list-commits--parameters

The patch escapes it as: \hypertarget{citeproc_bib_item_14}{[14] “Github api, list commits.” Available: \url{https://docs.github.com/en/rest/commits/commits?apiVersion\\%3D2022-11-28\\%23list-commits--parameters}}

This makes pdflatex fail. I'm not familiar enough with this library to know what the right fix is.

Risto-Stevcev commented 5 months ago

It looks like only the # sign was problematic; %, ?, and & seem to be handled fine by default, so I updated the code to escape only the #. Let me know if I can update the code to be more idiomatic for the library.

andras-simonyi commented 5 months ago

Hello, thanks a lot for the bug report and the patch. As there was already a piece of code dealing with escaping the % character in URIs for LaTeX output, I extended it to handle the # character as well instead of using your code directly; I hope that isn't a problem. The fix has now been merged (08f988e32fa53dfca363bd3d2b3e0a70f936e3b6) and hopefully resolves all URI-escaping-related LaTeX output problems.
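
For readers who want to see the idea in isolation, here is a minimal sketch of the escaping described above, written in Python for illustration only. It is not citeproc-el's actual code (the library is Emacs Lisp, and the merged commit may work differently); the names escape_url_for_latex and bib_item are hypothetical. It assumes the URL ends up inside \url{...} nested in another command's argument, such as \hypertarget, where # and % have to be written as \# and \%.

```python
import re

# '#' and '%' not already preceded by a backslash. Inside the argument of
# another command (e.g. \hypertarget), \url cannot treat them verbatim.
_UNESCAPED_SPECIALS = re.compile(r"(?<!\\)([#%])")


def escape_url_for_latex(url: str) -> str:
    """Prefix each unescaped '#' or '%' in `url` with a single backslash.

    Hypothetical helper for illustration; not part of citeproc-el.
    """
    return _UNESCAPED_SPECIALS.sub(r"\\\1", url)


def bib_item(target: str, text: str, url: str) -> str:
    """Build a \\hypertarget{...}{... \\url{...}} entry with an escaped URL.

    The layout mirrors the output quoted in this thread; it is an assumption
    about the shape of the entry, not the library's formatting code.
    """
    return "\\hypertarget{%s}{%s \\url{%s}}" % (
        target,
        text,
        escape_url_for_latex(url),
    )


if __name__ == "__main__":
    print(bib_item(
        "citeproc_bib_item_1",
        "Prolog ISO Standard. Available:",
        "https://www.iso.org/obp/ui/#iso:std:iso-iec:13211:-1:ed-1:v1:en",
    ))
```

The negative lookbehind keeps the function from adding a second backslash to a character that is already escaped, so running it twice on the same URL gives the same result as running it once.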

Risto-Stevcev commented 5 months ago

Ok, thanks