uwdata / living-papers

Authoring tools for scholarly communication. Create interactive web pages or formal research papers from markdown source.
BSD 3-Clause "New" or "Revised" License

citation parser isn't general enough (e.g. `@doi:10.1016/S1045-926X(05)80012-6`) #23

Closed by joshuahhh 2 years ago

joshuahhh commented 2 years ago

"10.1016/S1045-926X(05)80012-6" is a valid DOI (see https://doi.org/10.1016/S1045-926X(05)80012-6), but when I put `@doi:10.1016/S1045-926X(05)80012-6` or `[@doi:10.1016/S1045-926X(05)80012-6]` in my document, the system says:

Citation doi lookup failed: 10.1016/S1045-926X

I suppose the parser doesn't support a broad enough range of characters, so it's stopping at the `(`?
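To illustrate the suspected failure mode, here is a minimal sketch. The pattern below is a hypothetical, simplified citation-key matcher written for this example; it is not Pandoc's actual grammar, and `KEY_PATTERN` and `extractCitationKey` are made-up names. It only shows how a character class that omits parentheses would cut the key off at the same place as the error message above.

```javascript
// Hypothetical, simplified citation-key pattern (NOT Pandoc's real grammar).
// It allows letters, digits, underscores, and some punctuation,
// but deliberately omits "(" and ")".
const KEY_PATTERN = /@([A-Za-z0-9_][A-Za-z0-9_:.#$%&+?<>~\/-]*)/;

// Return the citation key found in `text`, or null if there is none.
function extractCitationKey(text) {
  const match = KEY_PATTERN.exec(text);
  return match ? match[1] : null;
}

// The match stops at the first character outside the allowed set.
console.log(extractCitationKey("@doi:10.1016/S1045-926X(05)80012-6"));
// prints: doi:10.1016/S1045-926X
```

The truncated key matches the DOI reported in the failed lookup, which is consistent with the guess that the parser stops at the opening parenthesis.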

jheer commented 2 years ago

We currently use Pandoc’s citation reference parsing directly and so inherit its quirks and limitations.

DOI suffixes are very flexible, often more so than the systems in which they are embedded (including browser URLs). This page is relevant: https://www.doi.org/syntax.html

One idea is to allow URL encoding (e.g., %28 for an opening parenthesis) and have Living Papers perform URL decoding on citation keys. That will help make “difficult” URIs expressible, though the author experience isn’t the best. Still, that may be an acceptable workaround until a parsing solution is available.
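A rough sketch of what that decoding step could look like, assuming citation keys arrive as plain strings. The helper name `decodeCitationKey` is hypothetical and not part of Living Papers, and the choice of `decodeURIComponent` is an assumption for this example.

```javascript
// Hypothetical helper: URL-decode a citation key after parsing.
// decodeURIComponent throws a URIError on malformed escapes (e.g. a
// stray "%"), so in that case the key is returned unchanged.
function decodeCitationKey(key) {
  try {
    return decodeURIComponent(key);
  } catch (err) {
    if (err instanceof URIError) return key;
    throw err;
  }
}

// An author writes the parentheses as %28 and %29 in the source document.
const encoded = "doi:10.1016/S1045-926X%2805%2980012-6";
console.log(decodeCitationKey(encoded));
// prints: doi:10.1016/S1045-926X(05)80012-6
```

Falling back to the original key on malformed input keeps existing keys that happen to contain a literal percent sign working, at the cost of silently skipping the decode for them.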

joshuahhh commented 2 years ago

URL encoding is a good idea. It actually works, without any changes, in the HTML output. But TeX treats percent signs as comment characters, so it breaks in the TeX output. I added some code to my branch to run decodeURI on citation keys as early as possible: ee2a7e67.