jupyter-book / mystmd

Command line tools for working with MyST Markdown.
https://mystmd.org/guide
MIT License

Less aggressive conversion of `link` to `cite` nodes #1629

Open fwkoch opened 2 weeks ago

fwkoch commented 2 weeks ago

When resolving `link` nodes, MyST checks whether they are DOIs and, if so, converts them into `cite` nodes: https://github.com/jupyter-book/mystmd/blob/main/packages/myst-cli/src/transforms/dois.ts#L235-L239 This enables a convenient shorthand for defining citations as links: `[](https://doi.org/10.0000/abc123)`

The DOI matching is quite aggressive, though: if a valid DOI can be extracted from anywhere in the link, the link becomes a `cite` node. For example, I want to reference supplementary material like so: `see supplementary material at [this website](https://www.frontiersin.org/articles/10.3389/fonc.2018.00134/full#supplementary-material)`. I do not want this to become a citation. However, since a DOI is found in the middle of that link, it does become one, and clicking the link then takes you to whatever page doi.org redirects to, which may be a different website and will certainly lack the `#supplementary-material` fragment.
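To make the failure mode concrete, here is a minimal sketch of a "DOI anywhere in the path" matcher. It is an illustration only: `findDoiAnywhere` and `DOI_PATTERN` are hypothetical names, the regular expression is a simplified assumption about what counts as a DOI, and none of this is the actual `doi-utils` code.

```typescript
// Hypothetical illustration of loose DOI matching; NOT the doi-utils implementation.
// Simplified assumption: a DOI is "10.<4-9 digits>/<non-whitespace suffix>".
const DOI_PATTERN = /^10\.\d{4,9}\/\S+$/;

function findDoiAnywhere(link: string): string | undefined {
  // Split the URL path into non-empty segments; the fragment is ignored entirely.
  const segments = new URL(link).pathname.split('/').filter(Boolean);
  // Try every adjacent pair of segments as a "prefix/suffix" DOI candidate.
  for (let i = 0; i < segments.length - 1; i++) {
    const candidate = `${segments[i]}/${segments[i + 1]}`;
    if (DOI_PATTERN.test(candidate)) return candidate;
  }
  return undefined;
}

// The intended shorthand works:
findDoiAnywhere('https://doi.org/10.0000/abc123');
// returns "10.0000/abc123"

// ...but so does the supplementary-material link, which loses its trailing
// "/full" segment and its fragment once it is treated as a citation:
findDoiAnywhere(
  'https://www.frontiersin.org/articles/10.3389/fonc.2018.00134/full#supplementary-material',
);
// returns "10.3389/fonc.2018.00134"
```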

We could address this by allowing users to specify that they do not want a specific link converted to a citation. However, this requires new syntax and extra, unexpected work for users.

I think a better solution would be to scale back the DOI resolver that looks for a valid DOI anywhere in the link: https://github.com/curvenote/doi-utils/blob/main/src/resolvers.ts#L54-L61 Maybe remove that `pathParts` resolver entirely? Or maybe just add a few more rules, for example that there must be no fragment and that the DOI must be at the end of the path?
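As a rough sketch of what the "few more rules" option might look like: the function below rejects any link with a fragment and only accepts a DOI formed by the last two path segments. The names `doiFromLinkStrict` and `STRICT_DOI_PATTERN` are hypothetical, the pattern is a simplified assumption, and this is not a patch against `doi-utils`.

```typescript
// Hypothetical sketch of stricter link-to-DOI rules; names and pattern are assumptions.
const STRICT_DOI_PATTERN = /^10\.\d{4,9}\/\S+$/;

function doiFromLinkStrict(link: string): string | undefined {
  let url: URL;
  try {
    url = new URL(link);
  } catch {
    return undefined; // not an absolute URL
  }
  // Rule 1: a fragment suggests the author wants a specific page location.
  if (url.hash) return undefined;
  const segments = url.pathname.split('/').filter(Boolean);
  if (segments.length < 2) return undefined;
  // Rule 2: the DOI must be exactly the last two path segments.
  const candidate = segments.slice(-2).join('/');
  return STRICT_DOI_PATTERN.test(candidate) ? candidate : undefined;
}

doiFromLinkStrict('https://doi.org/10.0000/abc123');
// returns "10.0000/abc123"

doiFromLinkStrict(
  'https://www.frontiersin.org/articles/10.3389/fonc.2018.00134/full#supplementary-material',
);
// returns undefined, so the link stays a plain link
```

One trade-off of the "last two segments" rule as written: DOIs whose suffix itself contains a `/` would no longer be picked up from a link, so a real change would need to decide whether that case matters.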