ropensci / tinkr

Convert (R)Markdown files to XML, edit them, write them back as (R)Markdown
https://docs.ropensci.org/tinkr
GNU General Public License v3.0

protect text inside of anchor link keys #85

Closed zkamvar closed 1 year ago

zkamvar commented 1 year ago

Whilst planning the transition for Carpentries lessons, I ran into a situation where the anchor links on a page were being mangled:

txt <- "**Maintainer(s):**

* [Sarah Brown][brown_sarah]: @brownsarahm
* [Tim Dennis][dennis_tim]: @jt14den
* [David Perez-Suarez][perez-suarez_david]: @dpshelio
* [Nathaniel Porter][porter-nathaniel]: @ndporter
* [Jon Wheeler][wheeler_jon]: @jonathanwheeler01
* [Karen Word][word_karen]: @karenword  

[dc-site]: http://datacarpentry.org
[lesson-example]: https://carpentries.github.io/lesson-example
[swc-site]: http://software-carpentry.org
[lc-site]: https://librarycarpentry.org
[koch_christina]: https://carpentries.org/instructors/
[brown_sarah]: https://carpentries.org/instructors/ 
[dennis_tim]: https://carpentries.org/instructors/
[perez-suarez_david]: https://carpentries.org/instructors/
[porter-nathaniel]: https://carpentries.org/instructors/
[wheeler_jon]: https://carpentries.org/instructors/
[word_karen]: https://carpentries.org/team/"
tmp <- tempfile()
writeLines(txt, tmp)
tinkr::yarn$new(tmp)$show()
#> **Maintainer(s):**
#> 
#> - [Sarah Brown][wheeler_jon]: @brownsarahm
#> - [Tim Dennis][wheeler_jon]: @jt14den
#> - [David Perez-Suarez][wheeler_jon]: @dpshelio
#> - [Nathaniel Porter][wheeler_jon]: @ndporter
#> - [Jon Wheeler][wheeler_jon]: @jonathanwheeler01
#> - [Karen Word][word_karen]: @karenword
#> 
#> [wheeler\_jon]: https://carpentries.org/instructors/
#> [word\_karen]: https://carpentries.org/team/

Created on 2023-01-25 with reprex v2.0.2

There are actually two issues here:

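  1. anchor links whose definitions point to the same URL are coalesced into the last one (here, every `https://carpentries.org/instructors/` key becomes `wheeler_jon`)
  2. the underscore character is not protected in the anchor link definition, so it is written out escaped (`[wheeler\_jon]`)
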
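
To make the first issue easier to spot in other lessons, here is a minimal sketch in base R that groups anchor link keys by their destination. It is not tinkr code: `find_shared_destinations()` is a hypothetical helper, and the regular expression assumes simple `[key]: url` definitions with no link titles.

```r
# Hypothetical helper (not part of tinkr): list the anchor link keys that
# share a destination URL, i.e. the keys at risk of being coalesced.
find_shared_destinations <- function(lines) {
  # Assumption: definitions look like "[key]: url" with no title attribute
  pattern <- "^\\[([^]]+)\\]:[[:space:]]*([^[:space:]]+)[[:space:]]*$"
  defs <- regmatches(lines, regexec(pattern, lines))
  # Keep only lines that matched (full match + two capture groups)
  defs <- defs[lengths(defs) == 3]
  keys <- vapply(defs, `[`, character(1), 2)
  urls <- vapply(defs, `[`, character(1), 3)
  # Group keys by URL and keep URLs that have more than one key
  by_url <- split(keys, urls)
  by_url[lengths(by_url) > 1]
}

defs <- c(
  "* [Sarah Brown][brown_sarah]: @brownsarahm",
  "[brown_sarah]: https://carpentries.org/instructors/ ",
  "[dennis_tim]: https://carpentries.org/instructors/",
  "[wheeler_jon]: https://carpentries.org/instructors/",
  "[word_karen]: https://carpentries.org/team/"
)
shared <- find_shared_destinations(defs)
print(shared)
```

With the sample lines above, only the `instructors/` URL should be reported, with `brown_sarah`, `dennis_tim`, and `wheeler_jon` as its keys; the list item and the `word_karen` definition are ignored.
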
zkamvar commented 1 year ago

I seem to have forgotten to include relevant context:

https://github.com/carpentries/lesson-transition/issues/15

zkamvar commented 1 year ago

The escaping of the anchor link keys can be fixed in the anchor link template by changing line 48 to <xsl:value-of select='string(.)'/>:

https://github.com/ropensci/tinkr/blob/e99436a81796442c84d7192e89a0b734c4ffff84/inst/stylesheets/xml2md_gfm.xsl#L45-L63

zkamvar commented 1 year ago

I believe this was fixed in #86