Closed opoudjis closed 3 years ago
The NIST Relaton entry contains a smart em-dash, —. We will need to unsmarten that... except that we wouldn't unsmarten Unicode content there, like diacritics...
We have constrained <u> explicitly to children of t, blockquote, li, dd, preamble, td, th, annotation.
We are currently unsmartening only single and double quotes in IETF Metanorma XML. We need to extend that to spaces and dashes.
IETF expects dashes to be dumb, e.g. "Information Processing Systems - Local Area Networks - Part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications, 2nd edition", from https://xml2rfc.tools.ietf.org/public/rfc/bibxml-ieee/reference.IEEE.802-3.1990.xml
You may want to add .gsub(/\u2026/, "...") .gsub(/\u200b/, "") to the list.
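For illustration, the unsmartening discussed above could be collected into one lookup table, roughly as below. This is only a sketch, not the actual metanorma-ietf code: the names UNSMART_MAP and unsmarten are hypothetical, and the exact set of characters and their ASCII replacements (in particular mapping both en and em dashes to a single hyphen, and non-breaking space to a plain space) are assumptions. Only the ellipsis and zero-width space replacements come from the suggestion above.

```ruby
# Hypothetical replacement table; names and most entries are assumptions.
UNSMART_MAP = {
  "\u2018" => "'",   # left single quote
  "\u2019" => "'",   # right single quote
  "\u201c" => '"',   # left double quote
  "\u201d" => '"',   # right double quote
  "\u2013" => "-",   # en dash (assumed mapping)
  "\u2014" => "-",   # em dash (assumed mapping)
  "\u00a0" => " ",   # non-breaking space (assumed mapping)
  "\u2026" => "...", # ellipsis, as suggested in this thread
  "\u200b" => "",    # zero-width space, as suggested in this thread
}.freeze

# One regexp matching any key of the table.
UNSMART_RE = Regexp.union(UNSMART_MAP.keys)

# Replace only the listed characters; other non-ASCII text
# (e.g. diacritics) is left untouched.
def unsmarten(text)
  text.gsub(UNSMART_RE, UNSMART_MAP)
end

puts unsmarten("Local Area Networks \u2014 Part 3")
```

Using a fixed table rather than a blanket non-ASCII filter is what keeps legitimate Unicode such as diacritics intact.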
Good call. Adding.
I also had some issues with "<" since 2.4.2. I fixed that with the following change:
- n.replace(n.text.gsub(/[\u0080-\uffff]/, "<u>\\0</u>"))
+ n.text.gsub!(/[\u0080-\uffff]/, "<u>\\0</u>")
From https://github.com/metanorma/metanorma-ietf/issues/161, we are injecting <u> around Unicode characters. That is triggering a syntax error in an instance of [[[SP800-131A,NIST SP 800-131A]]]. We need to constrain the occurrences of <u> more strictly, and to ensure that there are no smart quotes or dashes in IETF Metanorma XML, including in Relaton content.