PeerJ / jats-conversion

Conversion and validation for JATS XML
MIT License

Should internal DOIs follow the mandatory and recommended encoding? #148

Closed michaelstoner closed 6 years ago

michaelstoner commented 6 years ago

When converting JATS to CrossRef, a DOI such as 10.7717/peerj.2114/table-1] includes a character, ], that is recommended to be encoded.

So when converting DOIs, should we obey the mandatory and recommended encoding for DOI deposit and URLs? This is relevant to "declaring" new DOIs as opposed to referencing existing DOIs.

The encoding is listed here http://www.doi.org/doi_handbook/2_Numbering.html#2.5.2.4

<table-wrap id="table-1">
               <object-id pub-id-type="doi">10.7717/peerj.2114/table-1]</object-id><label>Table 1</label><caption>
                  <title>Pre-feedback descriptives of participants.</title>
               </caption>
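
To make the question concrete, here is a rough sketch of what applying that encoding to the DOI above would look like. It is written in Python purely for illustration, not as the project's XSLT. The two character lists are my reading of the handbook section linked above and should be checked against it; `encode_doi` is a hypothetical helper name.

```python
# Characters flagged for encoding, as I read section 2.5.2.4 of the DOI
# Handbook (assumption: verify against the linked page before relying on it).
MANDATORY = '%"# ?'
RECOMMENDED = '<>{}^[]`|\\+'


def encode_doi(doi: str) -> str:
    """Percent-encode every flagged character in a raw (unencoded) DOI.

    Hypothetical helper: it assumes the input has NOT been encoded yet,
    so a literal '%' is always turned into '%25'.
    """
    flagged = set(MANDATORY + RECOMMENDED)
    return "".join(f"%{ord(ch):02X}" if ch in flagged else ch for ch in doi)


print(encode_doi("10.7717/peerj.2114/table-1]"))  # 10.7717/peerj.2114/table-1%5D
```

Note that this only behaves correctly if the input is known to be unencoded; running it twice would turn %5D into %255D.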

I'm going to branch and commit a proposed solution. Note that it will let the % symbol through unencoded, even though % must be escaped: % is itself the escape character, and without updating the project to XSLT 2.0 (and so gaining regex support) I think it's a little difficult to verify that every occurrence of % belongs to an already-encoded character.
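
For comparison, this is the kind of check that becomes easy once regular expressions are available. The sketch is in Python only to illustrate the idea and is not a proposal for the stylesheet itself; `has_bare_percent` is a hypothetical name.

```python
import re

# A '%' that is not followed by two hex digits cannot be the start of a
# percent-escape, so it must be a literal '%' that still needs encoding.
BARE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_bare_percent(doi: str) -> bool:
    """Return True if the DOI contains a '%' that is not part of %XX."""
    return BARE_PERCENT.search(doi) is not None


print(has_bare_percent("10.7717/peerj.2114/table-1%5D"))  # False
print(has_bare_percent("10.1000/50%off"))                 # True
```

Even this is only a heuristic: a raw DOI that happens to contain a literal % followed by two hex digits is indistinguishable from an encoded one, so the check cannot be a complete fix on its own.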

michaelstoner commented 6 years ago

The consensus is that this is much better suited to a Schematron check than to an XSLT rule. Similar issues will occur in many places, and even for this small issue the current solution is not a complete fix.