metanorma / mnconvert

Metanorma converter

Add the official NISO STS XML sample conversion #407

Closed Intelligent2013 closed 5 months ago

Intelligent2013 commented 7 months ago

Add the official NISO STS XML sample https://www.niso-sts.org/downloadables/samples/NISO-STS-Standard-1-0.XML (linked from the page https://www.niso-sts.org/Samples.html) to the ~JUnit tests~ Makefile.

Intelligent2013 commented 7 months ago

Metanorma Adoc: NISO-STS-Standard-1-0.adoc.zip

The issues:

include::sections/02-normrefs.adoc[]

include::sections/06-references.adoc[]

~based on Metanorma requirements, and need to be adapted manually after conversion.~

- [x] from `07-definitions.adoc`:
```adoc
[[sec_7]]
== Definitions

ANSIAmerican National Standards Institute. ANSI is a private nonprofit organization that oversees the development of voluntary consensus standards in the U.S.

ASCIIASCII is a character-encoding scheme based on the English alphabet. It defines the encoding for 128 characters including A-Z, a-z, 0-9, some symbols, and some control characters.
```

from XML:

        <sec id="sec_7" sec-type="definitions">
            <label>7</label>
            <title>Definitions</title>
            <term-sec>
                <term-display>
                    <term>ANSI</term>
                    <def>
                        <p>American National Standards Institute. ANSI is a private nonprofit organization that oversees the development of voluntary consensus standards in the U.S.</p>
                    </def>
                </term-display>
            </term-sec>
            <term-sec>
                <term-display>
                    <term>ASCII</term>
                    <def>
                        <p>ASCII is a character-encoding scheme based on the English alphabet. It defines the encoding for 128 characters including A-Z, a-z, 0-9, some symbols, and some control characters.</p>
                        <p>It is sometimes used to refer to &#x201C;plain text,&#x201D; i.e., text without special characters or equations.</p>
                    </def>
                </term-display>
            </term-sec>

mnconvert doesn't know about this practice of using `term-sec/term-display/(term|def)`, so the term and its definition are run together in the output. I'll add the conversion rules.
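
For illustration only, here is a minimal Python sketch of the kind of rule that is missing: it keeps each `term` separate from the paragraphs of its `def`. It is not mnconvert's implementation. The function name `terms_to_adoc`, the shortened sample text, and the choice to emit one `===` subclause per term are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the structure quoted above (texts abbreviated for the example).
SAMPLE = """<sec id="sec_7" sec-type="definitions">
  <label>7</label>
  <title>Definitions</title>
  <term-sec>
    <term-display>
      <term>ANSI</term>
      <def>
        <p>American National Standards Institute.</p>
      </def>
    </term-display>
  </term-sec>
  <term-sec>
    <term-display>
      <term>ASCII</term>
      <def>
        <p>ASCII is a character-encoding scheme.</p>
        <p>It is sometimes used to refer to plain text.</p>
      </def>
    </term-display>
  </term-sec>
</sec>"""


def terms_to_adoc(sec_xml: str) -> str:
    """Hypothetical helper: render term-sec/term-display as AsciiDoc subclauses."""
    sec = ET.fromstring(sec_xml)
    lines = [f"[[{sec.get('id')}]]", f"== {sec.findtext('title')}", ""]
    for display in sec.iterfind("term-sec/term-display"):
        # The term becomes its own heading instead of being glued to the definition.
        term = display.findtext("term", default="").strip()
        lines += [f"=== {term}", ""]
        # Each <p> inside <def> becomes a separate paragraph.
        for p in display.iterfind("def/p"):
            text = " ".join("".join(p.itertext()).split())
            lines += [text, ""]
    return "\n".join(lines).rstrip() + "\n"


print(terms_to_adoc(SAMPLE))
```

With this separation the output no longer contains fused strings such as `ANSIAmerican`; whether `===` subclauses are the right target markup for a `sec-type="definitions"` section is a separate decision.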

Metanorma XML: NISO-STS-Standard-1-0.mn.zip

The issues:

Notes:

Intelligent2013 commented 6 months ago

@ronaldtse what is the main goal of converting the 'NISO STS XML tagged version of the standard' into Metanorma Adoc and XML? Do we need round-trip conversion?

ronaldtse commented 6 months ago

@Intelligent2013 we want to be able to handle the elements produced in the sample in Metanorma. We don't need a round trip.

Intelligent2013 commented 6 months ago

The attribute `:docnumber: Z39.102` causes the following error when compiling with the `iso` flavor:

bundle exec metanorma -t iso -x presentation test.adoc
Fatal Error: Failed to match sequence (stage:'Fpr'? 'WD/'? (type:GUIDE_PREFIX SPACE)? (stage:STAGE SPACE)? (stage:TYPED_STAGE SPACE)? (ORIGINATOR (SPACE / '/'))? (TC_DOCUMENT_BODY / STD_DOCUMENT_BODY / DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?)) at line 1 char 1.
cause: Failed to match sequence (stage:'Fpr'? 'WD/'? (type:GUIDE_PREFIX SPACE)? (stage:STAGE SPACE)? (stage:TYPED_STAGE SPACE)? (ORIGINATOR (SPACE / '/'))? (TC_DOCUMENT_BODY / STD_DOCUMENT_BODY / DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?)) at line 1 char 1.
`- Expected one of [TC_DOCUMENT_BODY, STD_DOCUMENT_BODY, DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?] at line 1 char 1.
   |- Failed to match sequence ((tctype:TCTYPE '/'?){0, } SPACE tcnumber:DIGITS ('/' ((sctype:SCTYPE SPACE scnumber:DIGITS '/')? wgtype:WGTYPE SPACE wgnumber:DIGITS / sctype:SCTYPE (SPACE / '/' wgtype:WGTYPE SPACE) scnumber:DIGITS))? SPACE 'N' SPACE? number:DIGITS) at line 1 char 1.
   |  `- Expected " ", but got "Z" at line 1 char 1.
   |- Failed to match sequence ((TYPE / stage:STAGE iteration:DIGITS?)? SPACE? ((stage:STAGE / stage:TYPED_STAGE / TYPE) SPACE)? number:DIGITS ('|' joint_document:(publisher:'IDF' SPACE number:DIGITS))? PART? ITERATION? (SPACE? (':' / DASH) YEAR)? SUPPLEMENT? EXTRACT? ADDENDUM? EDITION? LANGUAGE?) at line 1 char 1.
   |  `- Expected at least 1 of \\d at line 1 char 1.
   |     `- Failed to match \\d at line 1 char 1.
   `- Failed to match sequence (DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?) at line 1 char 1.
      `- Extra input after last repetition at line 1 char 1.
         `- Failed to match sequence (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY)) at line 1 char 1.
            `- Expected " + ", but got "Z39" at line 1 char 1.
C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/pubid-core-1.12.5/lib/pubid/core/identifier/base.rb:166:in `rescue in parse': Failed to match sequence (stage:'Fpr'? 'WD/'? (type:GUIDE_PREFIX SPACE)? (stage:STAGE SPACE)? (stage:TYPED_STAGE SPACE)? (ORIGINATOR (SPACE / '/'))? (TC_DOCUMENT_BODY / STD_DOCUMENT_BODY / DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?)) at line 1 char 1. (Pubid::Core::Errors::ParseError)
cause: Failed to match sequence (stage:'Fpr'? 'WD/'? (type:GUIDE_PREFIX SPACE)? (stage:STAGE SPACE)? (stage:TYPED_STAGE SPACE)? (ORIGINATOR (SPACE / '/'))? (TC_DOCUMENT_BODY / STD_DOCUMENT_BODY / DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?)) at line 1 char 1.
`- Expected one of [TC_DOCUMENT_BODY, STD_DOCUMENT_BODY, DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?] at line 1 char 1.
   |- Failed to match sequence ((tctype:TCTYPE '/'?){0, } SPACE tcnumber:DIGITS ('/' ((sctype:SCTYPE SPACE scnumber:DIGITS '/')? wgtype:WGTYPE SPACE wgnumber:DIGITS / sctype:SCTYPE (SPACE / '/' wgtype:WGTYPE SPACE) scnumber:DIGITS))? SPACE 'N' SPACE? number:DIGITS) at line 1 char 1.
   |  `- Expected " ", but got "Z" at line 1 char 1.
   |- Failed to match sequence ((TYPE / stage:STAGE iteration:DIGITS?)? SPACE? ((stage:STAGE / stage:TYPED_STAGE / TYPE) SPACE)? number:DIGITS ('|' joint_document:(publisher:'IDF' SPACE number:DIGITS))? PART? ITERATION? (SPACE? (':' / DASH) YEAR)? SUPPLEMENT? EXTRACT? ADDENDUM? EDITION? LANGUAGE?) at line 1 char 1.
   |  `- Expected at least 1 of \\\\d at line 1 char 1.
   |     `- Failed to match \\\\d at line 1 char 1.
   `- Failed to match sequence (DIR_DOCUMENT_BODY (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY))?) at line 1 char 1.
      `- Extra input after last repetition at line 1 char 1.
         `- Failed to match sequence (' + ' dir_joint_document:(ORIGINATOR SPACE DIR_DOCUMENT_BODY)) at line 1 char 1.
            `- Expected " + ", but got "Z39" at line 1 char 1.
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/pubid-core-1.12.5/lib/pubid/core/identifier/base.rb:161:in `parse'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-iso-2.7.6/lib/metanorma/iso/front_id.rb:60:in `orig_id_parse'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-iso-2.7.6/lib/metanorma/iso/front_id.rb:47:in `iso_id_params'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-iso-2.7.6/lib/metanorma/iso/front_id.rb:37:in `iso_id'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-iso-2.7.6/lib/metanorma/iso/front_id.rb:13:in `metadata_id'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-standoc-2.8.7/lib/metanorma/standoc/front.rb:168:in `metadata'
        from C:/tools/ruby31/lib/ruby/gems/3.1.0/gems/metanorma-standoc-2.8.7/lib/metanorma/standoc/base.rb:117:in `block in front'
...
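
The log above shows the identifier parser rejecting `Z39.102` because every alternative it tries needs digits (or a space) at line 1 char 1. As a rough illustration, the Python sketch below uses a heavily simplified stand-in for the `number:DIGITS` rule. The regular expression and the name `looks_like_iso_number` are assumptions for this example and are not the actual pubid grammar.

```python
import re

# Hypothetical, simplified stand-in for the rule "number:DIGITS PART? ... YEAR?"
# seen in the log. The real grammar accepts many more forms than this.
STD_BODY = re.compile(r"^(?P<number>\d+)(?:-(?P<part>\d+))?(?::(?P<year>\d{4}))?$")


def looks_like_iso_number(docnumber: str) -> bool:
    """Return True if the value starts with digits, as the ISO-style rule expects."""
    return STD_BODY.match(docnumber) is not None


for candidate in ("8601", "8601-1:2019", "Z39.102"):
    print(candidate, looks_like_iso_number(candidate))
```

A NISO-style number such as `Z39.102` starts with a letter, so a digits-first rule of this kind cannot match it.
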
Intelligent2013 commented 6 months ago

[.preface,type=cover-address]
== {blank}

Published by the National Information Standards Organization


- [x] wrong bibitem markup in `06-references.adoc`:

[[ref_6]]
[%bibitem]
=== Access License and Indicators, 5 January 2015. https://www.niso.org/apps/group_public/download.php/14226/rp-22-2015_ALI.pdf[https://www.niso.org/apps/group_public/download.php/14226/rp-22-2015_ALI.pdf]
docid::
id::: NISO RP-22-2015
type:: standard



- [x] missing `[bibliography]` in `06-references.adoc`

Intelligent2013 commented 6 months ago