relaton / relaton-nist

NistBib: retrieve NIST Standards for bibliographic use using the BibliographicItem model
https://www.metanorma.com
MIT License

NIST data in IETF BibXML format missing certain fields #62

Closed ronaldtse closed 2 years ago

ronaldtse commented 2 years ago
  1. The author field in https://github.com/ietf-ribose/bibxml-data-nist/blob/main/data/NIST_CSWP_09102018.xml is missing "initials", "fullname"
       <author initials="K." surname="Phunny" fullname="Knot Phunny"/>
  2. BibXML supports the <city>, <region>, and <country> elements. In NIST data, the place "Gaithersburg, MD" can be understood as:
    • Gaithersburg => city
    • MD => region
    • United States of America => country
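
The place mapping described above can be sketched as follows. This is only an illustration, not relaton-nist code: `place_to_postal` and the `US_REGIONS` lookup are hypothetical names, and the sketch assumes the place always has the form "City, REGION". The `<postal>` wrapper is the BibXML (xml2rfc) element that holds `<city>`, `<region>`, and `<country>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup: region codes we choose to treat as US states.
US_REGIONS = {"MD", "CO"}


def place_to_postal(place):
    """Split 'City, REGION' into <city>/<region>/<country> under <postal>."""
    postal = ET.Element("postal")
    parts = [p.strip() for p in place.split(",")]
    ET.SubElement(postal, "city").text = parts[0]
    if len(parts) > 1 and parts[1]:
        ET.SubElement(postal, "region").text = parts[1]
        # Assumption: a known US state code implies the country.
        if parts[1] in US_REGIONS:
            ET.SubElement(postal, "country").text = "United States of America"
    return postal


postal = place_to_postal("Gaithersburg, MD")
print(ET.tostring(postal, encoding="unicode"))
```

A place without a region (for example just a city name) yields only a `<city>` element in this sketch, so no country is guessed.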
ronaldtse commented 2 years ago
  1. Also, the <anchor> element now uses an internal document identifier, but we need it to be the BibXML style anchor because it will be directly consumed by xml2rfc, as described here: https://github.com/ietf-ribose/bibxml-service/issues/7

The pattern should follow this dataset: http://xml2rfc.tools.ietf.org/public/rfc/bibxml-nist/

  1. Missing <seriesInfo> lines from http://xml2rfc.tools.ietf.org/public/rfc/bibxml-nist-new/reference.NIST.CSWP.09102018.xml

These are the full details of that file:

<reference anchor="NIST.CSWP.09102018" target="https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.09102018.pdf">
<front>
<title>
Transitioning to the Security Content Automation Protocol (SCAP) Version 2
</title>
<author initials="David" surname="Waltermire" fullname="David Waltermire">
<organization>Information Technology Laboratory</organization>
</author>
<author initials="Jessica" surname="Fitzgerald-McKay" fullname="Jessica Fitzgerald-McKay">
<organization>Information Technology Laboratory</organization>
</author>
<date year="2018" month="September"/>
</front>
<seriesInfo name="NIST" value="NIST CSWP 09102018"/>
<seriesInfo name="DOI" value="10.6028/NIST.CSWP.09102018"/>
</reference>
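
A quick way to spot the gaps listed in this issue is to compare a generated file against the legacy reference above. The following is a minimal sketch under stated assumptions: `missing_fields` is a hypothetical helper, and the set of "required" attributes and `<seriesInfo>` names is taken only from the legacy example, not from any schema. It treats every `<author>` as a person, so an organization-only author would also be reported.

```python
import xml.etree.ElementTree as ET


def missing_fields(reference_xml):
    """List fields present in the legacy example but absent from this <reference>."""
    ref = ET.fromstring(reference_xml)
    problems = []
    if not ref.get("anchor"):
        problems.append("reference/@anchor")
    # Person authors in the legacy file carry all three name attributes.
    for i, author in enumerate(ref.findall("./front/author"), start=1):
        for attr in ("initials", "surname", "fullname"):
            if not author.get(attr):
                problems.append(f"author[{i}]/@{attr}")
    # The legacy file has one <seriesInfo> for the series and one for the DOI.
    names = {s.get("name") for s in ref.findall("seriesInfo")}
    for required in ("NIST", "DOI"):
        if required not in names:
            problems.append(f"seriesInfo[@name='{required}']")
    return problems


# Hypothetical, deliberately incomplete input for illustration.
SAMPLE = """<reference anchor="NIST.CSWP.09102018">
  <front>
    <title>Transitioning to the Security Content Automation Protocol (SCAP) Version 2</title>
    <author surname="Waltermire"/>
  </front>
</reference>"""

for problem in missing_fields(SAMPLE):
    print("missing:", problem)
```
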
andrew2net commented 2 years ago

> 5. Also, the <anchor> element now uses an internal document identifier, but we need it to be the BibXML style anchor because it will be directly consumed by xml2rfc, as described here: ietf-ribose/bibxml-service#7 (Update BibXML service API and OpenAPI definition to support legacy paths)
>
> The pattern should follow this dataset: http://xml2rfc.tools.ietf.org/public/rfc/bibxml-nist/

@ronaldtse the <anchor> in the document is exactly as in the original publication. The <anchor> is produced from publisher_item/item_number or publisher_item/identifier. Maybe for some documents the anchor isn't produced correctly. Have you seen such cases? I noticed that the DOI contains the <anchor>. For example, we can easily extract the anchor from the DOI 10.6028/NIST.CSWP.09102018. What do you think?
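
Extracting the anchor from the DOI, as suggested above, could look like the sketch below. `anchor_from_doi` is a hypothetical helper, not part of relaton-nist; it assumes the anchor is simply everything after the first "/" of the DOI, which holds for the example in this thread.

```python
def anchor_from_doi(doi):
    """Return the DOI suffix (the part after the first '/') as the anchor."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not suffix:
        raise ValueError(f"not a DOI with a suffix: {doi!r}")
    return suffix


print(anchor_from_doi("10.6028/NIST.CSWP.09102018"))
```

For the DOI quoted in this issue, the result matches the anchor used in the legacy bibxml-nist file (NIST.CSWP.09102018).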

andrew2net commented 2 years ago

> This is the full details of that file:
>
> <reference anchor="NIST.CSWP.09102018" target="https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.09102018.pdf">
> <front>
> <title>
> Transitioning to the Security Content Automation Protocol (SCAP) Version 2
> </title>
> <author initials="David" surname="Waltermire" fullname="David Waltermire">
> <organization>Information Technology Laboratory</organization>
> </author>
> <author initials="Jessica" surname="Fitzgerald-McKay" fullname="Jessica Fitzgerald-McKay">
> <organization>Information Technology Laboratory</organization>
> </author>
> <date year="2018" month="September"/>
> </front>
> <seriesInfo name="NIST" value="NIST CSWP 09102018"/>
> <seriesInfo name="DOI" value="10.6028/NIST.CSWP.09102018"/>
> </reference>

@ronaldtse should we reproduce exactly the same content? There are no <city>, <region>, or <country> elements in the example, so if we implement the first point of this issue we won't get the same content.

ronaldtse commented 2 years ago

> Maybe for some documents the anchor isn't produced correctly. Have you seen such cases?

In the BibXML service, there will be legacy retrievals that require us to return new data with a legacy anchor. The anchors will therefore be filled in dynamically by the BibXML service.

> I noticed the DOI contains <anchor>. For example, we can easily extract the anchor from the DOI 10.6028/NIST.CSWP.09102018. What do you think?

Why do we need the anchor from the DOI?

> There are no <city>, <region>, or <country> elements in the example, so if we implement the first point of this issue we won't get the same content.

We need to implement this change. We don't want the same content; we want the full and correct content.

andrew2net commented 2 years ago

> Why do we need the anchor from the DOI?

@ronaldtse the data source doesn't contain <anchor> values. We need to create them from the available information.

ronaldtse commented 2 years ago

@andrew2net ah, right. Yes, the DOI is acceptable. In the future, we should use nist-pubid to create both the anchor and the filename (the PubID in machine-readable form) (cc @mico).