metanorma / pubid-nist


NIST Tech Pubs data source issue: "NIST Handbook 105-1 (Rev. 1990)" contains data inconsistencies #128

Open ronaldtse opened 2 years ago

ronaldtse commented 2 years ago

In the current dataset, the "NBS HB 105-1 1990" document is mislabeled in two ways (source document: https://nvlpubs.nist.gov/nistpubs/Legacy/hb/nisthandbook105-1-1990.pdf):

  1. On the withdrawn cover page, it says "NIST Handbook 105-1, 1990 Edition". In the content, however, the Abbrev PubID says "Rev. 1990".

Withdrawn cover page: [image]

Inside document: [image: Screenshot 2022-01-27 at 11 17 57 AM]

  2. The DOI is now "NBS.HB.105-1r1990", but the document was published by NIST (not NBS).

Proposed resolution:

Originally posted by @ronaldtse in https://github.com/metanorma/nist-pubid/issues/126#issuecomment-1022809459

mico commented 2 years ago

In "NIST Tech Pubs" this document appears twice, with the same report number and resource URL but different DOIs:

 <publisher_item>
    <item_number item_number_type="report-number">NIST HB 105-1-1990</item_number>
 </publisher_item>
 <doi_data>
    <doi>10.6028/NIST.HB.105-1-1990</doi>
    <resource>https://nvlpubs.nist.gov/nistpubs/Legacy/hb/nisthandbook105-1-1990.pdf</resource>
 </doi_data>

and

 <publisher_item>
    <item_number item_number_type="report-number">NIST HB 105-1-1990</item_number>
 </publisher_item>
 <doi_data>
    <doi>10.6028/NBS.HB.105-1r1990</doi>
    <resource>https://nvlpubs.nist.gov/nistpubs/Legacy/hb/nisthandbook105-1-1990.pdf</resource>
 </doi_data>

ronaldtse commented 2 years ago

@mico when the resource URL is identical, we should ignore the duplicate.
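
A minimal sketch of that rule, in Python for illustration only (it is not taken from the pubid-nist code base). It assumes each entry is wrapped in its own `<record>` element inside a `<records>` root; both wrapper names are hypothetical, since the snippets above only show the `publisher_item` and `doi_data` children. The function `dedupe_by_resource` is likewise a made-up name, and keeping the first entry seen for a URL is an assumption, because the thread does not say which of the two DOIs should survive.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper elements (<records>/<record>); the real NIST Tech Pubs
# export may nest publisher_item and doi_data differently.
SAMPLE = """
<records>
  <record>
    <publisher_item>
      <item_number item_number_type="report-number">NIST HB 105-1-1990</item_number>
    </publisher_item>
    <doi_data>
      <doi>10.6028/NIST.HB.105-1-1990</doi>
      <resource>https://nvlpubs.nist.gov/nistpubs/Legacy/hb/nisthandbook105-1-1990.pdf</resource>
    </doi_data>
  </record>
  <record>
    <publisher_item>
      <item_number item_number_type="report-number">NIST HB 105-1-1990</item_number>
    </publisher_item>
    <doi_data>
      <doi>10.6028/NBS.HB.105-1r1990</doi>
      <resource>https://nvlpubs.nist.gov/nistpubs/Legacy/hb/nisthandbook105-1-1990.pdf</resource>
    </doi_data>
  </record>
</records>
"""


def dedupe_by_resource(xml_text):
    """Return one (report_number, doi, resource) tuple per distinct resource URL."""
    seen = set()
    kept = []
    root = ET.fromstring(xml_text.strip())
    for record in root.iter("record"):
        number = record.findtext("publisher_item/item_number", default="").strip()
        doi = record.findtext("doi_data/doi", default="").strip()
        resource = record.findtext("doi_data/resource", default="").strip()
        if resource in seen:
            # Same resource URL as an earlier entry: ignore it as a duplicate.
            continue
        seen.add(resource)
        kept.append((number, doi, resource))
    return kept


entries = dedupe_by_resource(SAMPLE)
```

With this first-seen policy, which DOI is kept depends purely on the order of entries in the dataset, so a real implementation would probably need an explicit preference between the two.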