relaton / relaton-data-nist

Data source issue [NIST Library]: "NBS Report 10581" has an invalid document identifier #2

Open strogonoff opened 2 years ago

strogonoff commented 2 years ago

Example: https://demo.bibxml.org/api/v1/ref/nist/NBS_REPORT_;_10581/ (docid.type is NBS report ; 10581)

ronaldtse commented 2 years ago

The correct name is "NBS Report 10581". Is this a problem with the data source? If so, we have to report it back to NIST.

strogonoff commented 2 years ago

From my vantage point, it looks like a potential problem with the relaton-data-nist source, hence filing it here. I can’t say more than that, since the further logic is the responsibility of the respective GHAs, not the BibXML service.

ronaldtse commented 2 years ago

This problem originates from the source. We will handle this in metanorma/nist-pubid#20 (which will have a Relaton component).

Extracted portion:

                  <item_number item_number_type="report-number">NBS report ; 10581</item_number>

Full XML:

   <query key="RPT">
      <doi type="report-paper_title">10.6028/NBS.RPT.10581</doi>
      <crm-item name="publisher-name" type="string">National Institute of Standards and Technology (NIST)</crm-item>
      <crm-item name="prefix-name" type="string">National Institute of Standards and Technology</crm-item>
      <crm-item name="member-id" type="number">4068</crm-item>
      <crm-item name="citation-id" type="number">85911338</crm-item>
      <crm-item name="book-id" type="number">2207561</crm-item>
      <crm-item name="deposit-timestamp" type="number">201610201500</crm-item>
      <crm-item name="owner-prefix" type="string">10.6028</crm-item>
      <crm-item name="last-update" type="date">2018-03-06T10:04:59Z</crm-item>
      <crm-item name="created" type="date">2016-10-20T19:40:51Z</crm-item>
      <crm-item name="citedby-count" type="number">0</crm-item>
      <doi_record>
         <report-paper>
            <report-paper_metadata language="en">
               <contributors>
                  <person_name sequence="first" contributor_role="author">
                     <given_name>N W</given_name>
                     <surname>Rupp</surname>
                  </person_name>
               </contributors>
               <titles>
                  <title>Intermediary base and cementation :</title>
                  <subtitle>progress report</subtitle>
               </titles>
               <edition_number>0</edition_number>
               <publication_date media_type="online">
                  <year>1971</year>
               </publication_date>
               <publisher>
                  <publisher_name>National Bureau of Standards</publisher_name>
                  <publisher_place>Gaithersburg, MD</publisher_place>
               </publisher>
               <institution>
                  <institution_name>National Bureau of Standards</institution_name>
                  <institution_acronym>NBS</institution_acronym>
                  <institution_place>Gaithersburg, MD</institution_place>
               </institution>
               <publisher_item>
                  <item_number item_number_type="report-number">NBS report ; 10581</item_number>
               </publisher_item>
               <doi_data>
                  <doi>10.6028/NBS.RPT.10581</doi>
                  <resource>https://nvlpubs.nist.gov/nistpubs/Legacy/RPT/nbsreport10581.pdf</resource>
               </doi_data>
            </report-paper_metadata>
         </report-paper>
      </doi_record>
   </query>
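
To make the mismatch concrete, here is a minimal sketch of how the report number in the record above could be read and cleaned up into the expected "NBS Report 10581" form. It is only an illustration under stated assumptions: `normalize_report_number` is a hypothetical helper, not part of Relaton or nist-pubid, and the rule it applies (drop the " ; " separator, capitalize "Report") is inferred solely from this one record.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the <publisher_item> element from the record quoted above.
RECORD = """
<publisher_item>
  <item_number item_number_type="report-number">NBS report ; 10581</item_number>
</publisher_item>
"""


def normalize_report_number(raw: str) -> str:
    """Hypothetical cleanup: turn 'NBS report ; 10581' into 'NBS Report 10581'."""
    # Match the series name, the stray ' ; ' separator, and the report number.
    match = re.fullmatch(r"\s*NBS\s+report\s*;\s*(\S+)\s*", raw, flags=re.IGNORECASE)
    if match is None:
        # Anything that does not fit the assumed pattern is returned unchanged.
        return raw.strip()
    return f"NBS Report {match.group(1)}"


root = ET.fromstring(RECORD)
# Select the item number that is explicitly typed as a report number.
raw_id = root.findtext("item_number[@item_number_type='report-number']")
print(normalize_report_number(raw_id))  # prints "NBS Report 10581"
```

Identifiers that do not match the assumed "NBS report ; N" shape pass through untouched, so a rule like this would not silently rewrite other series.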

ronaldtse commented 2 years ago

Reported.

ronaldtse commented 2 years ago

@strogonoff if you have other data issues to report, please keep them in separate issues. Thanks.

strogonoff commented 2 years ago

@ronaldtse I’m reporting issues that I happen to stumble across. However, if this is the only way we are going to do data cleanup, it will catch only a small minority of issues. As the BibXML service exposes more information in its GUI and API, service consumers will notice more of them. Should we encourage them to report those and provide relevant links to GitHub?

ronaldtse commented 2 years ago

> Should we encourage them to report those and provide relevant links to GitHub?

Absolutely! Good idea, we can make a link for people to report data quality issues.

Keeping separate tickets is a good way for us to track which data issues we need to fix. Since we have to upstream these requests, one ticket per issue also makes them easier to manage.

strogonoff commented 2 years ago

https://github.com/ietf-ribose/bibxml-service/issues/58