relaton / relaton-nist

NistBib: retrieve NIST Standards for bibliographic use using the BibliographicItem model
https://www.metanorma.com
MIT License

Switch to using NEW JSON feed for NIST SPs and FIPSs #20

Closed ronaldtse closed 5 years ago

ronaldtse commented 5 years ago

Thanks to excellent work at the CSRC, there is now a new bibliographic feed made for Metanorma in a Relaton-like format. The feed is updated daily.

The feed content also differs from the CSRC search site: it is a full superset of it. For example, it provides all drafts with the correct statuses.

Metadata

https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.meta

lastModifiedDate:2019-05-28T17:38:56.4608084-04:00
size: 793461
zipSize: 67416
sha256: C4A6AC0974E24F8F9CB6E864AFD28B6542050037112021732249363AF4DE064D

This describes attributes of the ZIP file.
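A minimal sketch of how the metadata file could be read to decide whether the feed has changed. It assumes the file really is plain `key:value` lines as shown above; `parse_feed_meta` and `feed_changed?` are hypothetical helper names, not part of the gem.

```ruby
# Parse the "key:value" lines of the .meta file into a Hash of strings.
# Split on the first colon only, because the timestamp itself contains colons.
def parse_feed_meta(text)
  text.each_line.with_object({}) do |line, meta|
    key, value = line.strip.split(":", 2)
    next if key.nil? || key.empty? || value.nil?

    meta[key.strip] = value.strip
  end
end

# Hypothetical check: the feed counts as changed when no previous metadata
# is known or when the published sha256 differs from the one seen last time.
def feed_changed?(previous_meta, current_meta)
  previous_meta.nil? || previous_meta["sha256"] != current_meta["sha256"]
end

sample_meta = <<~META
  lastModifiedDate:2019-05-28T17:38:56.4608084-04:00
  size: 793461
  zipSize: 67416
  sha256: C4A6AC0974E24F8F9CB6E864AFD28B6542050037112021732249363AF4DE064D
META

meta = parse_feed_meta(sample_meta)
```

Comparing the two `sha256` values avoids having to know which file the checksum was computed over.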

JSON data

https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip

This ZIP holds the JSON file with bibliographic data for all NIST SPs and FIPSs. At only 793K, it is far better than scraping the search site.

Here's a sample entry.

  {
    "language": "en",
    "script": "Latn",
    "series": "nist-sp",
    "docnumber": "800-116",
    "docidentifier": "SP 800-116",
    "revision": null,
    "edition": null,
    "volume": null,
    "uri": "https://csrc.nist.gov/publications/detail/sp/800-116/archive/2008-11-20",
    "doi": "10.6028/NIST.SP.800-116",
    "title-main": "A Recommendation for the Use of PIV Credentials in Physical Access Control Systems (PACS)",
    "title-sub": null,
    "iteration": null,
    "issued-date": "2008-11",
    "updated-date": null,
    "published-date": "2008-11-20",
    "obsoleted-date": "2018-06-29",
    "status": "final",
    "substage": "withdrawn",
    "authors": [
      {
        "title": null,
        "givenName": "William",
        "middlename": "I.",
        "surname": "MacGregor",
        "suffix": null,
        "nickname": "Bill",
        "affiliation": {
          "name": "National Institute of Standards and Technology",
          "acronym": "NIST"
        },
        "fullname": "William I. MacGregor"
      },
      {
        "title": null,
        "givenName": "Ketan",
        "middlename": "L.",
        "surname": "Mehta",
        "suffix": null,
        "nickname": null,
        "affiliation": {
          "name": "Mehta",
          "acronym": null
        },
        "fullname": "Ketan L. Mehta"
      },
      {
        "title": "Dr.",
        "givenName": "David",
        "middlename": "A.",
        "surname": "Cooper",
        "suffix": null,
        "nickname": null,
        "affiliation": {
          "name": "National Institute of Standards and Technology",
          "acronym": "NIST"
        },
        "fullname": "Dr. David A. Cooper"
      },
      {
        "title": null,
        "givenName": "Karen",
        "middlename": "A.",
        "surname": "Scarfone",
        "suffix": null,
        "nickname": null,
        "affiliation": {
          "name": "National Institute of Standards and Technology",
          "acronym": "NIST"
        },
        "fullname": "Karen A. Scarfone"
      }
    ],
    "editors": [

    ],
    "supersedes": [

    ],
    "superseded-by": [
      {
        "uri": "https://csrc.nist.gov/publications/detail/sp/800-116/rev-1/final",
        "docidentifier": "SP 800-116 Rev. 1"
      }
    ],
    "keywords": [
      "HSPD-12",
      "PIV",
      "PACS",
      "FIPS 201",
      "PIV authentication mechanisms",
      "Smart Card"
    ],
    "comment-from": null,
    "comment-to": null
  }
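As a rough illustration of consuming such entries, here is a sketch that looks one up by `docidentifier` and pulls out a few fields. It assumes the feed's top level is an array of entries shaped like the sample (the thread does not confirm this); `find_entry` and `summarize` are hypothetical names, and the entry below is a trimmed copy of the sample.

```ruby
require "json"

# Return the first entry whose "docidentifier" matches exactly, or nil.
def find_entry(entries, docidentifier)
  entries.find { |entry| entry["docidentifier"] == docidentifier }
end

# Collect the fields a bibliographic lookup is likely to need.
def summarize(entry)
  {
    id: entry["docidentifier"],
    title: [entry["title-main"], entry["title-sub"]].compact.join(": "),
    withdrawn: entry["substage"] == "withdrawn",
    superseded_by: Array(entry["superseded-by"]).map { |s| s["docidentifier"] },
    authors: Array(entry["authors"]).map { |a| a["fullname"] }
  }
end

# Assumed top-level shape: an array of entries.
feed = JSON.parse(<<~FEED)
  [
    {
      "docidentifier": "SP 800-116",
      "title-main": "A Recommendation for the Use of PIV Credentials in Physical Access Control Systems (PACS)",
      "title-sub": null,
      "status": "final",
      "substage": "withdrawn",
      "authors": [
        { "fullname": "William I. MacGregor" },
        { "fullname": "Ketan L. Mehta" }
      ],
      "superseded-by": [
        { "uri": "https://csrc.nist.gov/publications/detail/sp/800-116/rev-1/final",
          "docidentifier": "SP 800-116 Rev. 1" }
      ]
    }
  ]
FEED

summary = summarize(find_entry(feed, "SP 800-116"))
```

If several entries can share a `docidentifier` (for example, drafts and archived versions), a real lookup would need an extra criterion such as `status` or `published-date`.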

Actions

Thanks!

opoudjis commented 5 years ago

Please wait until I review the NIST JSON in #23 before proceeding with this.

opoudjis commented 5 years ago

Review in #23 done. There is some non-critical information missing, but I think mapping this to Relaton XML is going to be straightforward. This can proceed.

andrew2net commented 5 years ago

@ronaldtse should we include the JSON file in this gem? If so, would the JSON file be updated from time to time?

UPD: Oh, I see it's updated daily. How are we supposed to update the file locally? We could check whether the file's creation date is earlier than today and, if so, download a new file. In that case, we shouldn't save the file in the repository. Is this OK?

ronaldtse commented 5 years ago

We should:

  1. Cache the full JSON file globally (
  2. Save the “document info” we want in local/global Relaton cache, just like normal Relaton items

andrew2net commented 5 years ago

Cache the full JSON file globally

What do you mean?

ronaldtse commented 5 years ago

Cache the full JSON: for example, store it in ~/.relaton/nist/csrc.json so that subsequent fetches on the same day don't need to re-download it.
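The same-day cache described here could look roughly like the sketch below. It is not the gem's implementation: `cache_stale?` and `cached_feed` are hypothetical names, the download and unzip step is left to the caller's block, and the file's modification time is used in place of the creation date mentioned earlier because it is portable across platforms.

```ruby
require "date"
require "fileutils"

# True when there is no cached file or it was last written before today.
def cache_stale?(path, today: Date.today)
  !File.exist?(path) || File.mtime(path).to_date < today
end

# Return the cached JSON text, refreshing it first when it is stale.
# The block must return fresh JSON text (for example by downloading and
# unzipping pubs-export.zip); it is only called when a refresh is needed.
def cached_feed(path = File.expand_path("~/.relaton/nist/csrc.json"), today: Date.today)
  if cache_stale?(path, today: today)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, yield)
  end
  File.read(path, encoding: "UTF-8")
end
```

A caller would pass the fetch logic as a block, e.g. `cached_feed { download_and_unzip_feed }`, where `download_and_unzip_feed` stands in for whatever fetch code is eventually written.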

andrew2net commented 5 years ago

@ronaldtse in the JSON file we don't have:

Is it OK if we don't have this information in Relaton XML?

ronaldtse commented 5 years ago

Yes, with the URI and DOI we don’t need the PDF link for now.

The abstract and history may be added to this JSON later. Thanks!

andrew2net commented 5 years ago

@ronaldtse how should we map the parts of a name from the JSON to the bib model?

title => prefix
givenName => forename
middlename => ?
surname => surname
suffix => addition
nickname => ?

ronaldtse commented 5 years ago

@andrew2net currently we don't have a superset Relaton model. We should somehow preserve this data for when Relaton "upgrades" the way it handles contributor information.
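One possible way to apply the mapping proposed above while preserving the parts that have no target yet is sketched below. The output keys (`:prefix`, `:forename`, `:surname`, `:addition`) follow the thread; the `:unmapped` bucket and the `map_author_name` helper are assumptions made for illustration, not an agreed design or the bib model's API.

```ruby
# Mapping proposed in the thread: JSON key => bib model name part.
NAME_PART_MAP = {
  "title" => :prefix,
  "givenName" => :forename,
  "surname" => :surname,
  "suffix" => :addition
}.freeze

# Parts marked "?" in the thread: no agreed target in the bib model yet.
UNMAPPED_PARTS = %w[middlename nickname].freeze

# Map one author hash from the JSON feed; null values are skipped.
# Unmapped parts are kept under :unmapped (hypothetical) so nothing is lost.
def map_author_name(author)
  mapped = NAME_PART_MAP.each_with_object({}) do |(json_key, part), out|
    value = author[json_key]
    out[part] = value unless value.nil?
  end
  extras = UNMAPPED_PARTS.each_with_object({}) do |key, out|
    out[key] = author[key] unless author[key].nil?
  end
  mapped[:unmapped] = extras unless extras.empty?
  mapped
end

macgregor = map_author_name(
  "title" => nil, "givenName" => "William", "middlename" => "I.",
  "surname" => "MacGregor", "suffix" => nil, "nickname" => "Bill"
)
```

Keeping the leftover parts under their original JSON keys means they can be re-mapped later without re-reading the feed.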