Open ronaldtse opened 2 years ago
@andrew2net @ronaldtse What is the format for input data for RelatonNist::NistBibliography.search or RelatonNist::NistBibliography.get? It's not the new NIST PubID at the moment; should it be?
In the specs I found input could be:
NISTIR 8200
SP 500-304
SP 800-189(PD)
SP 800-67r1
NIST SP 800-67 Rev. 1
NIST SP 800-57pt1r4
NIST IR 8011v4
In README.adoc I found you can use only code:
RelatonNist::NistBibliography.get("8200", "2018", {})
[relaton-nist] ("8200") fetching...
[relaton-nist] ("8200") found NISTIR 8200
=> #<RelatonNist::NistBibliographicItem:0x007fc06aa2b480
Should we keep it all the same way? (Then I need to write a parser from this format to NIST PubID.)
Also, I'm struggling to find a specification for the docidentifier format in the JSON data from
https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip (relaton-nist uses it to return bibliographic data).
Is it the old NIST PubID specification? Where can I get it?
For example, a docidentifier there could look like SP 800-189 (Draft). As @ronaldtse mentioned before (https://github.com/metanorma/nist-pubid/issues/15#issuecomment-986607980), Draft in the old NIST PubID version is a Final Public Draft in the new version.
Depending on what we decide to use as input (new NIST PubID, old, or something else), there could be several ways to convert the input to a docidentifier to search through the JSON data.
How it works now:
SP 800-189(PD) -> SP 800-189 (Draft)
With new NIST PubID as input:
NIST SP 800-189(FPD) -> SP 800-189 (Draft)
With updated data:
SP 800-189(PD) -> SP 800-189 (Final Public Draft)
With new NIST PubID as input and updated data:
SP 800-189(FPD) -> SP 800-189 (Final Public Draft)
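For illustration, the first conversion above ("how it works now") could be sketched in Ruby roughly as follows. This is only a sketch under assumptions: STAGE_LABELS and to_docidentifier are hypothetical names, not part of relaton-nist or nist-pubid, and the table holds only the PD -> Draft pair listed above.

```ruby
# Hypothetical lookup table: stage code in the request => label used in the
# pubs-export docidentifier. Only the pair mentioned in this thread is listed.
STAGE_LABELS = { "PD" => "Draft" }.freeze

# Hypothetical helper: rewrites a trailing "(CODE)" into " (Label)".
# Returns the reference unchanged if there is no stage or the code is unknown.
def to_docidentifier(ref)
  match = ref.match(/\A(.+?)\s*\((\w+)\)\z/)
  return ref unless match

  label = STAGE_LABELS[match[2]]
  label ? "#{match[1]} (#{label})" : ref
end

puts to_docidentifier("SP 800-189(PD)") # prints SP 800-189 (Draft)
```

Supporting the other variants listed above would only mean changing the table (for example the label on the right-hand side once the data is updated), which is why the decision about input and data format matters more than the conversion code itself.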
BTW, @ronaldtse what is "Retired Draft" in the new NIST PubID?
We use the same code to find "Retired Draft" as for "Draft":
SP 800-189(PD) -> SP 800-189 (Draft)
SP 800-80(PD) -> SP 800-80 (Retired Draft)
Why?
@andrew2net @ronaldtse What is the format for input data for RelatonNist::NistBibliography.search or RelatonNist::NistBibliography.get? It's not new NIST PubID at the moment, should it be?
It should support the old format, and also support PubID.
In the specs I found input could be:
NISTIR 8200
SP 500-304
SP 800-189(PD)
SP 800-67r1
NIST SP 800-67 Rev. 1
NIST SP 800-57pt1r4
NIST IR 8011v4
In README.adoc I found you can use only code:
RelatonNist::NistBibliography.get("8200", "2018", {})
[relaton-nist] ("8200") fetching...
[relaton-nist] ("8200") found NISTIR 8200
=> #<RelatonNist::NistBibliographicItem:0x007fc06aa2b480
The README only provides one sample, it's not representative of the patterns supported.
Should we keep it all the same way? (Then I need to write parser from this format to NIST PubID)
Yes.
Also, I'm struggling with specifications for docidentifier format for JSON data from https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip (relaton-nist using it to return bibliographic data). Is it old NIST PubID specification? Where I could get it? For example docidentifier there could be like SP 800-189 (Draft), as @ronaldtse mentioned before (metanorma/nist-pubid#15 (comment)) Draft in old NIST PubID version is a Final Public Draft in the new version.
pubs-export.zip uses pre-PubID identifiers. The NIST PubID is not yet active at NIST.
We will still need to support all documents in the pubs-export.zip
Depending on our decision, what we will use as input (new NIST PubID or old or something else) there could be several ways how to convert input to docidentifier to search through JSON data.
How it works now:
SP 800-189(PD) -> SP 800-189 (Draft)
With new NIST PubID as input:
NIST SP 800-189(FPD) -> SP 800-189 (Draft)
I'm not sure how this works. How can "FPD" => "Draft"? They are different things.
With updated data:
SP 800-189(PD) -> SP 800-189 (Final Public Draft)
Why is "PD" => "FPD"?
With new NIST PubID as input and updated data:
SP 800-189(FPD) -> SP 800-189 (Final Public Draft)
"FPD" => "Final Public Draft" is only for the longer form of PubID output, right?
In any case, we need to take any (PubID + legacy identifier) input, and produce only PubID output.
BTW, @ronaldtse what is "Retired Draft" in the new NIST PubID? We use the same code to find "Retired Draft" as for "Draft":
SP 800-189(PD) -> SP 800-189 (Draft)
SP 800-80(PD) -> SP 800-80 (Retired Draft)
Why?
I don't fully understand the question. What do you mean by "find"?
relaton-nist uses 2 datasets: NIST Tech Pubs and NIST CSRC pubs-export.
It should support the old format, and also support PubID.
Is old format specifications available anywhere?
The README only provides one sample, it's not representative of the patterns supported.
Should we support search by partial data? (just "8200" instead of "NISTIR 8200")
I'm not sure how this works. How can "FPD" => "Draft"? They are different things.
You mentioned here https://github.com/metanorma/nist-pubid/issues/15#issuecomment-986607980 "PD" is something like "FPD". When I have "(PD)" in the original request, I should look for "(Draft)" in NIST CSRC pubs-export's docidentifier.
I don't fully understand the question. What do you mean by "find"?
RelatonNist::NistBibliography.get("SP 800-189(PD)", nil, {}) returns document with docidentifier SP 800-189 (Draft)
RelatonNist::NistBibliography.get("SP 800-80(FPD)", nil, {}) returns document with docidentifier SP 800-80 (Retired Draft)
Neither the NIST Tech Pubs dataset nor the NIST CSRC pubs-export dataset contains any identifiers with draft stages like IPD/FPD/2PD, only "Draft", which is "(PD)".
Some data is also missing; for example, NIST SP(2PD) 1800-13B is not there
(https://www.nccoe.nist.gov/sites/default/files/legacy-files/psfr-mobile-sso-nist-sp1800-13b-draft-v2.pdf).
It seems we don't have any documents using the new NIST PubID there.
So I need a separate parser for this, or I need to include a legacy parser and converter (PD -> Draft) in nist-pubid.
I doubt this is the right moment to use the NIST PubID parser in relaton-nist while the datasets contain no publications with NIST PubIDs.
@ronaldtse Any thoughts on that?
Is old format specifications available anywhere?
There is no particular specification but just a convention. Check the Relaton-NIST code, the pubs-export.zip file and the NIST Tech Pubs XML file for the patterns used.
Should we support search by partial data? (just "8200" instead of "NISTIR 8200")
Probably not. We should support variants though, e.g. "NISTIR 8200" and "NIST IR 8200".
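A minimal sketch of what supporting such variants could look like. normalize_series is a hypothetical helper, not an existing relaton-nist method, and picking "NISTIR" as the canonical spelling is an assumption based on the legacy identifiers quoted in this thread.

```ruby
# Hypothetical normalizer: maps the "NIST IR" and "NISTIR" spellings to a
# single form and collapses repeated whitespace, so both variants produce
# the same lookup key. Other series (e.g. "NIST SP") are left as they are.
def normalize_series(ref)
  ref.strip.gsub(/\s+/, " ").sub(/\ANIST ?IR\b/, "NISTIR")
end

puts normalize_series("NIST IR 8200") # prints NISTIR 8200
puts normalize_series("NISTIR 8200")  # prints NISTIR 8200
```

Normalizing before the lookup keeps the search itself an exact string match, so partial inputs such as "8200" simply do not match anything.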
You mentioned here metanorma/nist-pubid#15 (comment) "PD" is something like "FPD". When I have "(PD)" in the original request, I should look for "(Draft)" in NIST CSRC pubs-export's docidentifier.
This is a major confusion that I need to clarify:
The nist-pubid gem is not only to parse NIST PubIDs. It is to parse ANY NIST document identifier and translate the old document identifier into NIST PubID.
The relaton-nist gem is to use the nist-pubid gem to handle both cases:
Does this explain all the questions above?
"PD" is something like "FPD". When I have "(PD)" in the original request, I should look for "(Draft)" in NIST CSRC pubs-export's docidentifier
What I meant by "PD" (Public Draft) being "like" "FPD" (Final Public Draft) is this:
When I have "(PD)" in the original request, I should look for "(Draft)" in NIST CSRC pubs-export's docidentifier
Yes.
Notice that right now, both of these public data sources
RelatonNist::NistBibliography.get("SP 800-189(PD)", nil, {}) returns document with docidentifier SP 800-189 (Draft)
RelatonNist::NistBibliography.get("SP 800-80(FPD)", nil, {}) returns document with docidentifier SP 800-80 (Retired Draft)
In the case of "SP 800-80(FPD)":
Again, right now, all the "Drafts" in the data sources are "PD"s (Public Drafts). There are NO FPDs, IPDs, 2PDs, etc.
NOTE: RelatonNist::NistBibliography.get("SP 800-189(PD)", nil, {}) would certainly be easier if it was just RelatonNist::NistBibliography.get("SP 800-189(PD)").
@mico prefixes like ISO, NIST, etc. are used in the relaton gem to route requests to the appropriate gem (relaton-iso, relaton-nist, etc.). relaton-nist ignores the NIST prefix in references.
@ronaldtse currently we download the CSRC file to the local computer and search through it. If we start using pubid-nist as an ID parser, it will slow down the search because parslet is quite slow. Maybe we need to transform the CSRC data into a data repository with an index, similar to the other relaton-data-* repos, don't we?
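To make the index idea concrete, here is a rough sketch. It assumes the export parses to an array of objects that each carry a "docidentifier" string; that shape, the build_index name, and the sample record are assumptions for illustration, not the actual layout of pubs-export.zip or of any relaton-data-* repo.

```ruby
require "json"

# Hypothetical index builder: maps each docidentifier to its record once,
# so later lookups are hash accesses instead of a scan that parses every ID.
def build_index(records)
  records.each_with_object({}) do |record, index|
    index[record["docidentifier"]] = record
  end
end

# Placeholder record for illustration only, not real export data.
records = JSON.parse('[{"docidentifier": "SP 800-189 (Draft)", "title": "placeholder"}]')
index = build_index(records)
hit = index["SP 800-189 (Draft)"] # the record, or nil when the ID is absent
```

With an index built ahead of time, the slow parser would only need to run once on the requested reference rather than on every identifier in the dataset.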
@andrew2net I agree that we should have a relaton-data-nist repo. Let's create it based on both CSRC and NIST-Tech-Pubs content. Thanks!
@mico @andrew2net is this task ready? Thanks!
@ronaldtse the issue blocks this.
https://github.com/metanorma/nist-pubid