relaton / relaton-nist

NistBib: retrieve NIST Standards for bibliographic use using the BibliographicItem model
https://www.metanorma.com
MIT License

Support "short citation" in search syntax #12

Closed: ronaldtse closed this issue 5 years ago

ronaldtse commented 5 years ago

In accordance with https://github.com/metanorma/metanorma-nist/issues/95#issuecomment-487120038 , NistBib should support the following search syntax:

| Type | Short |
|---|---|
| Undated reference | "SP 800-162" or "NIST SP 800-162" |
| Final without updated-date | "SP 800-162 (January 2014)" or "NIST SP 800-162 (January 2014)" |
| Final where updated-date > original-release-date | "SP 800-162 (February 25, 2019)" or "NIST SP 800-162 (February 25, 2019)" |

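
A minimal sketch of how the short forms in the table could be split into a code and an optional date. `parse_short_ref` is a hypothetical helper, not part of NistBib, and it assumes the only parenthesised part is the date; it does not tell a date apart from a stage abbreviation.

```ruby
# Hypothetical helper, not part of NistBib: splits a short citation into the
# document code and the optional parenthesised date. A leading "NIST" is dropped.
def parse_short_ref(ref)
  m = ref.strip.match(/\A(?:NIST\s+)?(?<code>[^()]+?)(?:\s*\((?<date>[^()]+)\))?\z/)
  m && { code: m[:code], date: m[:date] }
end

parse_short_ref("SP 800-162")                      # date is nil (undated)
parse_short_ref("NIST SP 800-162 (January 2014)")  # date is "January 2014"
```

The date is kept as a string here, because the table mixes month-year and full dates.
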
ronaldtse commented 5 years ago

Please also refer to the following issues when parsing the page:

The "short form" is:

opoudjis commented 5 years ago

And if stage is not empty, it is going to be abbreviated, as discussed in https://github.com/metanorma/metanorma-nist/issues/116:

The first iteration of a public draft (<status><stage>draft-public</stage><iteration>1</iteration></status> or just <status><stage>draft-public</stage></status>) is abbreviated IPD.

The final iteration of a public draft (<status><stage>draft-public</stage><iteration>final</iteration></status>) is abbreviated FPD.

The other iterations of a public draft (e.g. <status><stage>draft-public</stage><iteration>3</iteration></status>) are abbreviated 2PD, 3PD, etc.
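
The three rules above can be sketched as a single mapping. `public_draft_abbrev` is a hypothetical name, and the argument is assumed to be the text of `<iteration>` (or nil when the element is absent).

```ruby
# Hypothetical helper: maps the <iteration> of a draft-public status to its
# abbreviation, following the three rules above.
def public_draft_abbrev(iteration = nil)
  case iteration.to_s
  when "", "1" then "IPD"        # first iteration, or no <iteration> element
  when "final" then "FPD"        # final public draft
  else "#{iteration}PD"          # 2PD, 3PD, ...
  end
end

public_draft_abbrev(3)  # => "3PD"
```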

andrew2net commented 5 years ago

@ronaldtse @opoudjis where can we scrape the iteration number from?

andrew2net commented 5 years ago

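@ronaldtse does the table https://github.com/metanorma/nistbib/issues/12#issue-437754112 mean we should search:

- for an undated reference: all types of docs (final, draft);
- for Final without updated-date: only final docs whose issued date matches the reference;
- for Final where updated-date > original-release-date: only final docs whose updated date matches the reference?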

andrew2net commented 5 years ago

@ronaldtse could a short ref have a stage but no edition? For example, SP 800-162 (2PD).

andrew2net commented 5 years ago

@ronaldtse I see drafts whose codes include Rev. 1, Rev. 2, Rev. 3, for example SP 800-52 Rev. 2 (DRAFT). Could we consider this part of the code as the status iteration? If yes, does it mean that drafts without this part of the code have status iteration final?

andrew2net commented 5 years ago

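@ronaldtse I think we can use document history to calculate an iteration:

- if the history is empty or the current document is first in the history, then the iteration is initial (1);
- if the document isn't first in the history, then its position in the history is the iteration value (2, 3);
- if the last document in the history is final and the current document comes before it in the history, then the iteration is final.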

See iterations here https://github.com/metanorma/metanorma-nist/issues/116#issuecomment-488677211

opoudjis commented 5 years ago

> @ronaldtse @opoudjis where can we scrape the iteration number from?

Dunno. Ronald?

> @ronaldtse does the table #12 (comment) mean we should search:

I'm not convinced I have heard clearly articulated rules yet. I know that the updated date is only meant to be present in errata releases. (Minor changes which are not considered significant enough to result in a version update.) For drafts, I have been putting the circulated-date (date of last issued draft) in the citation, and reserving undated references for final. I'm not sure that is the rule. I also don't know when final drafts get dates, and I suspect that is a legacy convention instead of proper Revision notation.

@ronaldtse, question remains open, and I am concerned that there won't be clear guidance from NIST about this, because they have not been historically consistent in their use of dates.

> @ronaldtse could a short ref have a stage but no edition? For example, SP 800-162 (2PD).

Don't know. Like I said, I am putting the date in.

> @ronaldtse I see drafts whose codes include Rev. 1, Rev. 2, Rev. 3, for example SP 800-52 Rev. 2 (DRAFT). Could we consider this part of the code as the status iteration? If yes, does it mean that drafts without this part of the code have status iteration final?

Regrettably, revision and iteration are two completely different things.

So the following are all possible:

After the refactoring, the revision is indicated in bibitem as <edition>Revision 1</edition>, <edition>Revision 2</edition>: it is an attribute of the document, whatever stage it is in. The iteration, on the other hand, is specifically an attribute of the draft stage.

If there is no "Rev. 1" in the document title, do not populate <edition>. And do not assume anything about iterations from published documents, as your last comment suggests. The only place you will ever see iterations in the website is in descriptions of drafts, as IPD, 2PD, 3PD, ..., FPD (or: initial public draft, second public draft, third public draft... final public draft). Again: iterations are attributes only of drafts, not of published documents.
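
A minimal sketch of that rule: populate `<edition>` only when the title or code carries a "Rev. N" marker. `edition_from_code` is a hypothetical helper, and the pattern assumes revisions are always written as "Rev." followed by digits.

```ruby
# Hypothetical helper: returns the text for <edition>, or nil when the code
# has no "Rev. N" part, in which case <edition> should not be populated.
def edition_from_code(code)
  m = code.match(/\bRev\.\s*(\d+)/)
  m && "Revision #{m[1]}"
end

edition_from_code("SP 800-52 Rev. 2 (DRAFT)")  # => "Revision 2"
edition_from_code("SP 800-162")                # => nil
```

Nothing about the iteration is derived here; the "(DRAFT)" suffix is ignored on purpose.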

ronaldtse commented 5 years ago

@andrew2net sorry for the late reply.

> @ronaldtse @opoudjis where can we scrape the iteration number from?

This is not available from the NIST Publications site. It will be available from the new CSRC web service for Metanorma (which is not yet available).

> @ronaldtse does the table #12 (comment) mean we should search, for an undated reference, all types of docs (final, draft)?

An undated reference means that it refers to the latest Final (with the latest updated-date), but in the bib item it should also be undated.

> for Final without updated-date: only final docs whose issued date matches the reference.

A Final reference only refers to the latest Final (with the latest updated-date), but the bib item should keep it without the updated-date.

> for Final where updated-date > original-release-date: only final docs whose updated date matches the reference?

Yes. The bib item should also carry the full information about the updated-date.

> @ronaldtse could a short ref have a stage but no edition? For example, SP 800-162 (2PD).

Yes! NIST edition means Rev. X. Edition = 1 --> Rev. 1, which is not displayed (implied). So SP 800-162 (2PD) means it is the 2nd Public Draft of SP 800-162 (which is Edition = 1).

> @ronaldtse I see drafts whose codes include Rev. 1, Rev. 2, Rev. 3, for example SP 800-52 Rev. 2 (DRAFT). Could we consider this part of the code as the status iteration? If yes, does it mean that drafts without this part of the code have status iteration final?

No. The Rev. X means "edition". "SP 800-52 Rev. 2 (DRAFT)" means it is the PD (public draft) of "SP 800-52 Rev. 2" (draft of edition = 3).

Edition applies to the document (document as project). Iteration applies to a document stage. They are different.
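
To illustrate the numbering used in the example above ("SP 800-52 Rev. 2" is edition = 3, and a code without "Rev." is edition = 1), here is a small sketch. `numeric_edition` is a hypothetical helper and only shows this proposed convention.

```ruby
# Hypothetical helper: numeric edition under the convention that a document
# without "Rev." is edition 1 and "Rev. N" is edition N + 1.
def numeric_edition(code)
  m = code.match(/\bRev\.\s*(\d+)/)
  m ? m[1].to_i + 1 : 1
end

numeric_edition("SP 800-52 Rev. 2 (DRAFT)")  # => 3
numeric_edition("SP 800-162 (2PD)")          # => 1
```

The parenthesised stage plays no part in the result, which mirrors the point that edition and iteration are separate attributes.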

> @ronaldtse I think we can use document history to calculate an iteration. If the history is empty or the current document is first in the history, then the iteration is initial (1).

Yes -- but iteration only applies to "Drafts".

All stages have an "initial" iteration (iteration = 1). This is abbreviated as "I", such as "IPD" (Initial Public Draft), "IPreD" (Initial Preliminary Draft).

> If the document isn't first in the history, then its position in the history is the iteration value (2, 3).

Yes, again only for drafts. Or 4, 5, or whatever.

> If the last document in the history is final and the current document comes before it in the history, then the iteration is final.

It doesn't work like this; I should have clarified earlier.

The "final" iteration is applied ad hoc: only in some documents do the authors put "final". Most of the time, an "IPD" (iteration = 1), "2PD" (iteration = 2) or "3PD" (iteration = 3) becomes the last PD. So this cannot be automated. Most documents never have an "FPD".
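
The history-based rules confirmed above can be sketched as follows. `draft_iteration` is a hypothetical helper; it assumes the history is an ordered list of draft identifiers, oldest first, and it never returns "final", since that cannot be inferred automatically.

```ruby
# Hypothetical helper: derives a numeric iteration from a draft's position in
# the document history (oldest first). Returns nil for non-drafts.
def draft_iteration(history, current, draft: true)
  return nil unless draft        # iterations are attributes of drafts only
  return 1 if history.empty?     # no history: treat as the initial draft

  position = history.index(current)
  position && position + 1       # 1-based position; nil if not in the history
end

draft_iteration(%w[d1 d2 d3], "d3")  # => 3
```

Returning nil for a document missing from a non-empty history is an assumption; the thread does not say what should happen in that case.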

Now for @opoudjis :

> If there is no "Rev. 1" in the document title, do not populate <edition>. And do not assume anything about iterations from published documents, as your last comment suggests. The only place you will ever see iterations in the website is in descriptions of drafts, as IPD, 2PD, 3PD, ..., FPD (or: initial public draft, second public draft, third public draft... final public draft). Again: iterations are attributes only of drafts, not of published documents.

Why not use edition = 1 for non-Rev, and edition = 2 for "Rev. 1"?

> I'm not convinced I have heard clearly articulated rules yet. I know that the updated date is only meant to be present in errata releases. (Minor changes which are not considered significant enough to result in a version update.) For drafts, I have been putting the circulated-date (date of last issued draft) in the citation, and reserving undated references for final. I'm not sure that is the rule. I also don't know when final drafts get dates, and I suspect that is a legacy convention instead of proper Revision notation. @ronaldtse, question remains open, and I am concerned that there won't be clear guidance from NIST about this, because they have not been historically consistent in their use of dates.

We have agreed on the rules with NIST:

I feel that we actually need to bind a bibliographic date with a document stage...

opoudjis commented 5 years ago

> Why not use edition = 1 for non-Rev, and edition = 2 for "Rev. 1"?

Because I don't trust them not to make up a text-based revision, or to otherwise break this; and I don't trust them, for that matter, not to end up using editions as well as revisions. This is more future-proof.

The rules you list are for when dates are used; what is still not clear to me is when identifiers (other than errata releases) are undated, and when they are dated by publication date. Presumably the latter are legacy, but Andrej is still going to be seeing them.

> I feel that we actually need to bind a bibliographic date with a document stage...

The question has come up before. The way to do this is prolix, but safe: make each draft a related bibitem of the main document bibdata, and put the circulated date in the separate bibitem for each draft. That keeps the model clean.
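
A rough sketch of that shape, using plain hashes rather than the real BibliographicItem classes. The code "SP 000-000", the dates, the key names and the `circulated_date` helper are all placeholders invented for illustration.

```ruby
require "date"

# Placeholder data, not a real NIST record: the main document carries no
# circulated date itself; each draft is a related item with its own date.
document = {
  code: "SP 000-000",
  relations: [
    { stage: "draft-public", iteration: "1", circulated: Date.new(2000, 1, 1) },
    { stage: "draft-public", iteration: "2", circulated: Date.new(2000, 6, 1) }
  ]
}

# Hypothetical helper: finds the circulated date of one draft iteration.
def circulated_date(document, iteration)
  draft = document[:relations].find { |r| r[:iteration] == iteration.to_s }
  draft && draft[:circulated]
end

circulated_date(document, 2)  # the date stored on the second draft
```

Because the date lives on the related item, the main bibdata stays undated however many drafts are circulated.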