Open mico opened 2 years ago
Most of these cases relate to revision, edition, and version parsing. More version cases appeared after I added letter parsing for versions (previously only numbers were parsed). There seems to be a conflict between revision and edition parsing: sometimes they use the same pattern.
Is the last part here part of the document number, a volume, or an edition?
✅ | NBS IR 79-1591r1 | NBS IR 79-1591-1 | ✅ | NBS.IR.79-1591r1 | NBS.IR.79-1591-1 | Semiconductor technology program : progress briefs
✅ | NBS IR 79-1591r2 | NBS IR 79-1591-2 | ✅ | NBS.IR.79-1591r2 | NBS.IR.79-1591-2 | Semiconductor technology program : progress briefs
✅ | NBS IR 79-1591r3 | NBS IR 79-1591-3 | ✅ | NBS.IR.79-1591r3 | NBS.IR.79-1591-3 | Semiconductor technology program : progress briefs
"NBS IR 79-1591-1" is actually "NBS IR 79-1591" (it wasn't planned to have a Part 2)
This is "NBS IR 79-1592-2": https://www.govinfo.gov/content/pkg/GOVPUB-C13-fadfa9b4ba69df13e12016cce96392dd/pdf/GOVPUB-C13-fadfa9b4ba69df13e12016cce96392dd.pdf
The contents are completely different: Part 1 reports on July-Sept 1978, and Part 2 reports on Oct-Dec 1978.
So these are PARTS.
This is part of the number, right?
✅ | NBS IR 80-2111r1 | NBS IR 80-2111-1 | ✅ | NBS.IR.80-2111r1 | NBS.IR.80-2111-1 | Review and refinement of ATC 3-06 tentative seismic provisions : report of Technical Committee 1 : seis
✅ | NBS IR 80-2111r11 | NBS IR 80-2111-11 | ✅ | NBS.IR.80-2111r11 | NBS.IR.80-2111-11 | Review and refinement of ATC 3-06 tentative seismic provisions : report of Joint Committee on Revie
✅ | NBS IR 80-2111r2 | NBS IR 80-2111-2 | ✅ | NBS.IR.80-2111r2 | NBS.IR.80-2111-2 | Review and refinement of ATC 3-06 tentative seismic provisions : report of Technical Committee 2 : stru
✅ | NBS IR 80-2111r3 | NBS IR 80-2111-3 | ✅ | NBS.IR.80-2111r3 | NBS.IR.80-2111-3 | Review and refinement of ATC 3-06 tentative seismic provisions : report of Technical Committee 3 : foun
Yes, these are PARTS.
What are these?
✅ | NBS IR 84-2857r1 | NBS IR 84-2857-1 | ✅ | NBS.IR.84-2857r1 | NBS.IR.84-2857-1 | Center for electronics and electrical engineering : technical progress bulletin covering center program
✅ | NBS IR 84-2857r2 | NBS IR 84-2857-2 | ✅ | NBS.IR.84-2857r2 | NBS.IR.84-2857-2 | Center for electronics and electrical engineering : technical progress bulletin covering center program
✅ | NBS IR 84-2857r3 | NBS IR 84-2857-3 | ✅ | NBS.IR.84-2857r3 | NBS.IR.84-2857-3 | Center for electronics and electrical engineering : technical progress bulletin covering center program
These are also PARTS.
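As a rough illustration of the conclusion above (a trailing `-N` on these NBS IR identifiers is a part, not a revision), here is a minimal Python sketch. The regular expression and the `split_ir_part` helper are hypothetical and written only for this comment; they are not the grammar used by the parser, and they assume the IR number always has the `NN-NNNN` shape seen in the rows above.

```python
import re

# Hypothetical pattern for illustration only; it assumes an IR number of the
# form "NN-NNNN" optionally followed by "-N", as in the rows quoted above.
IR_WITH_PART = re.compile(
    r"^(?P<publisher>NBS|NIST) IR (?P<number>\d{2}-\d{4})(?:-(?P<part>\d+))?$"
)


def split_ir_part(pubid):
    """Split a trailing '-N' off an IR identifier and label it as a part."""
    m = IR_WITH_PART.match(pubid)
    if m is None:
        return None  # not an IR identifier of the assumed shape
    part = m.group("part")
    return {
        "publisher": m.group("publisher"),
        "series": "IR",
        "number": m.group("number"),
        "part": int(part) if part is not None else None,
    }


print(split_ir_part("NBS IR 80-2111-11"))
print(split_ir_part("NBS IR 79-1591"))
```

Keeping the part as a separate field, instead of folding it into the number or a revision, is what lets the two identifier forms be compared field by field.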
Is it correct?
✅ | NBS CIRC 25sup1924 | NBS CIRC 25sup-1924 | ✅ | NBS.CIRC.25sup1924 | NBS.CIRC.25sup-1924 | Supplement to circular no. 25: standard samples issued or in preparation
This is not supplement number 1924; 1924 is the publication date.
https://nvlpubs.nist.gov/nistpubs/Legacy/circ/nbscircular25sup-1924.pdf
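One possible way to tell a publication year from a supplement number is sketched below in Python. The rule used here (a four-digit value within a plausible year range is treated as a year) is my own assumption, not something the parser is known to do, and the `classify_supplement` helper, its pattern, and its year bounds are all hypothetical.

```python
import re

# Hypothetical pattern: everything before "sup" is the base identifier, and
# the optional digits after "sup" (with or without a hyphen) are classified.
SUP_SUFFIX = re.compile(r"^(?P<base>.+?)sup-?(?P<value>\d+)?$")


def classify_supplement(pubid, first_year=1901, last_year=2099):
    """Label the digits after 'sup' as a year or as a supplement number.

    Assumption: a four-digit value between first_year and last_year is a
    publication year; anything else is a supplement number.
    """
    m = SUP_SUFFIX.match(pubid)
    if m is None:
        return None  # no supplement suffix found
    result = {"base": m.group("base"), "year": None, "supplement_number": None}
    value = m.group("value")
    if value is None:
        return result  # a bare "sup" with no digits
    number = int(value)
    if len(value) == 4 and first_year <= number <= last_year:
        result["year"] = number
    else:
        result["supplement_number"] = number
    return result


print(classify_supplement("NBS CIRC 25sup-1924"))
print(classify_supplement("NBS CIRC 25sup1924"))
```

A heuristic like this would misclassify a genuine supplement numbered in the thousands, so it only makes sense as a flag for manual review.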
Last part parsed as an edition but not removed from document number:
✅ | NBS CIRC 74-1937e1937 | NBS CIRC 74e1937 | ✅ | NBS.CIRC.74-1937e1937 | NBS.CIRC.74-1937 | Circular of the Bureau of Standards C74: radio instruments and measurements
✅ | NBS HB 44-1988e1988 | NIST HB 44-1988 | ✅ | NBS.HB.44-1988e1988 | NBS.HB.44-1988 | Specifications, tolerances, and other technical requirements for weighing and measuring devices
✅ | NBS IR 84-2946e2946 | NIST IR 84-2946 | ✅ | NBS.IR.84-2946e2946 | NBS.IR.84-2946 | Polymers Division, technical activities 1984
✅ | NBS IR 89-4220e4220 | NIST IR 89-4220 | ✅ | NBS.IR.89-4220e4220 | NBS.IR.89-4220 | Electrical performance tests for storage oscilloscopes
✅ | NBS FIPS 100-1e1 | NIST FIPS 100-1 | ✅ | NBS.FIPS.100-1e1 | NBS.FIPS.100-1 | Federal Information Processing Standards Publication: interface between data terminal equipment (DTE) and
✅ | NBS FIPS 130-1988e1988 | NIST FIPS 130-1988 | ✅ | NBS.FIPS.130-1988e1988 | NBS.FIPS.130-1988 | Federal Information Processing Standards Publication: for information systems - intellig
✅ | NBS FIPS 131-1987e1987 | NIST FIPS 131-1987 | ✅ | NBS.FIPS.131-1987e1987 | NBS.FIPS.131-1987 | Federal Information Processing Standards Publication: small computer system interface (S
✅ | NBS FIPS 132-1987e1987 | NIST FIPS 132-1987 | ✅ | NBS.FIPS.132-1987e1987 | NBS.FIPS.132-1987 | Federal Information Processing Standards Publication: guideline for software verificatio
✅ | NBS FIPS 29-2e2 | NIST FIPS 29-2 | ✅ | NBS.FIPS.29-2e2 | NBS.FIPS.29-2 | Federal Information Processing Standards Publication: interpretation procedures for federal information proces
✅ | NBS FIPS 4-1e1 | NIST FIPS 4-1 | ✅ | NBS.FIPS.4-1e1 | NBS.FIPS.4-1 | Federal Information Processing Standards Publication: representation for calendar date and ordinal date for inform
✅ | NBS FIPS 5-2e2 | NIST FIPS 5-2 | ✅ | NBS.FIPS.5-2e2 | NBS.FIPS.5-2 | Federal Information Processing Standards Publication: codes for the identification of the States, the District of
This happens mostly because we have different parsing rules for NBS FIPS and NIST FIPS, NIST HB and NBS HB, and NIST IR and NBS IR. Besides adding an edition, merging also changes the publisher from NIST to NBS (the DOI uses the NBS publisher, while the PubID uses NIST). @ronaldtse What do we do when publishers are different for DOI and PubID?
What do we do when publishers are different for DOI and PubID?
One is correct and one is not. We need to check the conflicting entries individually.
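To collect the entries that need an individual check, something like the following Python sketch could flag the two symptoms discussed above: identifiers such as `NBS.CIRC.74-1937e1937`, where the last segment of the number reappears as the edition, and pairs whose publishers differ. Both helpers and the pattern are hypothetical, and the sketch only detects candidates; it does not decide which form is correct.

```python
import re

# Hypothetical pattern: the last hyphen-separated segment is repeated
# immediately after an "e" (edition) marker, e.g. "74-1937e1937".
DUPLICATED_EDITION = re.compile(r"^(?P<stem>.*)-(?P<last>\w+)e(?P=last)$")


def find_duplicated_edition(identifier):
    """Return (stem, repeated_segment) if the edition repeats the last segment."""
    m = DUPLICATED_EDITION.match(identifier)
    if m is None:
        return None
    return m.group("stem"), m.group("last")


def publishers_differ(first, second):
    """Compare the leading publisher token of two identifiers.

    Assumption: the publisher is the first token, delimited by a space or dot.
    """
    def publisher(identifier):
        return re.split(r"[ .]", identifier, maxsplit=1)[0]

    return publisher(first) != publisher(second)


print(find_duplicated_edition("NBS.CIRC.74-1937e1937"))
print(find_duplicated_edition("NBS.FIPS.100-1e1"))
print(publishers_differ("NBS FIPS 100-1e1", "NIST FIPS 100-1"))
```

Returning the stem and the repeated segment separately leaves open whether the final identifier should keep the segment in the number or in the edition.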
Parse error for all records with NBS CS v\d+n\d+ pattern:
✅ | parse error | NBS CS v6n1 | ✅ | parse_error | NBS.CS.v6n1 | Commercial standards monthly : a review of progress in commercial standardization and simplification July 1929
@ronaldtse there is a problem with NBS CS v\d+n\d+: documents are not available (404) https://nvlpubs.nist.gov/nistpubs/Legacy/CS/nbscommercialstandardv6n11.pdf
I found this document here instead: https://nvlpubs.nist.gov/nistpubs/Legacy/CSM/nbscsmv6n1.pdf
Can we have a list of the CS documents not available so we can file an issue at their official repo? Thanks.
UPDATE: Reported to NIST
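For reference, the `v\d+n\d+` shape mentioned above can be matched with a small regular expression. This Python sketch is only an illustration of the pattern; the `parse_cs_volume_number` helper is hypothetical, and it assumes the letters `v` and `n` stand for volume and number.

```python
import re

# Hypothetical pattern for identifiers such as "NBS CS v6n1" or "NBS.CS.v6n1".
# Assumption: "v" introduces a volume and "n" an issue number.
CS_VOLUME_NUMBER = re.compile(r"^NBS[ .]CS[ .]v(?P<volume>\d+)n(?P<number>\d+)$")


def parse_cs_volume_number(pubid):
    """Extract volume and number from an 'NBS CS vXnY' identifier."""
    m = CS_VOLUME_NUMBER.match(pubid)
    if m is None:
        return None
    return {"volume": int(m.group("volume")), "number": int(m.group("number"))}


print(parse_cs_volume_number("NBS CS v6n1"))
print(parse_cs_volume_number("NBS.CS.v6n1"))
```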
By running nist-pubid report -u, I found records with missing or incorrect data (some similar records are skipped):
Related to #104 (issue closed by mistake)
Last part parsed as an edition but not removed from document number:
Last part of NBS HB parsed as an edition, and a revision was added as a result of merging with the DOI; the result should have only the revision:
Edition parsed as edition and revision:
Used the part number from the DOI; the result should be NBS HB 28-1969p1, and the DOI needs to be updated
sp = Spanish translation
revision? year missing in the final PubID:
Extra letter "P" in document identifier for NBS SP with part:
Wrong document identifier parsing for NIST PUB \d{3}-\d-19??
Broken NIST/NBS FIPS parsing for edition with month and year:
Missing edition year from original PubID (could be a DOI merging result):
Added version as revision parsing result:
NBS IR volumes parsed as revision
Is the last part here part of the document number, a volume, or an edition?
This is part of the number, right?
What are these?
Part of edition parsed as version:
Part of edition parsed as revision:
Edition parsed as edition and revision and version inside edition:
Parse error for all records with NBS CS v\d+n\d+ pattern:
Is it correct?
@ronaldtse do you have any comments on these?