Open ronaldtse opened 2 years ago
@ronaldtse we agreed earlier to use "pubid-iso" to parse ISO identifiers. Should that happen at the "pubid-ieee" level or the "relaton" level? I believe "relaton" should decide which parser to use... Also, there are differences between what "pubid-iso" can parse now and what we have in our PubIDs list. "pubid-iso" will not be able to parse most of the "ISO" identifiers there without fixes, for example:
ISO/IEC 10861 : 1994 [ANSI/IEEE Std 1296, 1994 Edition]
ISO/IEC 13213 : 1994 [ANSI/IEEE Std 1212, 1994 Edition]
ISO/IEC 14515-1:2000 IEEE Std 2003.1-2000
> we agreed before to use "pubid-iso" to parse ISO identifiers. Should it be on the "pubid-ieee" level or "relaton" level? I believe "relaton" should decide what parser to use...
I believe pubid-ieee should implement the link to re-use pubid-iso. Relaton technically does not do anything with the PubID except that it needs a PubID to identify a bibliographic item.
In the examples:
ISO/IEC 10861 : 1994 [ANSI/IEEE Std 1296, 1994 Edition]
is ISO/IEC 10861:1994
with an alternative PubID ANSI/IEEE Std 1296, 1994 Edition
ISO/IEC 13213 : 1994 [ANSI/IEEE Std 1212, 1994 Edition]
is ISO/IEC 13213:1994
with an alternative PubID ANSI/IEEE Std 1212, 1994 Edition
ISO/IEC 14515-1:2000 IEEE Std 2003.1-2000
is ISO/IEC 14515-1:2000
with an alternative PubID IEEE Std 2003.1-2000
We could update pubid-iso to handle these, or make pubid-ieee handle them?
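The split described in these examples can be sketched in plain Ruby. This is only an illustration under stated assumptions: `DualPubid`, `ISO_PART_PATTERN` and `split_dual_pubid` are hypothetical names, not part of pubid-iso or pubid-ieee, and the regular expression covers only the shapes shown above.

```ruby
# Hypothetical helper, not part of pubid-iso or pubid-ieee.
DualPubid = Struct.new(:primary, :alternative)

# Leading ISO or ISO/IEC part: publisher, number with optional part, year.
# Whitespace around the colon is tolerated, as in "10861 : 1994".
ISO_PART_PATTERN = /\A(?<publisher>ISO(?:\/IEC)?)\s*(?<number>\d+(?:-\d+)?)\s*:\s*(?<year>\d{4})/

def split_dual_pubid(text)
  match = ISO_PART_PATTERN.match(text)
  return nil unless match

  # Normalize the ISO part to "PUBLISHER NUMBER:YEAR".
  primary = "#{match[:publisher]} #{match[:number]}:#{match[:year]}"

  # Everything after the ISO part is treated as the alternative PubID;
  # surrounding whitespace and square brackets are dropped.
  rest = match.post_match.strip.sub(/\A\[/, "").sub(/\]\z/, "")
  DualPubid.new(primary, rest.empty? ? nil : rest)
end

ids = split_dual_pubid("ISO/IEC 10861 : 1994 [ANSI/IEEE Std 1296, 1994 Edition]")
puts ids.primary
puts ids.alternative
```

A real implementation would hand each half to the matching parser instead of keeping plain strings.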
> We could update pubid-iso to handle these, or make pubid-ieee handle them?
To parse them in pubid-ieee I would need to duplicate parsing code from pubid-iso. It is better to update pubid-iso for these identifiers.
I also have an idea: maybe we should introduce something like https://github.com/relaton/relaton/blob/main/spec/relaton/registry_spec.rb for PubIDs? For example:
expect(PubID::Registry.parse("ISO/IEC 13213")).to be_instance_of PubId::Iso::Identifier
expect(PubID::Registry.parse("IEEE/ANSI Std 484-1987")).to be_instance_of PubId::Ieee::Identifier
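A registry along these lines could look roughly like the sketch below. Everything in the `PubidSketch` module is a hypothetical stand-in (the identifier classes are bare structs, and dispatch is by a prefix pattern only), so it shows the lookup idea rather than the API of any existing gem.

```ruby
module PubidSketch
  # Stand-in identifier classes; the real gems define their own.
  IsoIdentifier  = Struct.new(:source)
  IeeeIdentifier = Struct.new(:source)

  class Registry
    def initialize
      @entries = []
    end

    # Register a pattern that an identifier must match, together with
    # the class used to wrap it. Registration order is lookup order.
    def register(pattern, klass)
      @entries << [pattern, klass]
      self
    end

    # Return an instance of the first registered class whose pattern
    # matches, or nil when no pattern matches.
    def parse(text)
      _pattern, klass = @entries.find { |pattern, _klass| pattern.match?(text) }
      klass&.new(text)
    end
  end
end

registry = PubidSketch::Registry.new
registry.register(/\AISO\b/, PubidSketch::IsoIdentifier)
registry.register(/\A(?:IEEE|ANSI)\b/, PubidSketch::IeeeIdentifier)

puts registry.parse("ISO/IEC 13213").class
puts registry.parse("IEEE/ANSI Std 484-1987").class
```

Dispatching on the prefix alone is the simplest possible rule; it says nothing yet about identifiers whose prefix and number syntax disagree.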
I fully agree with this. The challenge is deciding what counts as an ISO identifier versus an IEEE identifier. For example,
expect(PubID::Registry.parse("ISO/IEC 13213")).to be_instance_of PubId::Iso::Identifier
expect(PubID::Registry.parse("IEEE/ANSI Std 484-1987")).to be_instance_of PubId::Ieee::Identifier
There are however some challenges. Look at these identifiers:
ISO/IEEE DIS P11073-10418 D13, January 2011
ISO/IEEE DIS P11073-10418/D15, June 2011
ISO/IEEE DIS P11073-10418_D8, July 2010
ISO/IEEE P11073-20601a/D29, July 2010
ISO/IEEE P11073-20601a/D31, August 2010
The numbers are in IEEE format (Pnnnn-{part}/D{draft}), but start with ISO. The first three even use the ISO stage code DIS. From the ISO perspective, the "P" and "D" parts do not make any sense. These identifiers can only make sense from the IEEE perspective.
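To make the mix of conventions visible, the sketch below pulls the publisher prefix, ISO stage code, IEEE project number and IEEE draft marker out of such a string. The patterns and the `describe_identifier` helper are assumptions made for illustration; they cover only the five identifiers listed above and are not taken from either gem.

```ruby
# Illustrative patterns, assumed for this sketch only.
PUBLISHER_PREFIX    = %r{\A[A-Z]+(?:/[A-Z]+)*}      # e.g. "ISO/IEEE"
ISO_STAGE_CODE      = /\b(?:WD|CD|DIS|FDIS)\b/      # ISO stage abbreviations
IEEE_PROJECT_NUMBER = /\bP\d+(?:-\d+)*[a-z]?/       # e.g. "P11073-20601a"
IEEE_DRAFT_MARKER   = %r{[ /_]D\d+\b}               # " D13", "/D15", "_D8"

def describe_identifier(text)
  {
    publishers: text[PUBLISHER_PREFIX].to_s.split("/"),
    iso_stage:  text[ISO_STAGE_CODE],
    project:    text[IEEE_PROJECT_NUMBER],
    # Drop the separator character that precedes the draft marker.
    draft:      text[IEEE_DRAFT_MARKER]&.delete(" /_"),
  }
end

# True when the number follows IEEE conventions, whatever the prefix says.
def ieee_number_syntax?(text)
  IEEE_PROJECT_NUMBER.match?(text) && IEEE_DRAFT_MARKER.match?(text)
end

p describe_identifier("ISO/IEEE DIS P11073-10418_D8, July 2010")
p ieee_number_syntax?("ISO/IEC 13213:1994")
```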
So we should check not only the prefix but also whether the matched module can actually parse the identifier. For "ISO/IEEE P11073-20601a/D31, August 2010", we can split the publishers into "ISO" and "IEEE", check which module parses the identifier first, and return the correct instance.
> For "ISO/IEEE P11073-20601a/D31, August 2010" by splitting publishers: "ISO" and "IEEE" we can check which module can parse it first and return correct instance.
Perhaps. I do not know whether we can know which parse operation is correct by simply comparing the outcomes though. Maybe the ISO version will fail once it encounters "Pnnnn"?
I'm not sure if we should handle this via parse rules where we can integrate multiple Parslet sub-rules...
> Perhaps. I do not know whether we can know which parse operation is correct by simply comparing the outcomes though. Maybe the ISO version will fail once it encounters "Pnnnn"?
Yes, one parser will fail. If both succeed in parsing, both should return correct results.
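The "split the publishers, then try each parser" idea could be sketched as follows. The two lambdas are simplified stand-ins written with regular expressions for this example; they are not the Parslet grammars from pubid-iso or pubid-ieee, and `ParseFailure`, `PARSERS_BY_PUBLISHER` and `parse_by_publishers` are hypothetical names.

```ruby
class ParseFailure < StandardError; end

# Stand-in ISO parser: accepts only digits (with optional part and year),
# so it fails as soon as it meets an IEEE-style "Pnnnn" number.
ISO_STANDIN = lambda do |text|
  m = /\AISO(?:\/(?:IEC|IEEE))?\s+(\d+(?:-\d+)?)(?::(\d{4}))?\z/.match(text)
  raise ParseFailure, "not ISO syntax: #{text}" unless m
  { parser: :iso, number: m[1], year: m[2] }
end

# Stand-in IEEE parser: accepts an optional "ISO/" co-publisher prefix,
# a project or standard number, an optional draft and a trailing date.
IEEE_STANDIN = lambda do |text|
  m = %r{\A(?:ISO/)?IEEE\s+(?:Std\s+)?(P?\d+(?:[.-]\d+)*[a-z]?)(?:/(D\d+))?(?:,\s*(.+))?\z}.match(text)
  raise ParseFailure, "not IEEE syntax: #{text}" unless m
  { parser: :ieee, number: m[1], draft: m[2], date: m[3] }
end

PARSERS_BY_PUBLISHER = { "ISO" => ISO_STANDIN, "IEEE" => IEEE_STANDIN }.freeze

# Try the parser of each publisher in the prefix, in order, and return
# the first successful result.
def parse_by_publishers(text)
  publishers = text[%r{\A[A-Z]+(?:/[A-Z]+)*}].to_s.split("/")
  publishers.filter_map { |name| PARSERS_BY_PUBLISHER[name] }.each do |parser|
    begin
      return parser.call(text)
    rescue ParseFailure
      next
    end
  end
  raise ParseFailure, "no parser accepted: #{text}"
end

p parse_by_publishers("ISO/IEEE P11073-20601a/D31, August 2010")
p parse_by_publishers("ISO/IEC 13213:1994")
```

Returning the first success keeps the sketch short. It does not address the open question above of what to do when more than one parser succeeds.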
> I'm not sure if we should handle this via parse rules where we can integrate multiple Parslet sub-rules...
I also like the idea of including parse rules from another gem. For example, we can use the "pubid-iso" Parslet rules inside "pubid-ieee" to parse the ISO part of identifiers like "ISO/IEC13210: 1994 (E) ANSI/IEEE Std 1003.3-1991".
Another idea: we can define which parsers to use per dataset. For example, we know that "ieee-rawbib2" includes both ISO and IEEE formats.
Yeah let's try out these ideas!
@ronaldtse I'm trying to parse this identifier. To me it looks like an identifier with two dual PubIDs, but I could be wrong: "IEEE Std 802.5r and IEEE 802.5j, 1998 Edition (ISO/IEC 8802-5:1998/Amd.1)". What is strange here is that it represents two dual PubIDs in different formats (the first uses " and ", the second puts the identifier inside brackets). What is "IEEE Std 802.5r and IEEE 802.5j" here? (Maybe this could help: https://ieeexplore.ieee.org/document/827772)
Topic of "IEEE Std 802.5r and IEEE 802.5j" moved to #55
There are many PubIDs that are co-published. This means that the different organizations share the same "standard number". This also means that the "standard number syntax" will either comply with the IEEE syntax, or the syntax of the other publisher.
ASTM format:
ANSI format:
ISO, IEC or ISO/IEC format but with the IEEE "P" applied:
Purely ISO, IEC or ISO/IEC format: