metanorma / pubid-ieee

PubID spec and implementation for IEEE deliverables
BSD 2-Clause "Simplified" License

Parse "IEEE Unapproved P802.21/D-14-2008-Sept" #110

Closed: mico closed this 3 months ago

mico commented 3 months ago

closes #107

mico commented 3 months ago
> Pubid::Ieee::Identifier.parse("IEEE Unapproved 11073.10471/D-02-2008-02")
/Users/andrej/RubyProjects/pubid-ieee/lib/pubid/ieee/identifier/base.rb:102:in `rescue in parse': Expected one of [IEEE_WITHOUT_PREFIX, (ORGANIZATIONS SPACE)? type_status:((draft_status:DRAFT_STATUS SPACE)? ('Draft '? type:TYPE SPACE?)?) NUMBER_PREFIX NUMBER parameters:((DRAFT / PART_SUBPART_YEAR? CORRIGENDUM? DRAFT? ISO_AMENDMENT?) PUBLICATION_DATE? edition:EDITION? DUAL_PUBIDS? ADDITIONAL_PARAMETERS?), ISO_IDENTIFIER ISO_PARAMETERS, ISO_IDENTIFIER SPACE (ORGANIZATIONS SPACE)? type_status:((draft_status:DRAFT_STATUS SPACE)? ('Draft '? type:TYPE SPACE?)?) NUMBER_PREFIX NUMBER parameters:((DRAFT / PART_SUBPART_YEAR? CORRIGENDUM? DRAFT? ISO_AMENDMENT?) PUBLICATION_DATE? edition:EDITION? DUAL_PUBIDS? ADDITIONAL_PARAMETERS?), ISO_IDENTIFIER] at line 1 char 1. (Pubid::Core::Errors::ParseError)
...

As I said, there are many "Unapproved" IDs. Please test this gem against all IDs in the index file.

@andrew2net this is another task (it was not mentioned). We are already running tests against many "Unapproved" identifiers here: https://github.com/metanorma/pubid-ieee/blob/main/spec/fixtures/pubid-parsed.txt

Do you want me to make a test that will run through all identifiers in https://github.com/relaton/relaton-data-ieee/blob/main/index-v1.yaml ? @ronaldtse does it make sense to do that?

andrew2net commented 3 months ago

> @andrew2net this is another task (it was not mentioned). We are already running tests against many "Unapproved" identifiers here: https://github.com/metanorma/pubid-ieee/blob/main/spec/fixtures/pubid-parsed.txt

We need to update these tests because the dataset has been updated since then.

> Do you want me to make a test that will run through all identifiers in https://github.com/relaton/relaton-data-ieee/blob/main/index-v1.yaml ?

I agree that testing against all the IDs is outside of this PR. But this PR needs to be tested against all the Unapproved IDs in the IEEE dataset. To be able to use pubid-ieee with relaton-ieee we need to parse all the IDs in the dataset. So let's create another PR.
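A rough sketch of what such a check could look like, assuming the IDs have already been loaded into an array of strings. The helper names `select_unapproved` and `collect_parse_failures` are hypothetical and not part of pubid-ieee; the parser is passed in as any callable that raises on failure, so with the gem loaded it could be `Pubid::Ieee::Identifier.method(:parse)`.

```ruby
# Hypothetical helper: keep only the IDs that contain the word "Unapproved".
def select_unapproved(ids)
  ids.select { |id| id.include?("Unapproved") }
end

# Hypothetical helper: run every ID through `parser` (any callable that
# raises when it cannot parse) and return a hash of ID => error class name.
def collect_parse_failures(ids, parser)
  ids.each_with_object({}) do |id, failures|
    parser.call(id)
  rescue StandardError => e
    failures[id] = e.class.name
  end
end
```

An empty hash would mean every "Unapproved" ID parsed; anything else lists exactly which IDs still need grammar changes.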

ronaldtse commented 3 months ago

Yes, this will be necessary as we need to parse all identifiers. Adding another PR is good.

andrew2net commented 3 months ago

@mico I was wrong. We need to test against the document identifiers in this file: ieee-rawbib.csv

The file contains the normtitle and stdnumber elements from the IEEE dataset, as well as the filename of each document in the dataset. Here is the parser that is currently used in relaton-ieee: https://github.com/relaton/relaton-ieee/blob/main/lib/relaton_ieee/rawbib_id_parser.rb
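For illustration, a test over that file might be shaped roughly like this. It is only a sketch: it assumes the CSV has a header row with columns named `normtitle`, `stdnumber` and `filename`, which may not match the real layout, and `rawbib_rows` and `unparsable_rows` are hypothetical helper names. The parser is again any callable that raises on failure.

```ruby
require "csv"

# Assumed layout: a header row with normtitle, stdnumber and filename columns.
def rawbib_rows(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    {
      normtitle: row["normtitle"],
      stdnumber: row["stdnumber"],
      filename: row["filename"]
    }
  end
end

# Return the rows whose normtitle makes `parser` raise, so that each
# failure can be traced back to its file in the dataset.
def unparsable_rows(rows, parser)
  rows.reject do |row|
    parser.call(row[:normtitle])
    true
  rescue StandardError
    false
  end
end
```

Keeping the filename next to each failing title makes it easier to look up the original record when an identifier does not parse.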

UPD: I only have a quite old version of the dataset. @ronaldtse can we get the latest ieee-rawbib from IEEE for testing purposes?