relaton / relaton-bipm

Decide and encode BIPM outcome "authoritative identifiers" #34

Closed ronaldtse closed 1 year ago

ronaldtse commented 2 years ago

Right now we have "CGPM Meeting 26" etc. but these need to be updated according to https://github.com/metanorma/bipm-si-brochure/issues/193

From that ticket:

From BIPM Janet Miles:

When referring to the Resolutions I’d provided you with the English description but not the French: in French the model for “CGPM Resolution X (YYYY)” would be

And there’s another slight complication in that the wording is “de la” for the CGPM, but “du” for the CIPM and Consultative Committees:

If this proves problematic we’ll have to find another solution… On the French version of the BIPM website my shortest version of the references is:


I believe this is a Relaton issue. The data source at bipm-data-outcomes does not provide this text.

In relaton-data-bipm, this text is provided: https://github.com/relaton/relaton-data-bipm/blob/9fd0943b7b142a23903bee0a8386463736e1a1af/data/cipm/meeting/resolution/1948-00.yaml#L26-L40

docid:
- id: CIPM Resolution (1948)
  type: BIPM
  primary: true
- id: CIPM Resolution (1948)
  type: BIPM
  primary: true
  language: en
  script: Latn
- id: CIPM Résolution (1948)
  type: BIPM
  primary: true
  language: fr
  script: Latn
docnumber: CIPM Resolution (1948)

andrew2net commented 2 years ago

@ronaldtse do we need to update references, identifiers, or both?

ronaldtse commented 1 year ago

@andrew2net we have confirmed with BIPM to use the following patterns:

As agreed with BIPM's Janet Miles:

Basic pattern

General:

Long: {group name} -- {type} {number} ({year})
Short: {group name} -- {type-abbrev} {number} ({year}, {lang})
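
As a rough illustration of the two basic patterns, here is a minimal Python sketch. It is not the relaton-bipm implementation (that gem is written in Ruby). The function names `long_id` and `short_id` are hypothetical, and the abbreviation table contains only the abbreviations that appear in the examples in this thread (REC, DECN, RES); a complete table would need confirmation from BIPM.

```python
from typing import Optional

# Abbreviations taken from the examples in this thread; assumed incomplete.
TYPE_ABBREV = {"Recommendation": "REC", "Decision": "DECN", "Resolution": "RES"}


def long_id(group: str, type_name: str, number: Optional[int], year: int) -> str:
    """Long: {group name} -- {type} {number} ({year})"""
    parts = [type_name] + ([str(number)] if number is not None else [])
    return f"{group} -- {' '.join(parts)} ({year})"


def short_id(group: str, type_name: str, number: Optional[int], year: int,
             lang: Optional[str] = None) -> str:
    """Short: {group name} -- {type-abbrev} {number} ({year}, {lang})"""
    parts = [TYPE_ABBREV[type_name]] + ([str(number)] if number is not None else [])
    suffix = f"{year}, {lang.upper()}" if lang else str(year)
    return f"{group} -- {' '.join(parts)} ({suffix})"


print(long_id("CCTF", "Recommendation", 2, 1970))         # CCTF -- Recommendation 2 (1970)
print(short_id("CCTF", "Recommendation", 2, 1970, "en"))  # CCTF -- REC 2 (1970, EN)
```

The sketch follows the basic pattern literally and does not cover the special cases where the committee name is part of the outcome identifier.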

Special case pattern

The basic pattern works for all outcomes except two cases, where the committee name is part of the outcome identifier:

Special cases:

Decision CIPM/111-10 (2022) / Décision CIPM/111-10 (2022)
Recommendation JCRB/43-1 (2021) / Recommandation JCRB/43-1 (2021)

Single language version (English)

type can be:

e.g.

CCTF -- Recommendation 2 (1970)
CCTF -- REC 2 (1970, EN)

Special cases:

Decision CIPM/111-10 (2022)
CIPM DECN CIPM/111-10 (2022, EN)
Recommendation JCRB/43-1 (2021)
JCRB REC JCRB/43-1 (2021, EN)

Single language version (French)

type can be:

e.g.

CCTF -- Recommandation 2 (1970)
CCTF -- REC 2 (1970, FR)

Special cases:

Décision CIPM/111-10 (2022)
CIPM DECN CIPM/111-10 (2022, FR)
Recommandation JCRB/43-1 (2021)
JCRB REC JCRB/43-1 (2021, FR)

Dual language version (language independent version)

type can be in their respective languages or the following abbreviations:

CCTF -- Recommandation 2 (1970) / Recommendation 2 (1970)
CCTF REC 2 (1970)

Special cases:

Decision CIPM/111-10 (2022) / Décision CIPM/111-10 (2022)
CIPM DECN CIPM/111-10 (2022)
Recommendation JCRB/43-1 (2021) / Recommandation JCRB/43-1 (2021)
JCRB REC JCRB/43-1 (2021)

andrew2net commented 1 year ago

@ronaldtse it seems there is a numbering inconsistency in the bipm-data-outcomes dataset.

... and many others

As we discussed earlier, the numbering must be continuous through document parts.

ronaldtse commented 1 year ago

cipm/meetings-en/meeting-104-1.yml has identifiers 1 and 2. cipm/meetings-en/meeting-104-2.yml has the same identifiers 1 and 2. Also, it seems the identifiers should be prefixed with the document number, but cipm/meetings-en/meeting-104-1.yml already contains 104-1 and 104-2.

To fix this we need to update how the repository is structured. The current method doesn't work well as you can see.

jcrb/meetings-en/meeting-10.yml has two identical identifiers, 10-1 and 10-1. jcrb/meetings-en/meeting-15.yml also has duplicated IDs, as do jcrb/meetings-en/meeting-17.yml, jcrb/meetings-en/meeting-18.yml, jcrb/meetings-en/meeting-19.yml, jcrb/meetings-en/meeting-20.yml and jcrb/meetings-en/meeting-21.yml.

These are not "duplications", because the identifiers are unique "per type".

In JCRB meeting 10;

We will need to distinguish the "Resolution" from "Action" in the importing code.
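
One way to picture "unique per type" is to key each outcome by the pair (type, identifier) rather than by the identifier alone. The Python sketch below is only an illustration, not the importer's actual code. The helper `outcome_key` and the title regex are hypothetical, and the two titles are the JCRB meeting 10 titles quoted in this thread.

```python
import re

# Matches titles such as "Resolution JCRB/10-1 (2003)".
TITLE_RE = re.compile(r"^(?P<type>\w+) (?P<id>[A-Z]+/\d+-\d+) \((?P<year>\d{4})\)$")


def outcome_key(title: str):
    """Return (type, identifier) parsed from an outcome title, or None."""
    m = TITLE_RE.match(title)
    return (m["type"].lower(), m["id"]) if m else None


titles = ["Resolution JCRB/10-1 (2003)", "Action JCRB/10-1 (2003)"]
keys = [outcome_key(t) for t in titles]

# Keyed by identifier alone the two entries collide ...
print(len({identifier for _, identifier in keys}))  # 1
# ... keyed by (type, identifier) they stay distinct.
print(len(set(keys)))  # 2
```

This only works where the title carries the type; for titles without one, the function returns None and an importer would have to fall back on some other source for the type.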

andrew2net commented 1 year ago

In JCRB meeting 10:

  • The first 10-1 is titled "Resolution JCRB/10-1 (2003)"
  • The second 10-1 is "Action JCRB/10-1 (2003)".

@ronaldtse that is a problem. Many other meetings don't have types in their titles. We take the type from a type attribute and in the JCRB meeting 10, all the resolutions have a decision type.

andrew2net commented 1 year ago

  • This is a case of a "split meeting" -- the same meeting is split into 2 sessions held at different times. There is 1 "meeting", but it has two "sub-meetings".

@ronaldtse I'm sorry, I expressed myself incorrectly. There isn't a problem with the meeting's identifiers. The problem is with the resolutions' identifiers: the 2 parts of the meeting have resolutions with the same identifier. For example, this: https://github.com/metanorma/bipm-data-outcomes/blob/cf05beb2ac784212cfd3673d076e27b0a290004e/cipm/meetings-en/meeting-104-1.yml#L18 and this: https://github.com/metanorma/bipm-data-outcomes/blob/cf05beb2ac784212cfd3673d076e27b0a290004e/cipm/meetings-en/meeting-104-2.yml#L18

ronaldtse commented 1 year ago

@andrew2net the problem is not with the resolution's identifiers -- these are identical resolutions.

ronaldtse commented 1 year ago

We have these issues to track:

andrew2net commented 1 year ago

The new references look like these in tests: https://github.com/relaton/relaton-bipm/blob/b4797665722fe9eefe020c140b74ff7c90472ab4/spec/relaton_bipm_spec.rb#L12-L127

The docidentifiers are:

  • https://github.com/relaton/relaton-bipm/blob/b4797665722fe9eefe020c140b74ff7c90472ab4/spec/fixtures/cctf_recommendation_1970_02.xml#L10-L12
  • https://github.com/relaton/relaton-bipm/blob/b4797665722fe9eefe020c140b74ff7c90472ab4/spec/fixtures/cgpm_meeting_1.xml#L10-L12
  • https://github.com/relaton/relaton-bipm/blob/b4797665722fe9eefe020c140b74ff7c90472ab4/spec/fixtures/cgpm_resolution_1889_00.xml#L10-L12
  • https://github.com/relaton/relaton-bipm/blob/b4797665722fe9eefe020c140b74ff7c90472ab4/spec/fixtures/cipm_decision_2012_01.xml#L9-L11

opoudjis commented 1 year ago

This does not yet implement https://github.com/metanorma/metanorma-bipm/issues/216

That is because the ticket explicitly specifies that the language-neutral identifiers should be bilingual, not English. Moreover, you have not implemented the short identifiers at all.

And I am not convinced " -- " should be left as is at all. " -- " is an ASCII convention for em-dash, and I recommend it should be converted to em-dash. So CCTF—Recommendation 2 (1970).

The implementation, following what they have stated, should be:

<docidentifier type="BIPM" primary="true">CCTF—Recommandation 2 (1970)&#xa0;/ Recommendation 2 (1970)</docidentifier> 
<docidentifier type="BIPM" primary="true" language="en" script="Latn">CCTF—Recommendation 2 (1970)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="fr" script="Latn">CCTF—Recommandation 2 (1970)</docidentifier> 
<docidentifier type="BIPM-short">CCTF—REC 2 (1970)</docidentifier> 
<docidentifier type="BIPM-short" language="en" script="Latn">CCTF—REC 2 (1970, EN)</docidentifier> 
 <docidentifier type="BIPM-short" language="fr" script="Latn">CCTF—REC 2 (1970, FR)</docidentifier> 

 <docidentifier type="BIPM" primary="true">CGPM—Résolution (1889)&#xa0;/ Resolution (1889)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="en" script="Latn">CGPM—Resolution (1889)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="fr" script="Latn">CGPM—Résolution (1889)</docidentifier> 
 <docidentifier type="BIPM-short">CGPM—RES (1889)</docidentifier> 
 <docidentifier type="BIPM-short" language="en" script="Latn">CGPM—RES (1889, EN)</docidentifier> 
 <docidentifier type="BIPM-short" language="fr" script="Latn">CGPM—RES (1889, FR)</docidentifier> 

 <docidentifier type="BIPM" primary="true">Decision CIPM/101-1 (2012)&#xa0;/ Décision CIPM/101-1 (2012)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="en" script="Latn">Decision CIPM/101-1 (2012)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="fr" script="Latn">Décision CIPM/101-1 (2012)</docidentifier> 
 <docidentifier type="BIPM-short">CIPM DECN CIPM/101-1 (2012)</docidentifier> 
 <docidentifier type="BIPM-short" language="en" script="Latn">CIPM DECN CIPM/101-1 (2012, EN)</docidentifier> 
 <docidentifier type="BIPM-short" language="fr" script="Latn">CIPM DECN CIPM/101-1 (2012, FR)</docidentifier> 

These:

 <docidentifier type="BIPM" primary="true">CGPM -- Meeting 1 (1889)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="en" script="Latn">CGPM -- Meeting 1 (1889)</docidentifier> 
 <docidentifier type="BIPM" primary="true" language="fr" script="Latn">CGPM -- Réunion 1 (1889)</docidentifier> 

are illegal as far as https://github.com/metanorma/metanorma-bipm/issues/216 is concerned: there is NO provision for "meeting" as a reference type. So @ronaldtse is going to have to confer with BIPM on what is expected there.

Please ensure you have implemented ALL the patterns specified in https://github.com/metanorma/metanorma-bipm/issues/216
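
For illustration, the six identifier forms listed above for the CCTF recommendation could be generated mechanically from the outcome's components. The following Python sketch uses the standard library's `xml.etree.ElementTree`. It is not the gem's code; the function name `docidentifiers` and its parameters are hypothetical, and it covers only the basic pattern, not the CIPM/JCRB special cases.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"     # written as &#xa0; in the XML above
EM_DASH = "\u2014"  # replaces the ASCII " -- " convention


def docidentifiers(group, type_en, type_fr, abbrev, number, year):
    """Build the six <docidentifier> elements for one outcome."""
    def ident(type_name, suffix):
        return f"{group}{EM_DASH}{type_name} {number} ({suffix})"

    specs = [
        # (type attribute, language, text)
        ("BIPM", None, f"{ident(type_fr, year)}{NBSP}/ {type_en} {number} ({year})"),
        ("BIPM", "en", ident(type_en, year)),
        ("BIPM", "fr", ident(type_fr, year)),
        ("BIPM-short", None, ident(abbrev, year)),
        ("BIPM-short", "en", ident(abbrev, f"{year}, EN")),
        ("BIPM-short", "fr", ident(abbrev, f"{year}, FR")),
    ]
    elements = []
    for id_type, lang, text in specs:
        el = ET.Element("docidentifier", {"type": id_type})
        if id_type == "BIPM":
            el.set("primary", "true")
        if lang:
            el.set("language", lang)
            el.set("script", "Latn")
        el.text = text
        elements.append(el)
    return elements


for el in docidentifiers("CCTF", "Recommendation", "Recommandation", "REC", 2, 1970):
    print(ET.tostring(el, encoding="unicode"))
```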

ronaldtse commented 1 year ago

Wait wait guys. This issue is actually pending on BIPM feedback here:

there is NO provision for "meeting" as a reference type.

Not yet, but there probably should be.

andrew2net commented 1 year ago

@ronaldtse can we close this issue?

ronaldtse commented 1 year ago

@andrew2net can you review the issue based on the new information here?

https://github.com/metanorma/bipm-si-brochure/pull/210

andrew2net commented 1 year ago

@ronaldtse is issue #44 related to this? If so, can we close one of them?

andrew2net commented 1 year ago

@ronaldtse implemented references and ID parsing. The gem now uses ID components to search documents, which allows more flexible reference formats: https://github.com/relaton/relaton-bipm#basic-pattern Should we convert the doubled dashes " -- " in the BIPM IDs (CGPM -- Meeting 1 (1889)) to em-dashes (CGPM—Meeting 1 (1889))?
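
To show why searching by components tolerates different separators, here is a small Python sketch that splits a reference into its parts. It does not reproduce the gem's parser. The regular expression and the function `parse_bipm_id` are assumptions made for illustration, and the special-case identifiers that start with the type name (such as "Decision CIPM/111-10 (2022)") are deliberately left unhandled.

```python
import re
from typing import Optional

ID_RE = re.compile(
    r"""^(?P<group>[A-Z]{2,})            # committee, e.g. CGPM or CCTF
        (?:\s*(?:--|\u2014)\s*|\s+)      # double dash, em-dash, or plain space
        (?P<type>[^\W\d]+)               # type name or abbreviation
        (?:\s+(?P<number>[\w/-]+))?      # optional number
        \s*\((?P<year>\d{4})             # year in parentheses
        (?:,\s*(?P<lang>[A-Za-z]{2}))?   # optional language tag
        \)$""",
    re.VERBOSE,
)


def parse_bipm_id(ref: str) -> Optional[dict]:
    """Split a reference into components, or return None if it does not fit."""
    m = ID_RE.match(ref.strip())
    if m is None:
        return None
    parts = m.groupdict()
    parts["year"] = int(parts["year"])
    return parts


ascii_form = parse_bipm_id("CGPM -- Meeting 1 (1889)")
dash_form = parse_bipm_id("CGPM\u2014Meeting 1 (1889)")
print(ascii_form == dash_form)  # True
```

Because both spellings reduce to the same components, the choice of separator in the stored identifier does not affect lookup in a scheme like this.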

ronaldtse commented 1 year ago

The authoritative identifier format has been updated as per #50.

andrew2net commented 1 year ago

@ronaldtse can we close this issue?

ronaldtse commented 1 year ago

Yes, thank you @andrew2net!