relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

Incorrect document identifier assigning to bibliographic item #112

Open mico opened 2 years ago

mico commented 2 years ago

When I request the "ISO 19115-1" identifier, the first match is "ISO 19115-1:2014 ED1(en,fr)", but the document identifier returned is "ISO 19115-1":

3.1.1 :002 > bibitem = RelatonIso::IsoBibliography.get("ISO 19115-1")
[relaton-iso] ("ISO 19115-1") fetching...
[relaton-iso] ("ISO 19115-1") found ISO 19115-1:2014
 =>
#<RelatonIsoBib::IsoBibliographicItem:0x0000000113401e88
...
3.1.1 :003 > bibitem.docidentifier
 =>
[#<RelatonBib::DocumentIdentifier:0x00000001133fb498
  @id="ISO 19115-1",
  @primary=true,
  @scope=nil,
  @type="ISO">,
 #<RelatonBib::DocumentIdentifier:0x00000001133fb3a8
  @id="urn:iso:std:iso:19115:-1:stage-90.93:ed-1:en,fr",
  @primary=false,
  @scope=nil,
  @type="URN">]

But the URN identifier is correct: it includes the edition and languages. I believe the output should be something like:

3.1.1 :003 > bibitem = RelatonIso::IsoBibliography.get("ISO 19115-1")
[relaton-iso] ("ISO 19115-1") fetching...
[relaton-iso] ("ISO 19115-1") found ISO 19115-1:2014 ED1(en,fr)
 =>
#<RelatonIsoBib::IsoBibliographicItem:0x00000001122c1c40
...
3.1.1 :005 > bibitem.docidentifier
 =>
[#<RelatonBib::DocumentIdentifier:0x00000001122b8aa0
  @id="ISO 19115-1 ED1(en,fr)",
  @primary=true,
  @scope=nil,
  @type="ISO">,
 #<RelatonBib::DocumentIdentifier:0x00000001122b88e8
  @id="urn:iso:std:iso:19115:-1:stage-90.93:ed-1:en,fr",
  @primary=false,
  @scope=nil,
  @type="URN">]

Am I right? Should we drop the year for PubID and URN identifiers?

ronaldtse commented 2 years ago

> When I request the "ISO 19115-1" identifier, the first match is "ISO 19115-1:2014 ED1(en,fr)", but the document identifier returned is "ISO 19115-1":

This is the correct behavior.

Notice that in ISO identifiers there are two types:

  1. Dated reference. If you enter "ISO 19115-1:2014", you will get the bibliographic item for "ISO 19115-1:2014" (the bibliographic item of this document).
  2. Undated reference. This means you will get a bibliographic item "ISO 19115-1" that has a relationship to the latest "ISO 19115-1" bibliographic item.
    • i.e. you will get an "Undated reference" that currently points to "ISO 19115-1:2014".
    • If tomorrow "ISO 19115-1:2022" is published, this "Undated reference" will point to "ISO 19115-1:2022".
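
The dated/undated distinction above can be sketched in Ruby. This is a minimal illustration only: `classify_reference` and `DATED_REFERENCE` are hypothetical names, not part of relaton-iso or pubid-iso, and the sketch assumes a dated reference always ends in a colon followed by a four-digit year.

```ruby
# Hypothetical helper, not part of relaton-iso: classify an ISO reference
# string as dated or undated by looking for a trailing ":YYYY" year.
DATED_REFERENCE = /\A(?<base>.+?):(?<year>\d{4})\z/

def classify_reference(ref)
  text = ref.strip
  match = DATED_REFERENCE.match(text)
  if match
    # Dated: the reference names one specific publication.
    { kind: :dated, base: match[:base], year: match[:year].to_i }
  else
    # Undated: the reference stands for whatever the latest publication is.
    { kind: :undated, base: text, year: nil }
  end
end

p classify_reference("ISO 19115-1:2014")
p classify_reference("ISO 19115-1")
```

Under this reading, only the dated form would carry edition-specific details; the undated form keeps the bare identifier and relates to the latest dated item.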

> I believe it should be something like:

That would be incorrect. A bibitem for "ISO 19115-1" is different from a bibitem for "ISO 19115-1:2014".

Instead, in the original output, the URN is incorrect:

@id="urn:iso:std:iso:19115:-1:stage-90.93:ed-1:en,fr"

It should have been:

@id="urn:iso:std:iso:19115:-1:stage-60.60" (default stage is 60.60, no edition, no language)

So this is a bug.

ronaldtse commented 2 years ago

@andrew2net just to clarify, the URN returned for the undated reference is incorrect.

ronaldtse commented 2 years ago

@mico can we delegate this undated/dated difference in identifiers and the URN generation to pubid-iso?

mico commented 2 years ago

@ronaldtse the tests for the RelatonIso::IsoBibliography.get("ISO 19115", nil, lang: "en") request expect:

  <docidentifier type="ISO" primary="true">ISO 19115</docidentifier>
  <docidentifier type="URN">urn:iso:std:iso:19115:stage-95.99:ed-1:en</docidentifier>

That's not correct, right?

Should it be

  <docidentifier type="ISO" primary="true">ISO 19115(en)</docidentifier>
  <docidentifier type="URN">urn:iso:std:iso:19115:stage-60.60:en</docidentifier>

?

So do we include all requested parameters in the docidentifier?

andrew2net commented 2 years ago

@mico relaton-iso doesn't work correctly with the lang attribute yet. The Relaton model allows one bibliographic item to contain many languages. When relaton-iso fetches a document, it has to get the English version first; from the English page, it can get links to the other available languages. So if we don't need the English version, we have to drop it even though we already have the data. Maybe it'd be better to get all available languages, put them into the Relaton model (and cache), and render the output in the needed language. @ronaldtse what do you think?

ronaldtse commented 2 years ago

Undated language-unspecified reference

This:

RelatonIso::IsoBibliography.get("ISO 19115")

Should return:

  <docidentifier type="ISO" primary="true">ISO 19115</docidentifier>

As @andrew2net pointed out, the reference to an ISO document is not language specific.

But this:

RelatonIso::IsoBibliography.get("ISO 19115", nil, lang: "en")

Should return:

  <docidentifier type="ISO" primary="true">ISO 19115(en)</docidentifier>

or

  <docidentifier type="ISO" primary="true">ISO 19115(E)</docidentifier>
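
The two expected forms above can be sketched as a small Ruby helper. Everything here is an assumption for illustration: `identifier_with_language`, its `style:` option, and the `ISO_LANGUAGE_LETTERS` table are hypothetical and are not the relaton-iso or pubid-iso API.

```ruby
# Hypothetical mapping from language codes to ISO's single-letter
# language designators (only a few entries, for illustration).
ISO_LANGUAGE_LETTERS = { "en" => "E", "fr" => "F", "ru" => "R" }.freeze

# Hypothetical helper: append a language suffix only when a language
# was explicitly requested; otherwise return the identifier unchanged.
def identifier_with_language(base, lang: nil, style: :code)
  return base if lang.nil?

  # :letter renders "(E)"; any other style renders the code, e.g. "(en)".
  suffix = style == :letter ? ISO_LANGUAGE_LETTERS.fetch(lang) : lang
  "#{base}(#{suffix})"
end

puts identifier_with_language("ISO 19115")
puts identifier_with_language("ISO 19115", lang: "en")
puts identifier_with_language("ISO 19115", lang: "en", style: :letter)
```

The point of the sketch is that the language-unspecified request leaves the identifier untouched, while a `lang:` argument changes it.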

Stage code

Regarding:

  <docidentifier type="URN">urn:iso:std:iso:19115:stage-95.99:ed-1:en</docidentifier>

The stage code is optional as per RFC 5141:

   docidentifier = originator [":" type] ":" docnumber [":" partnumber]
                   [[":" status] ":" edition]
                   [":" docversion] [":" language]

If the status is not provided, there is no need to provide the stage code in a URN.
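
As a rough illustration of the grammar quoted above, here is a minimal Ruby sketch that assembles such a URN from optional components. `build_iso_urn` is a hypothetical helper, not how relaton-iso or pubid-iso generates URNs; the `type` and `docversion` components are left out for brevity. It assumes the "stage-", "ed-", and "-part" spellings seen in the URNs earlier in this thread, and it follows the grammar literally in emitting a status only together with an edition.

```ruby
# Hypothetical helper, not the relaton-iso or pubid-iso API: build the
# URN from the components in the quoted grammar (type and docversion omitted).
def build_iso_urn(number:, originator: "iso", part: nil, stage: nil,
                  edition: nil, languages: [])
  segments = ["urn", "iso", "std", originator, number.to_s]
  segments << "-#{part}" if part
  # In the grammar, status is nested inside the optional edition group,
  # so this sketch only emits a stage when an edition is also given.
  if edition
    segments << "stage-#{stage}" if stage
    segments << "ed-#{edition}"
  end
  # Multiple languages are joined with commas, as in "en,fr".
  segments << languages.join(",") unless languages.empty?
  segments.join(":")
end

puts build_iso_urn(number: 19115, part: 1, stage: "90.93", edition: 1,
                   languages: %w[en fr])
puts build_iso_urn(number: 19115, part: 1)
```

With no stage, edition, or language supplied, the sketch yields the shortest form, which matches the point that these components are optional.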

ronaldtse commented 9 months ago

@andrew2net @mico what's the holdup on this issue? Thanks.

andrew2net commented 9 months ago

@ronaldtse this is the current behavior, so it is correct for language-unspecified references.

This:

RelatonIso::IsoBibliography.get("ISO 19115")

Should return:

  <docidentifier type="ISO" primary="true">ISO 19115</docidentifier>

Fetching a language-specific ISO document is not implemented yet. I tried to do it some time ago, but without success. ISO uses an Online Browsing Platform, and the platform makes XHR requests that I haven't managed to reproduce yet.

But this:

RelatonIso::IsoBibliography.get("ISO 19115", nil, lang: "en")

Should return:

  <docidentifier type="ISO" primary="true">ISO 19115(en)</docidentifier>

or

  <docidentifier type="ISO" primary="true">ISO 19115(E)</docidentifier>