ietf-tools / relaton-data-ieee


Weird document IDs with slashes #4

Open strogonoff opened 2 years ago

strogonoff commented 2 years ago

Example: ISO/IEC/IEEE 8802-11/Amd5.2018

ronaldtse commented 2 years ago

This is correct:

  1. The first two slashes are there because this is a "tri-published" document by ISO, IEC and IEEE.
  2. The third slash is there because this is an Amendment, and ISO/IEC amendments are indicated by the pattern "/Amd".

ronaldtse commented 2 years ago

The perils of a multi-prefixed dataset!

strogonoff commented 2 years ago

Shouldn’t “amendment” be a document type (relation?) and ISO/IEC/IEEE be publishing organizations in the rest of the metadata? 🤔 That could also eliminate the issue of the order in which they appear.

ronaldtse commented 2 years ago

The "/Amd" is part of the document identifier of the Amendment document itself.

ronaldtse commented 2 years ago

In addition, the order of ISO/IEC/IEEE can differ per document as it is in order of the amount of contribution.
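
To make the structure described above concrete, here is a minimal Python sketch that splits an identifier of this shape into its publisher prefixes (in the order given), the document number, the optional "/Amd" part and the year. The `ParsedId` class, the `parse_docid` function and the regular expression are hypothetical and cover only the shape of the example in this thread; they are not Relaton's actual parsing logic.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedId:
    publishers: List[str]     # kept in the order the identifier lists them
    number: str               # e.g. "8802-11"
    amendment: Optional[str]  # e.g. "Amd5", or None if there is no "/Amd" part
    year: Optional[str]       # e.g. "2018", or None if absent


# Hypothetical pattern: slash-separated publisher prefixes, a space,
# a document number, an optional "/Amd<N>" part and an optional year.
_PATTERN = re.compile(
    r"^(?P<publishers>[A-Z]+(?:/[A-Z]+)*)\s+"
    r"(?P<number>[0-9][0-9.-]*?)"
    r"(?:/(?P<amendment>Amd\s?\d+))?"
    r"(?:[.:](?P<year>\d{4}))?$"
)


def parse_docid(docid: str) -> ParsedId:
    """Split an identifier into its parts, or raise ValueError."""
    match = _PATTERN.match(docid.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier shape: {docid!r}")
    return ParsedId(
        publishers=match.group("publishers").split("/"),
        number=match.group("number"),
        amendment=match.group("amendment"),
        year=match.group("year"),
    )


parsed = parse_docid("ISO/IEC/IEEE 8802-11/Amd5.2018")
print(parsed)
```

Because the publishers are kept as an ordered list rather than sorted, the sketch preserves whatever order the source identifier uses.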

strogonoff commented 2 years ago

The "/Amd" is part of the document identifier of the Amendment document itself.

Now my questions are:

ronaldtse commented 2 years ago

This is a Pandora's box, because no one in the system has thought this through carefully, or has the power to mandate a systematic approach across these organizations.

  • Which source determines that canonical identifier?

Whoever publishes the document.

  • It looks like the document is equally (or even less) of “IEEE” type as it is of “ISO” and “IEC”. Why does our docid array only contain one ID with type "IEEE", instead of three (one for each of ISO, IEC and IEEE)?

Short answer. As per the answer above, this is the IEEE's copy, so the IEEE decides what to call it.

Long answer. This goes deeper in the content ownership model:

  1. Content: the contents of the standard. This affects the text.
  2. Document: the document the content is published as. This affects the wrapping text (e.g. introduction) and layout.
  3. Publisher: who publishes/prints/sells the document. This affects the layout.

The assignments of items 1-3 can be "mixed". For instance, for documents jointly published by ISO + ITU, there are 2 formalised types:

For ISO + IEEE, there is no official agreement as to what these things are.

TL;DR. The best we can do is to trust the source.

strogonoff commented 2 years ago

Understood. Complex issue.

this is the IEEE's copy, so the IEEE decides what to call it.

Can such a document have multiple copies? If so, shouldn’t they conceptually be merged into a single citation with multiple docids?

ronaldtse commented 2 years ago

Can such a document have multiple copies? If so, shouldn’t they conceptually be merged into a single citation with multiple docids?

Ideally at Relaton we wish to do that.

Do understand that IEEE will publish this document with an IEEE cover page, and ISO will do so with an ISO cover page, etc.

In citation, ISO 690 allows for citing different "abstraction levels" (even though it's not called that). You can cite:

So it's really up to what the user wants to cite.

In principle, yes I think there should be 1 object that links the 3 documents together, and that there should also be an individual object for each document. The object could be represented as a page on Relaton, for example.
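
As a rough illustration of that idea (one object linking the copies, plus an individual object per copy), here is a minimal Python sketch. The `DocId`, `Edition` and `JointWork` classes are hypothetical names chosen for this example and do not reflect Relaton's actual data model. Only the IEEE identifier from this thread is used; the ISO and IEC copies are not filled in because their identifiers are not given here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DocId:
    type: str  # the organization whose copy this identifier names, e.g. "IEEE"
    id: str    # the identifier exactly as that organization publishes it


@dataclass
class Edition:
    """One publisher's copy of the document (hypothetical model)."""
    publisher: str
    docid: DocId


@dataclass
class JointWork:
    """Umbrella object linking every publisher's copy (hypothetical model)."""
    editions: List[Edition] = field(default_factory=list)

    def add(self, edition: Edition) -> None:
        self.editions.append(edition)

    def docids(self) -> List[DocId]:
        # A merged citation would carry one docid per linked copy.
        return [edition.docid for edition in self.editions]

    def edition_for(self, publisher: str) -> Edition:
        # Citing a specific copy means picking that publisher's edition.
        for edition in self.editions:
            if edition.publisher == publisher:
                return edition
        raise KeyError(publisher)


work = JointWork()
work.add(Edition("IEEE", DocId("IEEE", "ISO/IEC/IEEE 8802-11/Amd5.2018")))
print(work.docids())
```

A user citing the joint work as a whole would use `docids()`, while a user citing one publisher's copy would use `edition_for(...)`.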

We should not resolve this in the scope of the BibXML service because it runs a lot, lot deeper.

strogonoff commented 2 years ago

Looks like it's Relaton’s design decision whether to treat all of those abstraction levels of a document as separate citations with their own metadata, or to capture the higher-level umbrella entry (I don’t know whether ISO 690 has anything like that) from which concrete citations could be derived…