relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

Differences between the biblio.rnc and IsoBibliographicItem #19

Closed: andrew2net closed this issue 6 years ago

andrew2net commented 6 years ago
  1. IsoBibliographicItem.documentidentifier: In IsoBibliographicItem it is IsoDocumentId; in biblio.rnc it is DocumentId.
  2. IsoBibliographicItem.title: In IsoBibliographicItem it is IsoLocalizedTitle; in biblio.rnc it is FormattedString.
  3. IsoBibliographicItem.type: biblio.rnc has "article" | "book" | "booklet" | "conference" | "manual" | "proceedings" | "presentation" | "thesis" | "techreport" | "standard" | "unpublished". IsoBibliographicItem has "internationalStandard" | "techinicalSpecification" | "technicalReport" | "publiclyAvailableSpecification" | "internationalWorkshopAgreement".
  4. IsoBibliographicItem.status: IsoBibliographicItem.status has the additional attributes stage, substage and iteration.
  5. IsoBibliographicItem.workgroup: There is a workgroup attribute in IsoBibliographicItem, but there isn’t one in biblio.rnc.
  6. IsoBibliographicItem.ics: There is no ICS in biblio.rnc.
  7. ContributorRole.description: There is currently no online resource that allows scraping it.
  8. ContributionInfo.entity: In IsoBibliographicItem, ContributionInfo.entity can be IsoProjectGroup, Organization or Person. In biblio.rnc it can only be Organization or Person.
  9. notes: Are notes the same as the abstract?
ronaldtse commented 6 years ago
  1. IsoBibliographicItem.documentidentifier: In IsoBibliographicItem it is IsoDocumentId; in biblio.rnc it is DocumentId.

Should be IsoDocumentId.

  2. IsoBibliographicItem.title: In IsoBibliographicItem it is IsoLocalizedTitle; in biblio.rnc it is FormattedString.

I've changed the model to have 1 IsoLocalizedTitle, it takes 3 parts, each a FormattedString.

  3. IsoBibliographicItem.type: biblio.rnc has "article" | "book" | "booklet" | "conference" | "manual" | "proceedings" | "presentation" | "thesis" | "techreport" | "standard" | "unpublished". IsoBibliographicItem has "internationalStandard" | "techinicalSpecification" | "technicalReport" | "publiclyAvailableSpecification" | "internationalWorkshopAgreement".

We should keep the IsoBibliographicItem ones.

  4. IsoBibliographicItem.status: IsoBibliographicItem.status has the additional attributes stage, substage and iteration.

Yes, follow IsoBibliographicItem.

  5. IsoBibliographicItem.workgroup: There is a workgroup attribute in IsoBibliographicItem, but there isn’t one in biblio.rnc.

Yes we should add it back to the grammar.

  6. IsoBibliographicItem.ics: There is no ICS in biblio.rnc.

It should be added to biblio.rnc.

  7. ContributorRole.description: There is currently no online resource that allows scraping it.

Correct, there is no online resource that you can scrape it from.

  8. ContributionInfo.entity: In IsoBibliographicItem, ContributionInfo.entity can be IsoProjectGroup, Organization or Person. In biblio.rnc it can only be Organization or Person.

biblio.rnc should be updated to reflect this.

  9. notes: Are notes the same as the abstract?

Can you clarify? Thanks!

andrew2net commented 6 years ago

@ronaldtse we have notes in BibliographicItem and I don't know where it could be scraped from. Some time ago in #15 you said that maybe the abstract is notes.

andrew2net commented 6 years ago

@ronaldtse I got XML from asciidoctor-iso with fetched ISO references, but the validator gives me a lot of messages. One of them is about source. In isobib we have zero or multiple sources, each with a type attribute. The asciidoctor-iso validator allows only one source, without a type attribute. Should I change the validation rules in asciidoctor-iso?

opoudjis commented 6 years ago

One of them is about source. In isobib we have zero or multiple sources, each with a type attribute. The asciidoctor-iso validator allows only one source, without a type attribute. Should I change the validation rules in asciidoctor-iso?

I've fixed this.
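
For illustration, the relaxed rule described in the quoted question (any number of source elements, each carrying a type attribute) could be checked roughly as follows. This is a minimal sketch and not the asciidoctor-iso validator: the function name `untyped_sources`, the `bibitem` wrapper element, and the type values in the usage note are assumptions.

```python
import xml.etree.ElementTree as ET


def untyped_sources(bibitem_xml: str) -> list:
    """Return the text of every <source> child that lacks a type attribute.

    Any number of <source> children is accepted, including none.
    The wrapper element name is not checked (hypothetical layout).
    """
    bibitem = ET.fromstring(bibitem_xml)
    return [
        (source.text or "").strip()
        for source in bibitem.findall("source")
        if not source.get("type")
    ]
```

An empty result means every source is typed; for example, a hypothetical item with `<source type="src">` and `<source type="obp">` children passes, while a bare `<source>` is reported.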

opoudjis commented 6 years ago

@ronaldtse So... a bunch of ISO elements have to be recoded in the production grammar. I'm going to go through the changes you appear to be proposing, for confirmation. Note that I'm going through the XML serialisations of the UML. I will go through one at a time.

Current:

docidentifier =
  element docidentifier {
    text | (documentnumber, tc-documentnumber?)
  }

documentnumber =
  element project-number {
    attribute part { xsd:int }?,
    attribute subpart { xsd:int }?,
    xsd:int
  }

tc-documentnumber =
  element tc-document-number {
    xsd:int
  }

New structure:

docidentifier =
  element docidentifier {
    text | (documentnumber, tc-documentnumber?, part-number?)
  }

Please confirm you want part-number to move out of documentnumber, and that it applies to tc-documentnumber as well. You'll then need to subtype IsoDocumentId to IecDocumentId, to allow for subparts as well.
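
To make the proposed shape easier to discuss, here is a minimal Python sketch of an identifier that carries the part (and an optional subpart) alongside the project number rather than as attributes of it. The class name `DocIdentifier`, its field names, and the hyphen-joined rendering are all hypothetical; none of them come from isobib or the grammar.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocIdentifier:
    """Hypothetical mirror of the proposed docidentifier element."""

    project_number: int
    tc_document_number: Optional[int] = None
    part: Optional[int] = None
    subpart: Optional[int] = None  # assumed extension for subparts

    def __post_init__(self):
        # Assumption: a subpart is only meaningful when a part is given.
        if self.subpart is not None and self.part is None:
            raise ValueError("subpart requires part")

    def project_string(self) -> str:
        """Join project number, part and subpart with hyphens."""
        pieces = [self.project_number, self.part, self.subpart]
        return "-".join(str(p) for p in pieces if p is not None)
```

Keeping part and subpart as sibling fields means the same class could serve both the ISO and the IEC case without subtyping, which is one possible answer to the question above.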

opoudjis commented 6 years ago

IsoBibliographicItem.title In IsoBibliographicItem it is IsoLocalizedTitle, in biblio.rnc it is FormattedString.

I've changed the model to have 1 IsoLocalizedTitle, it takes 3 parts, each a FormattedString.

isostandard.rnc already does this, by allowing either kind of title to appear, overriding the biblio.rnc definition of btitle:

btitle =
  element title {
    FormattedString |
    ( title-intro?, title-main, title-part? )
  }

title-intro =
  element title-intro { FormattedString }

title-main =
  element title-main { FormattedString }

title-part =
  element title-part { FormattedString }
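
A consumer of this grammar has to accept both forms of title. The following is a minimal sketch of how that might look in Python; the function name `read_title` and the returned dictionary layout are assumptions, and FormattedString is simplified to plain text (attributes and inline markup are ignored).

```python
import xml.etree.ElementTree as ET


def read_title(xml_text: str) -> dict:
    """Return the intro/main/part texts found in a <title> element.

    A plain-text title is returned under "main" only.
    """
    title = ET.fromstring(xml_text)
    if title.find("title-main") is None:
        # Generic form: the element holds the text directly.
        return {"intro": None, "main": (title.text or "").strip(), "part": None}
    # ISO form: optional intro, required main, optional part.
    parts = {}
    for key in ("intro", "main", "part"):
        node = title.find(f"title-{key}")
        parts[key] = node.text.strip() if node is not None and node.text else None
    return parts
```

Checking for `title-main` first works because it is the only required child in the three-part form.
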
opoudjis commented 6 years ago

IsoBibliographicItem.type: biblio.rnc has "article" | "book" | "booklet" | "conference" | "manual" | "proceedings" | "presentation" | "thesis" | "techreport" | "standard" | "unpublished". IsoBibliographicItem has "internationalStandard" | "techinicalSpecification" | "technicalReport" | "publiclyAvailableSpecification" | "internationalWorkshopAgreement".

We should keep the IsoBibliographicItem ones.

isostandard.rnc already does this, by adding the ISO-specific types to the generic types. (This is consistent behaviour in the grammar: it has to parse both ISO and non-ISO references.)

BibItemType |=
  "international-standard" | "technical-specification" |
  "technical-report" | "publicly-available-specification" |
  "international-workshop-agreement" | "guide"
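
The effect of `|=` here is a union of the generic and ISO-specific values. As a rough Python equivalent (a sketch only; the constant and function names are hypothetical, and the generic list is copied from the biblio.rnc values quoted above):

```python
GENERIC_TYPES = frozenset({
    "article", "book", "booklet", "conference", "manual", "proceedings",
    "presentation", "thesis", "techreport", "standard", "unpublished",
})

ISO_TYPES = frozenset({
    "international-standard", "technical-specification", "technical-report",
    "publicly-available-specification", "international-workshop-agreement",
    "guide",
})

# Counterpart of `BibItemType |= ...`: ISO values extend the generic ones.
BIB_ITEM_TYPES = GENERIC_TYPES | ISO_TYPES


def is_valid_type(value: str) -> bool:
    """Accept any generic or ISO-specific bibliographic item type."""
    return value in BIB_ITEM_TYPES
```

Note that the serialised values are hyphenated, so a camel-case name such as `internationalStandard` from the UML would not pass this check without conversion.
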
opoudjis commented 6 years ago

I've run out of time for now, but the general guidance is: have a look at isostandard.rnc in the Isodoc-models repository, and at how it subclasses the original biblio.rnc. The answers to your questions are in the ISO-specific serialisation. Sorry for not pointing that out earlier.

ronaldtse commented 6 years ago

@opoudjis since IEC and ISO share the Directives, we prefer to have a single model apply to both. We should have document-identifier support a subpart as well.

The tc-document-number is just a serial number of a document within the TC and has no meaning to the project. For example, any distributed document within the TC will have a tc-document-number.

Is DocumentIdentifier itself the full identifier of the ISO document which contains the "ISO" or "IEC" or "ISO/IEC" prefix? For example, "ISO/IEC 14888-3"? The project number here should be "14888-3".

opoudjis commented 6 years ago

DocumentIdentifier currently does not contain the "ISO/IEC"; that is inferred by looking up the publishers of the standard.
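
As a minimal sketch of that inference, assuming the publishers are available as an ordered list of abbreviations (the function names and the input shape are hypothetical, not the gem's API):

```python
def infer_prefix(publishers: list) -> str:
    """Join publisher abbreviations in order, e.g. ["ISO", "IEC"] -> "ISO/IEC"."""
    return "/".join(publishers)


def full_identifier(publishers: list, project_number: str) -> str:
    """Prepend the inferred prefix to the project number, if there is one."""
    prefix = infer_prefix(publishers)
    return f"{prefix} {project_number}" if prefix else project_number
```

With publishers `["ISO", "IEC"]` and project number `"14888-3"`, this yields `"ISO/IEC 14888-3"`. The sketch relies on the publisher list being in citation order, which is an assumption.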

ronaldtse commented 6 years ago

Indeed, the model doesn't include it right now. However, I was wondering whether we should, because for normal people who use the document, it can serve as a reasonable value for citeAs.

For example, without intimate knowledge of the model, it would be difficult for someone to infer that a document is dual published by ISO and IEC and therefore should have the 'ISO/IEC' prefix.

opoudjis commented 6 years ago

My takeaway from this is that the grammar does not actually need to change at all.

The citeas attribute already includes the prefix: we tell people to cite documents as <<ref2131,ISO/IEC 123>>, where what follows the comma is how the document will be cited. So a reference "ISO/IEC 123" is not constructed for bibliographic citations; it is only built in bibdata, for the current document. (The bibliographic references are still supposed to include ISO and other publishers, for the gem to verify that they are ISO-like references if they are normative.)

... So I'm not sure anything needs to change. Yes, this means that we are depending on people to get the prefix right when they cite documents, with an obligatory citeAs (<<ref2131,ISO/IEC 123>>, never just <<ref2131>>); but I'm not actually convinced that's a bad thing. If you think it is a bad thing, then we change the docidentifier value.

As it turns out, the default value for citeAs if none is provided is docidentifier anyway; so if we did change the docidentifier value, we could let people drop the obligatory citeAs.
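
The fallback described here can be sketched as follows. This is an illustration only, not how asciidoctor-iso parses cross-references: the regular expression, the function name `resolve_citation`, and the `docidentifiers` lookup table are assumptions.

```python
import re

# <<anchor>> or <<anchor,citation text>>
XREF = re.compile(r"<<(?P<anchor>[^,>]+)(?:,(?P<cite_as>[^>]+))?>>")


def resolve_citation(xref: str, docidentifiers: dict) -> str:
    """Return the explicit citeAs text, or the docidentifier as a default."""
    match = XREF.fullmatch(xref.strip())
    if match is None:
        raise ValueError(f"not a cross-reference: {xref!r}")
    cite_as = match.group("cite_as")
    if cite_as and cite_as.strip():
        return cite_as.strip()
    # No citeAs given: fall back to the stored docidentifier value.
    return docidentifiers[match.group("anchor").strip()]
```

If the stored docidentifier is the bare `"123"`, then `<<ref2131>>` resolves to `"123"` and the explicit form is needed to get the prefix; if the stored value were `"ISO/IEC 123"`, the short form would be enough.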

ronaldtse commented 6 years ago

I think we should allow the docidentifier to be the default citeAs and let people cite an auto-fetched document without a citeAs (when the doc reference is being fetched via IsoBib, just like how it works in AsciiRFC).

opoudjis commented 6 years ago

https://github.com/riboseinc/asciidoctor-iso/issues/142

ronaldtse commented 6 years ago

Thanks @opoudjis ! @andrew2net could you take the appropriate action here? Thanks!

andrew2net commented 6 years ago

@ronaldtse no problem, I will add a prefix to documentidentifier.

ronaldtse commented 6 years ago

Thanks @andrew2net !

opoudjis commented 6 years ago

Has this ticket been resolved, @andrew2net? If not, do you need any further action from me?

andrew2net commented 6 years ago

@opoudjis I have no further questions related to the ticket. I suppose we could close it.