relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

Serialization #15

Closed andrew2net closed 6 years ago

andrew2net commented 6 years ago

@ronaldtse do you have a defined file structure for storing serialized IsoBibliographicItems? Which file format should we use?

opoudjis commented 6 years ago

Yes, we have already serialised it as XML, as part of the ISOXML document model which incidentally exposes bibliographic items, as references in standards documents.

https://github.com/riboseinc/bib-models/tree/master/grammars

It's expressed in RelaxNG Compact; the attached script compiles it to RelaxNG straight, but RelaxNG Compact is far more legible.

opoudjis commented 6 years ago

And since downstream tools already use this serialisation (the whole suite of Asciidoctor-X standards gems), I see no reason to come up with a different one.

ronaldtse commented 6 years ago

Exactly as said by @opoudjis , the output should be in the ISOXML bibliography format. Thanks!

andrew2net commented 6 years ago

@opoudjis should a RelaxNG Compact grammar define a start pattern? The documentation says it must, but I can't find one in the grammar.

andrew2net commented 6 years ago

@ronaldtse @opoudjis It looks like biblio.rnc describes the BibliographicItem serialisation, but IsoBibliographicItem differs from it. Is there an rnc file for IsoBibliographicItem?

opoudjis commented 6 years ago

They are embedded in the Iso XML spec, at https://github.com/riboseinc/isodoc-models/blob/master/grammars/isostandard.rnc. You will need to extract them out; I had no need to create a separate ISO-Biblio.

ronaldtse commented 6 years ago

Let's keep in mind that the goal is to have asciidoctor-iso use this gem (isobib) to create its metadata and also its citations. The steps that seem necessary are:

  1. Describe differences between the current biblio.rnc and IsoBibliographicItem, and see whether we can merge them or branch them.
  2. Make asciidoctor-iso use isobib gem to construct its metadata structure and output IsoBibliographicItem in its ISOXML output.
  3. Make asciidoctor-iso auto-fetch ISO references using the isobib scraper feature.

@andrew2net does this sound reasonable, and could you help with them? Thanks!

andrew2net commented 6 years ago

@ronaldtse where can we scrape ContributorRole.description?

ronaldtse commented 6 years ago

There is currently no online resource that allows scraping of ContributorRole.description. For ISO documents, we only need to know which TC created the document, which is provided on the standard's page.

andrew2net commented 6 years ago

@ronaldtse is there a way to scrape notes and partof?

andrew2net commented 6 years ago

@opoudjis In biblio.rnc status is a FormattedString, but in the Bibliography UML status is a DocumentStatus, which inherits from LocalizedString, the superclass of FormattedString. Is that OK?

opoudjis commented 6 years ago

The UML takes priority. Fixed the RNC.
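
To make the mismatch concrete, here is a minimal Python sketch of the type hierarchy as described in this thread. The class names come from the UML as quoted above; the field names (`content`, `language`, `script`, `format`) are illustrative assumptions, not taken from the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalizedString:
    """Base type: text plus optional language/script tags (field names assumed)."""
    content: str
    language: Optional[str] = None
    script: Optional[str] = None


@dataclass
class FormattedString(LocalizedString):
    """Subclass of LocalizedString; the 'format' field is an assumption."""
    format: str = "text/plain"


@dataclass
class DocumentStatus(LocalizedString):
    """Per the UML, status inherits from LocalizedString, not FormattedString."""


status = DocumentStatus("Published", language="en")
print(isinstance(status, LocalizedString))  # True
print(isinstance(status, FormattedString))  # False
```

A DocumentStatus is a sibling of FormattedString rather than a kind of it, so typing status as FormattedString in the RNC did not match the UML.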

ronaldtse commented 6 years ago

@andrew2net could you help clarify what notes and partof are? Thanks.

andrew2net commented 6 years ago

@ronaldtse biblio.rnc has biblionote, which is likely BibliographicItem.notes in the Bibliography UML Models. partof is in biblio.rnc but not in the Bibliography UML Models. I suppose it is a backlink for BibliographicItem.relations. Is that right?

BibData =
    attribute type { BibItemType }?,
    btitle+, formattedref?, source*, docidentifier*, bdate*, contributor*, edition?,
    biblionote*, partof*, language*, script*, abstract?, status?, copyright, docrelation*

opoudjis commented 6 years ago

... I suspect partOf got superseded by relations, which is a late addition to the model; @ronaldtse , could you confirm?

ronaldtse commented 6 years ago

Confirmed -- partOf is a doc relation now.

andrew2net commented 6 years ago

@ronaldtse OK, partOf is no longer needed. How about notes? Where can we scrape them?

ronaldtse commented 6 years ago

Is notes just abstract?

andrew2net commented 6 years ago

@opoudjis ContributionAssociation.owner is a Contributor in the Bibliography UML Models but a ContributorInfo in biblio.rnc. Does the UML always take priority in such cases?

andrew2net commented 6 years ago

@ronaldtse Are you asking me whether notes are the abstract? Hmm... I don't know. It looks like they are, but I'm not sure.

opoudjis commented 6 years ago

Notes also include the details of status of an unpublished standard; in fact that's what they're currently used for in isodoc (the footnote for the "—" date).

Removed partOf from RNC.

opoudjis commented 6 years ago

I chose to rename Contributor from UML as ContributorInfo in the RNC in order to differentiate the element from the type (the latter is recycled elsewhere; and as a type, it is not visible as a name in the serialisation):

contributor =
  element contributor {
    role*,
    ContributorInfo
}

ContributorInfo =
  ( person | organization )

For the serialisation, ContributorInfo will never actually turn up in the content, so the difference doesn't matter.
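
As a rough illustration of that point, the following Python sketch builds a contributor element with the standard library. The element names `contributor`, `role`, `person`, and `organization` come from the RNC above; the `type` attribute on `role` and the `name` child are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET


def build_contributor(role: str, kind: str, name: str) -> ET.Element:
    """Build a <contributor> element; kind must be 'person' or 'organization'."""
    if kind not in ("person", "organization"):
        raise ValueError(f"unsupported contributor kind: {kind}")
    contributor = ET.Element("contributor")
    # Assumption: the role is serialised as a 'type' attribute.
    ET.SubElement(contributor, "role", {"type": role})
    # ContributorInfo is a choice of person | organization, so the chosen
    # element is emitted directly under <contributor>.
    entity = ET.SubElement(contributor, kind)
    # Assumption: the entity carries a <name> child.
    ET.SubElement(entity, "name").text = name
    return contributor


serialized = ET.tostring(
    build_contributor("publisher", "organization", "ISO"), encoding="unicode"
)
print(serialized)
```

The output contains an `organization` element but no element called ContributorInfo, since the pattern name exists only inside the grammar.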

andrew2net commented 6 years ago

@ronaldtse since ContributionAssociation.owner is a ContributorInfo whose entity is a person or an organization, where can we scrape its identifier and contact from?

andrew2net commented 6 years ago

@ronaldtse on www.iso.org, in the document relations section (Revisions / Corrigenda), relations have types like "Previously", "Now", "Corrigenda/Amendments" and so on. In biblio.rnc the allowed types are "parent", "child", "obsoletes", "updatedBy", "complements", "derivedFrom", "adoptedFrom", "equivalent", "identical" and "nonequivalent". How do we correlate them?

ronaldtse commented 6 years ago

The ContributorInfo will be an organization, ISO itself. ISO's contact details are:

ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Switzerland
Tel.  + 41 22 749 01 11
Fax  + 41 22 749 09 47
E-mail  copyright@iso.org
Web  www.iso.org

andrew2net commented 6 years ago

@ronaldtse according to biblio.rnc, a relation contains a bibitem. In my opinion it should instead contain a text identifier (like ISO 19115:2003/Cor 1:2006). A related bibitem could have relations of its own, so the XML could grow enormous.
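
The compact alternative could look roughly like the Python sketch below: a relation that carries only the related document's identifier instead of a fully expanded bibitem. The element names `docrelation`, `bibitem`, and `docidentifier` follow biblio.rnc as quoted earlier in this thread; the exact nesting and the `type` attribute are assumptions, and `relation_stub` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def relation_stub(rel_type: str, doc_id: str) -> ET.Element:
    """Build a relation whose bibitem holds only a docidentifier (assumed layout)."""
    relation = ET.Element("docrelation", {"type": rel_type})
    bibitem = ET.SubElement(relation, "bibitem")
    ET.SubElement(bibitem, "docidentifier").text = doc_id
    # No nested relations are emitted, so the output cannot grow recursively.
    return relation


stub = relation_stub("updatedBy", "ISO 19115:2003/Cor 1:2006")
stub_xml = ET.tostring(stub, encoding="unicode")
print(stub_xml)
```

A consumer that needs the full record could look it up by the identifier when required.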

ronaldtse commented 6 years ago

Yes, the UML diagram specifies that relation links to a BibItem, but it doesn’t specify the implementation.

I imagine that the link to the BibItem could be also

andrew2net commented 6 years ago

@ronaldtse should relation/locality/referenceFrom contain something like ISO 19115:2003/Cor 1:2006? And what should type in relation/locality[type=?] contain?

ronaldtse commented 6 years ago

Great question @andrew2net !

relation/locality is meant to allow a document to refer to parts of another document.

A corrigendum ("Cor") or amendment ("Amd") is a separate document. While it may correct only parts of a standard, we can view it as updating the entire standard. So this means that:

@opoudjis since we only have "type:updatedBy", should we add "type:updates" (or we need some way to deal with reverse associations)?

opoudjis commented 6 years ago

We should have type:updates, and we will need it for RFC anyway...

... But we already have type:obsoletes; that means the same thing as updates, doesn't it?

ronaldtse commented 6 years ago

Agree. But an amendment updates a document; it does not obsolete it. Unless we allow a reversible relation (subject, action, object).

opoudjis commented 6 years ago

Let's not. I'll add updates.

opoudjis commented 6 years ago

Done
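
With updates added, the correlation asked about earlier in the thread could be sketched as a lookup table in Python. Only the corrigenda/amendments row reflects what was discussed here; the "Previously" row is an assumption, "Now" is left unmapped because the thread did not settle it, and `relation_type` is a hypothetical helper name.

```python
from typing import Optional

# Relation types listed for biblio.rnc in this thread, plus the newly added "updates".
ALLOWED_TYPES = {
    "parent", "child", "obsoletes", "updates", "updatedBy", "complements",
    "derivedFrom", "adoptedFrom", "equivalent", "identical", "nonequivalent",
}

# Hypothetical mapping from iso.org section labels to relation types.
LABEL_TO_TYPE = {
    "Corrigenda/Amendments": "updatedBy",  # a Cor/Amd updates the whole standard
    "Previously": "obsoletes",             # assumption, not confirmed in the thread
}


def relation_type(label: str) -> Optional[str]:
    """Return the relation type for an iso.org label, or None if unmapped."""
    rel_type = LABEL_TO_TYPE.get(label)
    if rel_type is not None and rel_type not in ALLOWED_TYPES:
        raise ValueError(f"{rel_type!r} is not an allowed relation type")
    return rel_type


print(relation_type("Corrigenda/Amendments"))  # updatedBy
print(relation_type("Now"))                    # None
```

Returning None for unmapped labels keeps the scraper from guessing a type the schema does not define.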

ronaldtse commented 6 years ago

OK — we should actually have a full description of doc relations somewhere. Maybe on the stanDoc site later.

opoudjis commented 5 years ago

> OK, we should actually have a full description of doc relations somewhere. Maybe on the stanDoc site later.

We do, and I am writing it up in the Relaton-Models readme for now, as part of https://github.com/riboseinc/relaton/issues/19