andrew2net closed 6 years ago
Yes, we have already serialised it as XML, as part of the ISOXML document model which incidentally exposes bibliographic items, as references in standards documents.
https://github.com/riboseinc/bib-models/tree/master/grammars
It's expressed in RelaxNG Compact; the attached script compiles it to RelaxNG straight, but RelaxNG Compact is far more legible.
And since downstream tools already use this serialisation (all the suite of Asciidoctor-X standards gems), I see no reason to come up with a different serialisation.
Exactly as said by @opoudjis , the output should be in the ISOXML bibliography format. Thanks!
@opoudjis should RelaxNG Compact define `start` in `grammar`? The doc says it must, but I can't find it.
@ronaldtse @opoudjis Looks like `biblio.rnc` describes the `BibliographicItem` serialisation, but `IsoBibliographicItem` has differences. Is there an `rnc` file for `IsoBibliographicItem`?
They are embedded in the Iso XML spec, at https://github.com/riboseinc/isodoc-models/blob/master/grammars/isostandard.rnc. You will need to extract them out; I had no need to create a separate ISO-Biblio.
Let's keep in mind that the goal is to have asciidoctor-iso use this gem (`isobib`) to create its metadata and also for its citations. The steps that seem necessary are:

1. Compare `biblio.rnc` and `IsoBibliographicItem`, and see whether we can merge them or branch them.
2. Have `asciidoctor-iso` use the `isobib` gem to construct its metadata structure and output `IsoBibliographicItem` in its ISOXML output.
3. Have `asciidoctor-iso` auto-fetch ISO references using the `isobib` scraper feature.

@andrew2net does this sound reasonable, and could you help with them? Thanks!
@ronaldtse where can we scrape `ContributorRole.description`?
There is currently no online resource that allows scraping of `ContributorRole.description`. For ISO documents, we only need to know which TC created the document, which is provided on the standard's page.
@ronaldtse is there a way to scrape `notes` and `partof`?
@opoudjis In `biblio.rnc`, `status` is a `FormattedString`, but in UML Bibliography `status` is a `DocumentStatus`, which inherits from `LocalizedString`, which is the superclass of `FormattedString`. Is that OK?
The UML takes priority. Fixed the RNC.
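To make the type question concrete, here is a minimal Ruby sketch of the hierarchy as described in this thread. Only the class names and the inheritance come from the discussion; the attributes (`content`, `language`, `script`, `text_format`) are assumptions for illustration, not the actual model definition.

```ruby
# Hypothetical sketch: class names follow the thread, attributes are assumed.
class LocalizedString
  attr_reader :content, :language, :script

  def initialize(content, language: nil, script: nil)
    @content = content
    @language = language
    @script = script
  end
end

# As described above, FormattedString specialises LocalizedString.
class FormattedString < LocalizedString
  attr_reader :text_format

  def initialize(content, text_format: "text/plain", **opts)
    super(content, **opts)
    @text_format = text_format
  end
end

# DocumentStatus is a sibling of FormattedString, not a subclass of it.
class DocumentStatus < LocalizedString; end

status = DocumentStatus.new("Published", language: "en")
puts status.is_a?(LocalizedString)
puts status.is_a?(FormattedString)
```

Under these assumptions a `DocumentStatus` is a `LocalizedString` but not a `FormattedString`, which is the mismatch the question points at.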
@andrew2net could you help clarify what `notes` and `partof` are? Thanks.
@ronaldtse in `biblio.rnc` there is `biblionote`, which is likely `BibliographicItem.notes` in the Bibliography UML Models.
`partof` is in `biblio.rnc` but isn't in the Bibliography UML Models. I suppose it is a backlink for `BibliographicItem.relations`. Is it?
BibData =
  attribute type { BibItemType }?,
  btitle+, formattedref?, source*, docidentifier*, bdate*, contributor*, edition?,
  biblionote*, partof*, language*, script*, abstract?, status?, copyright, docrelation*
... I suspect partOf got superseded by relations, which is a late addition to the model; @ronaldtse , could you confirm?
Confirmed, `partOf` is a doc relation now.
@ronaldtse OK, `partOf` is no longer needed. How about `notes`? Where can we scrape it?
Is notes just abstract?
@opoudjis we have `ContributionAssociation.owner` as `Contributor` in the Bibliography UML Models, and as `ContributorInfo` in `biblio.rnc`. Does UML always take priority in such cases?
@ronaldtse Are you asking me whether notes are the abstract? Hmm... I don't know. It looks like they are, but I'm not sure.
Notes also include the details of status of an unpublished standard; in fact that's what they're currently used for in isodoc (the footnote for the "—" date).
Removed partOf from RNC.
I chose to rename Contributor from UML as ContributorInfo in the RNC in order to differentiate the element from the type (the latter is recycled elsewhere; and as a type, it is not visible as a name in the serialisation):
contributor =
  element contributor {
    role*,
    ContributorInfo
  }
ContributorInfo =
  ( person | organization )
For the serialisation, ContributorInfo will never actually turn up in the content, so the difference doesn't matter.
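A small Ruby sketch may help show why the type name never appears in the output. The `contributor_xml` helper is hypothetical, and the `type` attribute on `role` and the `name` child element are assumptions made for illustration; only `contributor`, `role`, `person` and `organization` come from the grammar above.

```ruby
# Hypothetical helper: serialises one contributor following the grammar above.
# Assumptions: role carries a `type` attribute, and person/organization hold
# a `name` child. Inputs are assumed XML-safe (no escaping is done here).
def contributor_xml(kind:, name:, roles: [])
  unless %w[person organization].include?(kind)
    raise ArgumentError, "kind must be person or organization"
  end

  role_xml = roles.map { |r| "<role type=\"#{r}\"/>" }.join
  "<contributor>#{role_xml}<#{kind}><name>#{name}</name></#{kind}></contributor>"
end

puts contributor_xml(kind: "organization", name: "ISO", roles: ["publisher"])
```

The choice between `person` and `organization` shows up as an element name, while `ContributorInfo` is only a name for that choice inside the schema.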
@ronaldtse since `ContributionAssociation.owner` is a `ContributorInfo` with attribute `entity`, which is a `person` or `organization`, where can we scrape `identifier` and `contact`?
@ronaldtse on www.iso.org, in the document relations section (Revisions / Corrigenda), relations have types like "Previously", "Now", "Corrigenda/Amendments" and so on. In `biblio.rnc` the allowed types are "parent", "child", "obsoletes", "updatedBy", "complements", "derivedFrom", "adoptedFrom", "equivalent", "identical" and "nonequivalent". How should we correlate them?
The `ContributorInfo` will be an `organization`, ISO itself. ISO's contact details are:
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Switzerland
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
@ronaldtse `relation` contains `bibitem`, according to `biblio.rnc`. In my opinion, it should contain a text identifier (like `ISO 19115:2003/Cor 1:2006`) instead. A related `bibitem` could have relations of its own, so the XML could become enormous.
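The concern above can be sketched in Ruby. This is only an illustration of the identifier-based idea, not how `biblio.rnc` or the gem works: the `BibItem` and `DocRelation` structs and the `index` lookup are hypothetical, and the `updatedBy` type is picked from the allowed list purely as an example.

```ruby
# Hypothetical sketch: a relation stores only the related document's
# identifier, and a separate index resolves it on demand.
BibItem = Struct.new(:docidentifier, :relations)
DocRelation = Struct.new(:type, :docidentifier)

base = BibItem.new(
  "ISO 19115:2003",
  [DocRelation.new("updatedBy", "ISO 19115:2003/Cor 1:2006")]
)
cor = BibItem.new("ISO 19115:2003/Cor 1:2006", [])

# Index every known item by its identifier.
index = {}
[base, cor].each { |item| index[item.docidentifier] = item }

# Follow the relation only when the related item is actually needed.
related = index[base.relations.first.docidentifier]
puts related.docidentifier
```

Because each relation holds a string rather than a nested item, serialising `base` does not pull in the related item's own relations.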
Yes, the UML diagram specifies that relation links to a BibItem, but it doesn’t specify the implementation.
I imagine that the link to the BibItem could be also
@ronaldtse should `relation/locality/referenceFrom` contain something like `ISO 19115:2003/Cor 1:2006`? And what should the type in `relation/locality[type=?]` contain?
Great question @andrew2net !
`relation/locality` is meant to allow a document to refer to parts of another document.
A document corrigendum ("Cor") or amendment ("Amd") is a separate document. While they may only correct parts of a standard, we can view the document as updating the entire standard. So this means that:
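- "`ISO 19115:2003`" has a relation of `type:updatedBy` to "`ISO 19115:2003/Cor 1:2006`".
- "`ISO 19115:2003/Cor 1:2006`" has a relation of `type:updates` to "`ISO 19115:2003`".

@opoudjis since we only have "type:updatedBy", should we add "type:updates" (or do we need some way to deal with reverse associations)?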
We should have type:updates, and we will need it for RFC anyway...
... But we already have type:obsoletes; that means the same thing as updates, doesn't it?
Agree. But an amendment updates a document, not obsoletes it. Unless we allow a reversible relation — subject action object
Let's not. I'll add updates.
Done
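With both `updates` and `updatedBy` available, the two directions of the corrigendum example can be sketched as follows. This is a minimal illustration: the `Relation` struct, the `INVERSE_RELATION` table and the `with_inverse` helper are hypothetical names, and treating the two types as exact inverses is an assumption drawn from the discussion above.

```ruby
# Relation type names come from the discussion above; pairing them as
# inverses is the assumption this sketch illustrates.
INVERSE_RELATION = {
  "updates" => "updatedBy",
  "updatedBy" => "updates",
}.freeze

Relation = Struct.new(:source, :type, :target)

# Hypothetical helper: given one relation, return it plus its reverse.
def with_inverse(relation)
  inverse_type = INVERSE_RELATION.fetch(relation.type)
  [relation, Relation.new(relation.target, inverse_type, relation.source)]
end

cor = Relation.new("ISO 19115:2003/Cor 1:2006", "updates", "ISO 19115:2003")
with_inverse(cor).each { |r| puts "#{r.source} #{r.type} #{r.target}" }
```

Keeping two explicit types avoids the reversible subject/action/object form that was rejected above, at the cost of storing each association twice if both directions are wanted.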
OK — we should actually have a full description of doc relations somewhere. Maybe on the stanDoc site later.
We do, and I am writing it in the Relaton-Models readme for now, as part of https://github.com/riboseinc/relaton/issues/19
@ronaldtse do you have a defined file structure for storing serialized `IsoBibliographicItem`s? Which file format should we use?