plazi / ggxml2taxpub

Conversion of GoldenGATE XML to JATS/TaxPub at treatment level

description of level-1 #21

Open myrmoteras opened 2 years ago

myrmoteras commented 2 years ago

@tcatapano do we have a description of level-1 somewhere, following issue #17? Let me know so I can document it, since more people are getting involved in this discussion.

tcatapano commented 2 years ago

We do not have a description of the minimal "level-1" encoding. However, briefly, it includes:

treatment-meta

At the phrase level inside treatment sections (tp:treatment-sec), taxon names and material-citation strings are encoded.

A formal expression of the de facto schema, derived from the markup in the current 500-instance sample, is:

default namespace = ""
namespace tp = "http://www.plazi.org/taxpub"

start =
  element tp:taxon-treatment {
    element tp:treatment-meta {
      element mixed-citation {
        element named-content {
          attribute content-type { xsd:NCName },
          text
        },
        (element article-title { text }
         | element uri {
             attribute content-type { xsd:NCName },
             xsd:anyURI
           })+
      }
    },
    element tp:nomenclature { taxon-name },
    treatment-sec*
  }
taxon-name = element tp:taxon-name { text }
treatment-sec =
  element tp:treatment-sec {
    attribute sec-type { xsd:NCName },
    (treatment-sec
     | element p {
         (text
          | taxon-name
          | element tp:material-citation { (text | taxon-name)+ })+
       })*
  }
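
To make the schema above more concrete, here is a minimal sketch in Python of what a conforming level-1 instance might look like, together with a hand-rolled check of a few of the structural rules. It is not a RELAX NG validation and it does not check tp:treatment-meta contents. The sample text, the content-type and sec-type values, and the helper names `q` and `level1_problems` are all hypothetical placeholders, not taken from the real sample.

```python
import xml.etree.ElementTree as ET

TP = "http://www.plazi.org/taxpub"  # namespace URI from the schema

# Hypothetical instance: every text and attribute value is a placeholder.
SAMPLE = f"""\
<tp:taxon-treatment xmlns:tp="{TP}">
  <tp:treatment-meta>
    <mixed-citation>
      <named-content content-type="treatment-title">Aus bus Author, 1900</named-content>
      <article-title>An example article</article-title>
      <uri content-type="treatment-uri">http://example.org/treatment/1</uri>
    </mixed-citation>
  </tp:treatment-meta>
  <tp:nomenclature><tp:taxon-name>Aus bus</tp:taxon-name></tp:nomenclature>
  <tp:treatment-sec sec-type="materials_examined">
    <p>Holotype of <tp:taxon-name>Aus bus</tp:taxon-name>:
      <tp:material-citation>1 worker, Example locality</tp:material-citation>.</p>
  </tp:treatment-sec>
</tp:taxon-treatment>
"""


def q(local):
    """Return the ElementTree (Clark notation) name for a TaxPub element."""
    return f"{{{TP}}}{local}"


def level1_problems(root):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if root.tag != q("taxon-treatment"):
        problems.append("root is not tp:taxon-treatment")
    children = [c.tag for c in root]
    if children[:2] != [q("treatment-meta"), q("nomenclature")]:
        problems.append("expected tp:treatment-meta then tp:nomenclature first")
    if any(t != q("treatment-sec") for t in children[2:]):
        problems.append("only tp:treatment-sec may follow tp:nomenclature")
    nomenclature = root.find(q("nomenclature"))
    if nomenclature is not None and [c.tag for c in nomenclature] != [q("taxon-name")]:
        problems.append("tp:nomenclature must hold exactly one tp:taxon-name")
    # Sections may nest; each needs sec-type and may hold only sections or p.
    for sec in root.iter(q("treatment-sec")):
        if "sec-type" not in sec.attrib:
            problems.append("tp:treatment-sec is missing sec-type")
        for child in sec:
            if child.tag not in (q("treatment-sec"), "p"):
                problems.append(f"unexpected {child.tag} in tp:treatment-sec")
    # Paragraphs may hold only taxon names and material citations as elements.
    for p in root.iter("p"):
        for child in p:
            if child.tag not in (q("taxon-name"), q("material-citation")):
                problems.append(f"unexpected {child.tag} in p")
    return problems


root = ET.fromstring(SAMPLE)
print(level1_problems(root))
```

For real validation, running the .rnc schema through a RELAX NG validator would be the more reliable route; the sketch only illustrates the shape the schema describes.
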

tcatapano commented 2 years ago

Added the de facto level-1 schema to https://github.com/plazi/ggxml2taxpub-treatments/blob/ffb8d9cecb7dec0aec41980ca85dd4969c74282d/profiling/level1.rnc

myrmoteras commented 2 years ago

@tcatapano should we add a comment to the metadata that defines the TaxPub level, e.g. level=1?

I think that would be a clever move.
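
One possible reading of this proposal, sketched in Python purely as an illustration: an XML comment carrying `level=1` is placed inside tp:treatment-meta. The thread does not settle on a form for the marker, so the comment text, its position, and the function name `add_level_comment` are assumptions.

```python
import xml.etree.ElementTree as ET

TP = "http://www.plazi.org/taxpub"
ET.register_namespace("tp", TP)  # keep the tp: prefix when serializing


def add_level_comment(root, level=1):
    """Insert a 'level=N' XML comment as the first child of tp:treatment-meta.

    Hypothetical convention: the issue only suggests 'a comment defining the level'.
    """
    meta = root.find(f"{{{TP}}}treatment-meta")
    if meta is None:
        raise ValueError("no tp:treatment-meta element found")
    meta.insert(0, ET.Comment(f"level={level}"))
    return root


# Tiny stand-in document; a real instance would carry full treatment metadata.
root = ET.fromstring(
    f'<tp:taxon-treatment xmlns:tp="{TP}"><tp:treatment-meta/></tp:taxon-treatment>'
)
add_level_comment(root)
print(ET.tostring(root, encoding="unicode"))
```

An XML comment is ignored by RELAX NG validation, so a marker of this kind would not require a schema change, though it would also be invisible to tools that discard comments.
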

tcatapano commented 2 years ago

@myrmoteras I agree it is a good idea to indicate the level of TaxPub markup in the instance metadata. See #32.