plazi / arcadia-project

meeting 20190708 skype #60

Open myrmoteras opened 5 years ago

myrmoteras commented 5 years ago
  1. finalize v0.9 of data dictionary
    • Guido, Terry, Donat, Marcus
    • start now; finish by June 3, 2019
    1.1. Terry, Guido, Donat, Marcus to develop a GGXML version that has minimal structural elements, focuses on semantics, and uses the tags that map to the agreed terms in the data dictionary
    1.2. Guido to make changes in TB to use the agreed tags, and export XML

  2. run treatment extraction from XMLs to test the data dictionary
    • Puneet, Marcus
    • start June 3, 2019; finish by June 5, 2019

  3. final tweaks to the data dictionary and finalize v1.0
    • Guido, Terry, Donat, Marcus
    • start June 5, 2019; finish by June 7, 2019
    • Note: this may not be required if we do it right in #1-2 above
    3.1. Guido to make changes in TB to use the agreed tags, and export XML

  4. run treatment extraction from XMLs
    • Puneet, Marcus
    • start June 7, 2019; finish June 8, 2019
    • Note: this may not be required if we do it right in #1-2 above

  5. set up auto-update process for processing treatments from new XMLs
    • Puneet, Marcus
    • start June 5-7, 2019; finish June 7-10, 2019

  6. load ~15K-30K treatments into Zenodo sandbox
    • Guido, Alex (CERN folks)
    • start June 5-7, 2019; finish ~June 10-12, 2019

  7. test sandbox API
    • Puneet, Marcus, Guido
    • start June 10-12, 2019; continue into rest of June and July, 2019

myrmoteras commented 5 years ago

Meeting notes:

  1. Get the data dictionary done by end of July. Get version 1.0 out.
    1.1. GGXML: Puneet to provide feedback on what Guido has made
    1.2. Guido to make changes in TB to use the agreed tags, and export XML

  2. Agreed on 3 formats of treatments:
    A. Primary digital object: XHTML, version of record
    B. Description: HTML with minimal formatting (…, possibly italics, and some other elements for better viewing)
    C. Upload files: DWCA, TaxPub (should eventually become the default mime type; to be defined by TC), simplified GG version ("puneet" version)

  3. The treatment deposit is of type "section". In the medium term we need to convince DataCite to create a type "taxonTreatment".

  4. Include treatmentCitation in the upload; linking to existing treatments can be done later.

  5. Upload of 15-30K treatments in the sandbox, starting July 15 (GS). The documentation is here

  6. DA, MG to run searches on the corpus and report back.

  7. Next Skype July 22/23. DA to organize.

Lars has raised the issue of data workflow in https://github.com/plazi/arcadia-project/issues/61, and it needs our attention.
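
For reference, the notes on the deposit type and treatmentCitation could translate into a Zenodo deposit payload roughly like the sketch below. It only builds and prints the JSON; nothing is sent to the sandbox. The function name `build_treatment_metadata`, the title, the author, and the DOI are hypothetical placeholders. Expressing type "section" as `upload_type` "publication" with `publication_type` "section", and expressing a treatmentCitation as a `cites` related identifier, is an assumption about the mapping, not something agreed in the meeting.

```python
import json

# Public Zenodo sandbox deposit endpoint (shown for context, not called here).
SANDBOX_DEPOSIT_URL = "https://sandbox.zenodo.org/api/deposit/depositions"


def build_treatment_metadata(title, description_html, authors, cited_treatment_dois=()):
    """Build the metadata payload for one treatment deposit (hypothetical helper).

    Assumptions: type "section" is expressed as upload_type "publication"
    with publication_type "section"; each treatmentCitation becomes a
    "cites" related identifier.
    """
    metadata = {
        "upload_type": "publication",
        "publication_type": "section",
        "title": title,
        # Format B from the notes: HTML with minimal formatting.
        "description": description_html,
        "creators": [{"name": name} for name in authors],
    }
    if cited_treatment_dois:
        metadata["related_identifiers"] = [
            {"relation": "cites", "identifier": doi} for doi in cited_treatment_dois
        ]
    return {"metadata": metadata}


# Placeholder values only; 10.5072 is a test DOI prefix.
payload = build_treatment_metadata(
    title="Example treatment (placeholder)",
    description_html="<p>Placeholder description with <i>italics</i>.</p>",
    authors=["Doe, Jane"],
    cited_treatment_dois=["10.5072/zenodo.0000"],
)
print(json.dumps(payload, indent=2))
```

Leaving `cited_treatment_dois` empty omits `related_identifiers` entirely, which matches the note that linking to existing treatments can be done later.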