plazi / arcadia-project


first taxonomic treatment in BLR #102

Closed myrmoteras closed 4 years ago

myrmoteras commented 4 years ago

Just a thought, now that we have the taxonomic treatment in place.

Hi Donat & Guido, Everything is deployed and good to go in both systems now. We've added the "openbiodiv:TaxonomicConceptLabel" custom keyword as well. Guido, feel free to shoot things at any of the systems, and let me know if you come across any unexpected errors. Cheers, Alex

Why don't we think for a moment which should be the first treatment we upload before Guido starts with the upload?

It should be one that we can use to demonstrate various aspects of what we intend to do, such as adding all the links in the metadata, the kinds of custom metadata, and the links to and from GBIF.

Any ideas?

gsautter commented 4 years ago

How about the big one we've been using in the sandbox tests? It contains all the kinds of citations/links we have, and several instances of each: http://tb.plazi.org/GgServer/html/03C88554FF8FFFB2FF53F9E86AD50709

gsautter commented 4 years ago

Or are we rather looking for a specific species, or species from a specific genus or family?

myrmoteras commented 4 years ago

not a Zootaxa!

I guess the best would be an EJT including a treatment with a new species, one with a treatment citation, and one with a link to a specimen / DNA accession number, both with links. Ideally it would be one where we have the cited treatment as well, so that at some point we could add that link too. If you are aware of one, then let's use it.

gsautter commented 4 years ago

OK, I'll try and dig something up from the stats.

gsautter commented 4 years ago

Profundiconus virginiae (European Journal of Taxonomy 173), original description with links on several figure and materials citations: http://tb.plazi.org/GgServer/html/24768796CD18FFCBFDF716FEFD45FAD9

I cannot seem to find an EJT treatment with a linked treatment citation ... any other preferences?

myrmoteras commented 4 years ago

This is a good one. I am adding the specimen code as well. I hope this is still OK: httpUri-0, httpUri-1

gsautter commented 4 years ago

The numbered HTTP URIs are definitely good, just make sure to start with 0 and not omit an intermediate number. If there is only one HTTP URI, omit the numbering altogether, i.e., httpUri instead of httpUri-0.
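The numbering rule above can be sketched as a small helper. This is only an illustration of the rule as stated in this thread; the function names `http_uri_attribute_names` and `is_valid_numbering` are hypothetical and not part of any existing tool.

```python
def http_uri_attribute_names(count):
    """Return the attribute names to use for `count` HTTP URIs.

    Rule from the thread: a single URI is plain `httpUri`; several URIs are
    numbered from 0 without gaps (`httpUri-0`, `httpUri-1`, ...).
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return []
    if count == 1:
        return ["httpUri"]
    return [f"httpUri-{i}" for i in range(count)]


def is_valid_numbering(names):
    """Check that the given attribute names follow the rule (order is ignored)."""
    names = list(names)
    expected = http_uri_attribute_names(len(names))
    return len(set(names)) == len(names) and set(names) == set(expected)


print(http_uri_attribute_names(1))
print(http_uri_attribute_names(2))
print(is_valid_numbering(["httpUri-0", "httpUri-2"]))
```

A lone `httpUri-0`, or a set that skips a number, is rejected by `is_valid_numbering`.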

myrmoteras commented 4 years ago

When you look at the one above, its holotype has both a linked specimenCode and an accession code, but the linking works differently:

[image]

What is the correct way, or the way you want to have it, so that it gets processed properly?

gsautter commented 4 years ago

Hard one ... this is just one instance of a more general problem: where to put the multitude of potential outgoing links in an MC? We don't have a general policy for this thus far, and the stats (along with the TB website) cover only links on the materialsCitation proper at the moment. The latter is rather easy to change, however. A general policy might be to put links as close as possible to the plain text part representing them:

Suggestions, please ... or opinions, additions, corrections, whatever.

myrmoteras commented 4 years ago

If it is possible to add the httpUri/doi to the tagged element, as you state above, that's the best I can imagine.
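To make the idea concrete, here is a minimal sketch of what "link on the tagged element" could look like. It is a guess at the shape, not the actual GG XML: `materialsCitation`, `specimenCode` and `httpUri` appear in this thread, while the `accessionNumber` element name, the function name, and all values and example.org URLs are placeholders assumed for illustration.

```python
import xml.etree.ElementTree as ET


def build_materials_citation(specimen_code, specimen_uri, accession, accession_uri):
    """Build a materialsCitation whose links sit on the nested elements.

    The link for each identifier is an attribute of the element that wraps
    its plain text, not of the enclosing materialsCitation.
    """
    mc = ET.Element("materialsCitation")
    mc.text = "Holotype "
    code = ET.SubElement(mc, "specimenCode", {"httpUri": specimen_uri})
    code.text = specimen_code
    code.tail = ", accession "
    # "accessionNumber" is an assumed element name, used here only as a placeholder.
    acc = ET.SubElement(mc, "accessionNumber", {"httpUri": accession_uri})
    acc.text = accession
    return mc


mc = build_materials_citation(
    "SPEC-0001", "https://example.org/specimen/SPEC-0001",
    "ACC000001", "https://example.org/accession/ACC000001",
)
print(ET.tostring(mc, encoding="unicode"))
```

With this layout the materialsCitation itself carries no link, and each link can be resolved from the element that holds the corresponding text.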

We also need to think longer term and further downstream. How does this translate into TaxPub? How does this translate into the GG XML we add to the treatment deposit? How does this translate into the DwC-A that we serve to GBIF and potential further clients? How does this translate into the treatment deposit metadata? Eventually this should be a DwC custom element: "Genaccession link", "specimen code link", "BOLDAccession link".

gsautter commented 4 years ago

Regarding TaxPub, the way the Pensoft guys represent the UUIDs might be a way to go.

The GG XML added to the deposit has a blacklist-based filter, so any newly added elements will go through at first ... it is basically the same XML as we use throughout the respective stages of the server, with a few obsolete elements and attributes stripped out.
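For readers unfamiliar with the consequence of a blacklist (as opposed to a whitelist), a minimal sketch follows. It is not the actual export filter: the names `obsoleteElement`, `obsoleteAttribute`, `newLink` and the function `strip_blacklisted` are made up for illustration, and the real filter may treat removed elements differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical blacklist entries; the real lists are not given in this thread.
BLACKLISTED_ELEMENTS = {"obsoleteElement"}
BLACKLISTED_ATTRIBUTES = {"obsoleteAttribute"}


def strip_blacklisted(element):
    """Remove blacklisted attributes and child elements, in place.

    Anything not on the blacklist passes through untouched, which is why
    newly introduced elements and attributes survive by default.
    """
    for name in BLACKLISTED_ATTRIBUTES:
        element.attrib.pop(name, None)
    previous = None
    for child in list(element):
        if child.tag in BLACKLISTED_ELEMENTS:
            # Keep the text that follows the removed element.
            if child.tail:
                if previous is None:
                    element.text = (element.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            element.remove(child)
        else:
            strip_blacklisted(child)
            previous = child
    return element


doc = ET.fromstring(
    '<treatment obsoleteAttribute="x"><obsoleteElement>old</obsoleteElement>'
    ' kept <newLink httpUri="https://example.org/1">A</newLink></treatment>'
)
strip_blacklisted(doc)
print(ET.tostring(doc, encoding="unicode"))
```

The unknown `newLink` element and its `httpUri` attribute pass through unchanged, while the blacklisted element and attribute are dropped.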

In DwC-A, we likely have to work on a case-by-case basis, depending on the availability of corresponding DwC terms.

In depositions, we might simply augment and use the upcoming typing system for related identifiers.

All of these points should be separate tickets, however. Let's get this one back on focus, please.

gsautter commented 4 years ago

One more on the general problem, though: While representing all these links in our various export formats is an important aspect, I tend to think the even more important one is how we can automatically create or import these links, be it as part of the batch or via a dedicated function in some menu in GGI. Without the latter, we will just end up with a multitude of fields that are empty in 99.9% of our data.

myrmoteras commented 4 years ago

if there is no content, then just don't use the field?!

gsautter commented 4 years ago

While this is very much possible in annotation attributes inside documents proper, as well as in Zenodo depositions, it is much harder to do in DwC-As, and outright impossible in the stats (in a database table, a column always exists in all records, which is a fundamental principle of relational databases).
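The point about relational tables can be illustrated with a toy example. The table and column names below are invented for the sketch and do not reflect the actual stats schema, and the 999-to-1 split is constructed only to mirror an almost-empty field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the link column exists for every row, filled or not.
conn.execute(
    "CREATE TABLE materials_citations ("
    "id INTEGER PRIMARY KEY, specimen_code TEXT, specimen_code_uri TEXT)"
)

# 999 rows without a link (NULL) and a single row with one.
rows = [(i, f"CODE-{i}", None) for i in range(1, 1000)]
rows.append((1000, "CODE-1000", "https://example.org/specimen/CODE-1000"))
conn.executemany("INSERT INTO materials_citations VALUES (?, ?, ?)", rows)

# COUNT(column) skips NULLs, so it counts only the rows that have a link.
total, filled = conn.execute(
    "SELECT COUNT(*), COUNT(specimen_code_uri) FROM materials_citations"
).fetchone()
empty_share = (total - filled) / total
print(f"{empty_share:.1%} of rows have no link")
```

An XML attribute or a deposition field can simply be left out when there is no value; here the column is part of every record, and the absence only shows up as NULL.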

gsautter commented 4 years ago

And in more general terms, what good is supporting a property or field throughout our infrastructure if it is empty in 99.9% of the cases?

gsautter commented 4 years ago

Not saying we have to implement all the automation right away, just wanting to say that for the whole effort to really make sense beyond a few demo cases, it's something we need to consider.

And now back to the original focus of this ticket, please.

gsautter commented 4 years ago

I did a live fire test, and it's looking good ... what do you think? See https://zenodo.org/record/3472000

gsautter commented 4 years ago

No worries, the automated export on all treatment updates is still deactivated. It's manual firing only at this point, but from the live system.

myrmoteras commented 4 years ago

@gsautter shall we start the upload of treatments using ejt-558 to ejt-562?

FF880222F712FC5DFFCFF2516A65FFED
FF911130FFFFA06CFF92FFE182165A16
F625FF92A82FFFE9EF02FFD06856FF96
FFDE57604005FFF2FFFE5345FF9AFFE5
FFF9AA2EAD4E6123FFD5A44F5B5CFFB2
CD31FFC5FFC4FFFBFFF5FFBDFFF16009
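For anyone who wants to look these up, a small sketch that turns such IDs into TreatmentBank links. It assumes the URL pattern of the two links posted earlier in this thread (`http://tb.plazi.org/GgServer/html/` followed by the ID) and that IDs are always 32 uppercase hex characters, which is only inferred from the IDs seen here; `treatment_url` is a hypothetical helper name.

```python
import re

# URL prefix taken from the links posted earlier in this thread.
TREATMENT_URL_PREFIX = "http://tb.plazi.org/GgServer/html/"
# Assumption: IDs are 32 uppercase hex characters, as all IDs in this thread are.
_ID_PATTERN = re.compile(r"[0-9A-F]{32}")


def treatment_url(treatment_id):
    """Return the HTML view URL for a treatment ID, rejecting malformed IDs."""
    if not _ID_PATTERN.fullmatch(treatment_id):
        raise ValueError(f"not a 32-character hex ID: {treatment_id!r}")
    return TREATMENT_URL_PREFIX + treatment_id


ids = "FF880222F712FC5DFFCFF2516A65FFED FF911130FFFFA06CFF92FFE182165A16".split()
urls = [treatment_url(i) for i in ids]
for url in urls:
    print(url)
```

The check catches truncated or mistyped IDs before anything is sent to a server.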

gsautter commented 4 years ago

Uploading as I write this.

myrmoteras commented 4 years ago

@gsautter can you please upload this one? 2A6BF5710C5EFFAFDA74DF66FFA31E0D - it's one for which we have valid TaxPub