myrmoteras closed this issue 4 years ago
How about the big one we've been using in the sandbox tests? It contains all the kinds of citations/links we have, and several instances of each: http://tb.plazi.org/GgServer/html/03C88554FF8FFFB2FF53F9E86AD50709
Or are we rather looking for a specific species, or species from a specific genus or family?
not a Zootaxa!
I guess best would be an EJT including a treatment with a new species, one with a treatment citation, and with a specimen / DNA accession number, both with links. Ideally it would be one where we have the cited treatment as well, so that at some point we could add that link too. If you are aware of one, then let's use it.
OK, I'll try and dig something up from the stats.
Profundiconus virginiae (European Journal of Taxonomy 173), original description with links on several figure and materials citations: http://tb.plazi.org/GgServer/html/24768796CD18FFCBFDF716FEFD45FAD9
I cannot seem to find an EJT treatment with a linked treatment citation ... any other preferences?
This is a good one. I am adding the specimen code as well. I hope this is still OK: httpUri-0, httpUri-1
The numbered HTTP URIs are definitely good, just make sure to start with 0 and not omit an intermediate number. If there is only one HTTP URI, omit the numbering altogether, i.e., httpUri instead of httpUri-0.
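To make the numbering rule concrete, here is a minimal sketch in Python. The function names and the example URIs are made up for illustration and are not part of GoldenGATE; the only thing taken from this thread is the rule itself (plain httpUri for a single link, httpUri-0, httpUri-1, ... without gaps for several).

```python
import re

# Matches numbered attribute names such as httpUri-0, httpUri-1, ...
_NUMBERED = re.compile(r"^httpUri-(\d+)$")


def number_http_uris(uris):
    """Build attribute names for a list of URIs (hypothetical helper)."""
    if len(uris) == 1:
        # A single URI gets the plain name, without numbering.
        return {"httpUri": uris[0]}
    # Several URIs are numbered from 0, without gaps.
    return {"httpUri-%d" % i: uri for i, uri in enumerate(uris)}


def check_http_uri_numbering(attributes):
    """Return a list of problems with the numbered httpUri-N names.

    Only the numbered names are inspected; an empty list means they
    follow the convention.
    """
    numbers = sorted(
        int(m.group(1)) for name in attributes if (m := _NUMBERED.match(name))
    )
    problems = []
    if len(numbers) == 1:
        problems.append("single URI should be plain httpUri, not numbered")
    elif numbers != list(range(len(numbers))):
        problems.append("numbering must start at 0 without gaps: %s" % numbers)
    return problems


# Placeholder URIs, for illustration only.
single = number_http_uris(["https://example.org/specimen/1"])
double = number_http_uris(
    ["https://example.org/specimen/1", "https://example.org/accession/2"]
)
```

Running the checker over existing annotations before export would catch cases like a lone httpUri-0 or a sequence starting at 1.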
When you look at the one above, it has in this holotype both a linked specimenCode as well as an accession code, but the linking works differently:
What is the correct way, or the way you want to have it, so that it is processed properly?
Hard one ... this is just one instance of a more general problem: where to put the multitude of potential outgoing links in an MC? We don't have a general policy for this thus far, and the stats (along with the TB website) cover only links on the materialsCitation proper at the moment. The latter is rather easy to change, however.
A general policy might be to put links as closely as possible to the plain text part representing them: on specimenCode annotations if available (using the numbering mechanism on specimen code ranges), and on the materialsCitation proper only if no specimen codes are annotated. Suggestions, please ... or opinions, additions, corrections, whatever.
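A rough sketch of how such a placement policy could look, in Python. Everything here is hypothetical: the function name, the plain-dict data model, and the example codes and URIs are assumptions for discussion, not how the server represents annotations. Links that match no annotated specimen code are collected separately rather than dropped, which is also an assumption, since the proposal does not say what should happen to them.

```python
def place_links(specimen_codes, links):
    """Decide where each outgoing link goes (hypothetical sketch).

    specimen_codes: specimen code strings annotated inside the materials
                    citation (may be empty).
    links:          dict mapping a specimen code string to its URI.
    """
    placement = {"specimenCode": {}, "materialsCitation": [], "unplaced": []}
    if not specimen_codes:
        # No specimen codes annotated: links go on the materialsCitation proper.
        placement["materialsCitation"] = list(links.values())
        return placement
    for code, uri in links.items():
        if code in specimen_codes:
            # Put the link as close as possible to the text it represents.
            placement["specimenCode"][code] = uri
        else:
            # Assumption: keep unmatched links visible instead of losing them.
            placement["unplaced"].append(uri)
    return placement


# Placeholder specimen codes and URIs, for illustration only.
with_codes = place_links(
    ["ABC-123"],
    {"ABC-123": "https://example.org/specimen/ABC-123"},
)
without_codes = place_links([], {"ABC-123": "https://example.org/specimen/ABC-123"})
```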
If it is possible to add the httpUri/doi to the tagged element as you state above, that's the best I can imagine.
We also need to think in the longer term and further downstream. How does this translate into TaxPub? How does this translate into the GG XML we add to the treatment deposit? How does this translate into the DwC-A that we serve to GBIF and potential further clients? How does this translate into the treatment deposit metadata? Eventually this should be a DwC custom element such as "Genaccession link", "specimen code link", "BOLDAcession link".
Regarding TaxPub, maybe the way the Pensoft guys represent the UUIDs might be a way to go.
The GG XML added to the deposit has a blacklist based filter, so any newly added elements will go through at first ... it is basically the same XML as we use throughout the respective stages of the server with a few obsolete elements and attributes stripped out.
In DwC-A, we likely have to work on a case-by-case basis, depending on the availability of corresponding DwC terms.
In depositions, we might simply augment and use the upcoming typing system for related identifiers.
All of these points should be separate tickets, however. Let's get this one back on focus, please.
One more on the general problem, though: While representing all these links in our various export formats is an important aspect, I tend to think the even more important one is how we can automatically create or import these links, be it as part of the batch or via a dedicated function in some menu in GGI. Without the latter, we will just end up with a multitude of fields that are empty in 99.9% of our data.
if there is no content, then just don't use the field?!
While this is very much possible in annotation attributes inside documents proper, as well as Zenodo depositions, it is way harder a thing to do in DwC-As, and outright impossible in the stats (in a database table, a column always exists in all records, which is a fundamental principle of relational databases).
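To illustrate the difference, a small Python sketch using the standard sqlite3 module. The table and column names are invented for the example and do not reflect the actual stats schema. An annotation without a link simply has no httpUri attribute, whereas the table row still carries the column, filled with NULL.

```python
import sqlite3

# Annotation attributes: the second one simply has no httpUri key.
annotations = [
    {"text": "holotype ...", "httpUri": "https://example.org/specimen/1"},
    {"text": "paratype ..."},
]

# Relational table (hypothetical schema): the column exists for every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE materials_citation (text TEXT, http_uri TEXT)")
conn.executemany(
    "INSERT INTO materials_citation VALUES (?, ?)",
    [(a["text"], a.get("httpUri")) for a in annotations],
)

rows = conn.execute(
    "SELECT text, http_uri FROM materials_citation ORDER BY rowid"
).fetchall()

# Count how many rows carry an empty link column.
empty_links = conn.execute(
    "SELECT COUNT(*) FROM materials_citation WHERE http_uri IS NULL"
).fetchone()[0]
```

A count like empty_links is also a cheap way to measure how sparse a newly added field really is before deciding whether it deserves its own column.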
And in more general terms, what good is supporting a property or field throughout our infrastructure if it is empty in 99.9% of the cases?
Not saying we have to implement all the automation right away, just wanting to say that for the whole effort to really make sense beyond a few demo cases, it's something we need to consider.
And now back to the original focus of this ticket, please.
I did a live fire test, and it's looking good ... what do you think? See https://zenodo.org/record/3472000
No worries, the automated export on all treatment updates is still deactivated. It's manual firing only at this point, but from the live system.
@gsautter shall we start the upload of treatments using ejt-558 to ejt-562?
FF880222F712FC5DFFCFF2516A65FFED FF911130FFFFA06CFF92FFE182165A16 F625FF92A82FFFE9EF02FFD06856FF96 FFDE57604005FFF2FFFE5345FF9AFFE5 FFF9AA2EAD4E6123FFD5A44F5B5CFFB2 CD31FFC5FFC4FFFBFFF5FFBDFFF16009
Uploading as I write this.
@gsautter can you please upload this one? 2A6BF5710C5EFFAFDA74DF66FFA31E0D - it's one for which we have valid TaxPub
Just a thought, now that we have the taxonomic treatment in place.
Why don't we think for a moment which should be the first treatment we upload before Guido starts with the upload?
It should be one that we can use to demonstrate various aspects of what we intend to do, such as adding all the links in the metadata, the kind of custom metadata, and the link to GBIF and from GBIF.
Any ideas?