plazi / arcadia-project

On treatment titles/labels #46

Open mguidoti opened 5 years ago

mguidoti commented 5 years ago

So, I keep having this discussion with Donat and he asked me today to open this issue and get you all involved. I'm talking about treatment titles or labels.

Essentially, treatments are named exclusively by the name of the taxon they describe or add information to. However, I think we need to add more to the title, in order to avoid confusion and to provide useful taxonomic information to the end users (taxonomists).

The first thing that I think should always be there is the name authority (the original author of the species). Sometimes this author will be in parentheses, which has taxonomic meaning as well.

In addition, I believe it would be great if we followed that with 'sec.' plus the name of the author of the treatment. 'Sec.' means 'according to' and is used in this context, as you can see in this reference.

Example: Sphaerocysta globifera (Stal, 18xx) sec. Guidoti, 2019

For plants this would be a bit longer because they use all authors who proposed nomenclatural acts involving that given name. But that's a particularity.

Sometimes the author of the species might be missing from the treatment (bad taxonomy from the authors). In these cases, I suggest leaving only the 'sec. Author-of-the-treatment, Year-of-the-paper'.
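The labeling rule proposed above can be sketched in a few lines of Python. This is only an illustration, not existing Plazi code: the function name `treatment_label` and its parameters are hypothetical. It also assumes that "leave only the sec. part" means the taxon name stays and only the authority is dropped, which is one possible reading of the paragraph above.

```python
from typing import Optional


def treatment_label(
    taxon_name: str,
    treatment_author: str,
    treatment_year: int,
    authority: Optional[str] = None,
) -> str:
    """Build a label of the form 'Taxon Authority sec. Author, Year'.

    `authority` is passed through verbatim, so any parentheses around it
    (which carry taxonomic meaning) must already be part of the string.
    """
    sec_part = f"sec. {treatment_author}, {treatment_year}"
    if not authority:
        # Assumption: when the treatment gives no authority, keep the
        # taxon name and append only the 'sec.' part.
        return f"{taxon_name} {sec_part}"
    return f"{taxon_name} {authority} {sec_part}"


print(treatment_label("Sphaerocysta globifera", "Guidoti", 2019, "(Stal, 18xx)"))
print(treatment_label("Sphaerocysta globifera", "Guidoti", 2019))
```

The first call reproduces the example label from this comment; the second shows the fallback for a treatment without an authority. The longer author strings used for plants would simply be passed in as `authority`.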

This is important because:

  1. If the same taxon has several treatments, we would have a way to tell them apart by the title.
  2. There are cases where different authors have different definitions of a given taxonomic name, and these might receive different citations in further works. By adding the author of the treatment to the treatment label, it would be easier to track these cases.

I think this is it, to start this discussion.

What are your thoughts?

Thanks!

myrmoteras commented 5 years ago

What I understand from @mguidoti is that this is about labeling the treatments, for example in BLR.

@gsautter can you please provide access to the convention you and Reto use in the RDF?

@tcatapano we need your input here too.

lyubomirpenev commented 5 years ago

All this is right and useful, Marcus. However, Plazi extracts treatments from PUBLISHED literature, which means it would not be correct to add our own interpretation (for example, a taxon concept) on top of the author's opinion. If extracted, treatments should be represented as they are in the original paper (sometimes there may be no authority for a name; should we add it? I do not think so).

This would make sense only if the label is used internally (e.g. for indexing, etc.), or is explicitly marked as Plazi's addition/interpretation to the original treatment label.

Best regards,

Lyubomir

gsautter commented 5 years ago

I mostly agree with @mguidoti on the taxon. However, I wouldn't add a "sensu" kind of thing (that's how I understand "sec." at this point) there in all cases:

The latter are also the ones that have the lion's share of the "bad taxonomy" cases, but all they ever do is add more geographical data, which hardly qualifies as an augmentation of a taxon concept, so there is no justification for a "sensu" kind of interpretation.

gsautter commented 5 years ago

The way we handle it in LoD is that a treatment can do one out of three things:

gsautter commented 5 years ago

In the most general terms, a treatment to me is a container holding some data (of a variety of types) on a specific taxon concept, with the latter being identified by the treatment taxon (name plus authority).

While this definition is very general, it has proven pretty versatile and a good guideline for system design decisions - never had to change it in well over a decade.

mguidoti commented 5 years ago

Hi @gsautter,

I think your definition is exactly how I understand treatments at this point. And I think you're right when you say that not all treatments augment the data on a given taxon concept. Simple checklists or catalogs that don't present new distributional records fall into this category. In these cases, the 'sec.' thing indeed makes no sense - there is no interpretation being made, just a compilation of already published data.

But in all other cases, where there is an interpretation of any kind of the taxon concept (e.g., new descriptive information, new images, or new distributional data), the 'sec.' could be meaningful for taxonomists and for future programs automatically reading the data.

But how do we tell cool checklists/catalogs (with new records) from simpler checklists/catalogs? The way people mark this in papers varies a lot (thinking about legacy papers, not XML-marked ones). The problem seems to be with these two.

Also, I'm talking about labels, as @myrmoteras said, @lyubomirpenev. And I do understand your point, @lyubomirpenev. But as Plazi's mission is to 'liberate' data, in some way, I think that including those labels 'liberates' the information because it adds very useful information that derives from the treatment itself. The treatment body should remain unchanged, of course, because I understand that it's not our goal to change data. But liberate. Right?

I think of this like a tag, in a way.

Am I way off on this?

tcatapano commented 5 years ago

There seem to be two issues here:

  1. construction of a string value for a field in a treatment deposition's metadata
  2. which field to use

I don't really have an opinion on 1, and trust @mguidoti's recommendation (taxon name + authority + "sec" + author of treatment + year of treatment).

But I would recommend that this string be used in a custom metadata field rather than the treatment's title field. And for the custom metadata field, I'd recommend using openbiodiv's taxonomicConceptLabel class to type it. For example:

"custom": {
    "obkms:taxonomicConceptLabel": "Sphaerocysta globifera (Stal, 18xx) sec. Guidoti, 2019"
...

mguidoti commented 5 years ago

Hi,

So, we decided to use a custom metadata field rather than the title, that this "sec." thing is worth adding, and that every treatment, regardless of its content, should have it. Three decisions were made.

But are we adding a tag to the treatment's XML metadata for this (GoldenGate/Guido's side), or won't we have a tag at all?

As @gsautter said, the information is already there, and we could pick the different parts from different tag/attribute combinations. But if it's something that we all consider important because it adds useful information to some extent (and auto-generated cybercatalogs will clearly benefit from it), wouldn't it be better to have it in a tag or attribute somewhere?
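As a rough sketch of the first decision, the label could be attached to a deposition's metadata under the custom field suggested above while the title stays as it is. The function name `deposition_metadata` and the `"title"` key are assumptions made for illustration; only `"custom"` and `"obkms:taxonomicConceptLabel"` come from this thread, and the real deposition schema may well differ.

```python
import json


def deposition_metadata(title: str, concept_label: str) -> dict:
    """Return metadata with the label in a custom field, not in the title."""
    return {
        # Hypothetical key: the title keeps the plain treatment name.
        "title": title,
        # Field and class name as suggested earlier in this thread.
        "custom": {"obkms:taxonomicConceptLabel": concept_label},
    }


metadata = deposition_metadata(
    "Sphaerocysta globifera",
    "Sphaerocysta globifera (Stal, 18xx) sec. Guidoti, 2019",
)
print(json.dumps(metadata, indent=2))
```

Keeping the label out of the title means the published treatment name is not altered, and the label is explicitly typed as an addition.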

Just asking.

Let us know your opinion, please.