ecotaxa / ecotaxa_front

Front end of the EcoTaxa application

Consider explicit statement of uncertainty or incompleteness in taxonomic identification #627

Open jiho opened 3 years ago

jiho commented 3 years ago

Cf https://doi.org/10.3389/fmars.2021.620702

And in particular:

INDETERMINABILIS (INDET.) The ON sign indet. is taken to mean that the taxon is indeterminable beyond a certain taxonomic level. This is relevant to image-based identifications, where diagnostic characters are often not visible or resolvable in the image, which could be owing to the resolution of the image, or the orientation of the taxon in the field of view. This ON sign can be applied at any taxonomic rank.

STETIT (STET.): This ON sign means it stood/stays or remained here, indicating the identification stopped at this taxonomic rank. Stetit can be employed for a variety of reasons such as indicating that it is a CHOICE to go no further, i.e. "I called this taxon 'Ostracoda stet.' because I did not attempt to identify the ostracods any further; I simply noted they were ostracods and stopped there".

INCERTA (INC.): The ON sign 'inc.' is used to indicate 'uncertain identification' and to replace the use of the question mark symbol '?'. In image-based identifications, this ON sign can be used at all taxonomic levels (e.g. Aristidae fam. inc.), while it is less likely to be used at higher taxonomic ranks when a physical specimen is available.

This needs to be coordinated with the support of such terms in WoRMS + EurOBIS.

grololo06 commented 3 years ago

+1 for attributes #456

jiho commented 3 years ago

Nicely formatted version of the summary https://obis.org/2021/02/12/on/

moi90 commented 2 years ago

To me, the only useful information would be "indet" (as in "the taxon is beyond doubt indeterminable beyond a certain taxonomic level") and maybe "inc". That someone decided to not go further is visible from the fact that they didn't.

Currently, these are used in category names, but I agree that it would be much better to have them as a flag. I would put this flag on the object-category relation. This way, there would also be no problem with WoRMS etc. (And this would open the opportunity to predict already validated objects further down the tree, if needed.)
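A minimal sketch of what "a flag on the object-category relation" could look like. The names `OnSign` and `Assignment` are hypothetical and not part of the EcoTaxa code base; the point is only that the category stays a plain taxon name while the ON sign travels with the assignment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OnSign(Enum):
    """Open Nomenclature qualifiers discussed in this thread."""
    INDET = "indet."
    STET = "stet."
    INC = "inc."


@dataclass(frozen=True)
class Assignment:
    """Hypothetical object-category relation carrying an optional ON sign."""
    object_id: int
    category: str                      # plain taxon name, e.g. "Copepoda"
    on_sign: Optional[OnSign] = None   # None means no explicit qualifier

    def label(self) -> str:
        """Display label, e.g. 'Copepoda indet.'."""
        if self.on_sign is None:
            return self.category
        return f"{self.category} {self.on_sign.value}"


a = Assignment(object_id=1, category="Copepoda", on_sign=OnSign.INDET)
b = Assignment(object_id=2, category="Copepoda")
print(a.label())
print(b.label())
```

Both objects share the category "Copepoda", so matching against an external taxonomy only ever sees the unqualified name.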

jiho commented 2 years ago

I think STETIT is very often the case: we could have gone further but that was more work and we did not want to do it. This can be a flag indeed or can be in the category name if WoRMS/OBIS supports it.

moi90 commented 2 years ago

> I think STETIT is very often the case: we could have gone further but that was more work and we did not want to do it. This can be a flag indeed or can be in the category name if WoRMS/OBIS supports it.

Yes, that is exactly my point. I guess, STETIT is the default for most annotations, therefore I would not make it explicit.

I really don't think that WoRMS/OBIS should support such category names. If it is decided that they should be used in EcoTaxa, they have to be handled in EcoTaxa. But I also don't think that it is necessary to handle them really. You have to keep the possibility to create sub-categories below the accepted taxa anyways. Objects in these categories are linked to the accepted taxon via the parent category.

This also circles back to the need for querying tools for EcoTaxa exports that can select "all Copepods (including sub-categories)", for example:

> Then we need a post-processing function that can aggregate counts/concentrations at higher levels (to get all Copepoda for example) but such an aggregation function is needed even for the purely taxonomic aspect anyway (e.g. I want the biomass of all Crustacea per sample). And we kind of have such functions (in R and MATLAB); we just need to package them better and make them public.

https://github.com/ecotaxa/ecotaxa/issues/456#issuecomment-691666231
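To illustrate the kind of aggregation meant here, a minimal Python sketch follows. The tree in `PARENT`, the counts, and the function name `aggregate` are made up for illustration; this is not the existing R or MATLAB code mentioned in the quote.

```python
from collections import defaultdict

# Hypothetical child -> parent links; None marks the root of this toy tree.
PARENT = {
    "Crustacea": None,
    "Copepoda": "Crustacea",
    "Calanus finmarchicus": "Copepoda",
    "Ostracoda": "Crustacea",
}


def aggregate(counts, parent=PARENT):
    """Return counts where each category also includes all its descendants."""
    totals = defaultdict(int)
    for category, n in counts.items():
        node = category
        # Walk up the tree, adding this count to every ancestor.
        while node is not None:
            totals[node] += n
            node = parent.get(node)
    return dict(totals)


# Counts as annotated, i.e. objects left at exactly that category.
counts = {"Copepoda": 10, "Calanus finmarchicus": 4, "Ostracoda": 3}
totals = aggregate(counts)
print(totals["Copepoda"])   # 14: 10 left at Copepoda + 4 sorted further
print(totals["Crustacea"])  # 17: everything below Crustacea
```

With such a roll-up, "all Copepoda" is answered the same way whether or not some objects were sorted further down the tree.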

Conceptually, the ON signs are meta-data for the assignment of an object to a category, and not sub-categories themselves. (I can see, however, that in the UI, it might be easiest to display them as sub-categories.)

Also, the ON signs do not apply to tag-like categories.

jiho commented 2 years ago

It is now confirmed that this is not WoRMS's concern. If such qualifiers are added to a category name, they will be ignored and the taxon will be matched on the taxon part of the name.
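As an illustration of matching "on the taxon part of the name", here is a rough sketch that splits a trailing ON sign off a category name. It is not WoRMS's actual matching logic; the function `split_on_sign`, the accepted spellings, and the handled rank abbreviations (`fam.`, `gen.`, `sp.`) are assumptions.

```python
import re

# Assumed spellings of each ON sign, mapped to a short canonical form.
_FORMS = {
    "indeterminabilis": "indet", "indet": "indet",
    "stetit": "stet", "stet": "stet",
    "incerta": "inc", "inc": "inc",
}

# Trailing ON sign, optionally preceded by a rank abbreviation such as "fam.".
_ON_SIGN = re.compile(
    r"\s+(?:(?:fam|gen|sp)\.\s+)?("
    + "|".join(sorted(_FORMS, key=len, reverse=True))
    + r")\.?$",
    re.IGNORECASE,
)


def split_on_sign(name):
    """Split 'Taxon [rank.] sign' into (taxon, sign); sign is None if absent."""
    m = _ON_SIGN.search(name)
    if m is None:
        return name.strip(), None
    return name[:m.start()].strip(), _FORMS[m.group(1).lower()]


print(split_on_sign("Aristidae fam. inc."))
print(split_on_sign("Copepoda STETIT"))
print(split_on_sign("Calanus finmarchicus"))
```

The taxon part is what would be sent to an external name matcher; the sign, if any, stays on the EcoTaxa side.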

So the question is whether we want to implement this in EcoTaxa, clearly as a tag added to a category in that case. There is a real difference between Copepoda INDET and Copepoda STETIT: in the first case, the work is finished, in the second it isn't. But of course INDET is not absolute: what is indeterminable beyond a certain point sometimes depends on the operator.

moi90 commented 2 years ago

> But of course INDET is not absolute: what is indeterminable beyond a certain point sometimes depends on the operator.

That is very true. My feeling is that one project mostly gets sorted by one person, so when this person cannot identify something, it is INDET. But when aggregating over multiple projects with multiple annotators, the sign loses significance. What I mean: the signs could be valuable when annotating (so that INDET objects do not appear again and again), but not so much when analyzing the data. If I want all Copepods, I don't care if some of them are classified further; if I want all Calanus finmarchicus, I gain nothing from knowing that there are some Copepods INDET or STETIT that could also belong in this category.

But that would already work with the existing category system, if queries could include all sub-categories (see above). Then "... > Copepod" could mean "Copepod STETIT" and "... > Copepod > Copepod INDET" could mean "Copepod INDET".
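A small sketch of that reading of category paths, assuming paths are written with " > " separators and that an INDET sub-category is named after its parent. The function `interpret` and the ancestor "Crustacea" in the example are hypothetical.

```python
def interpret(path):
    """Map a category path to (taxon, implied ON sign) under this convention."""
    parts = [p.strip() for p in path.split(">")]
    leaf = parts[-1]
    parent = parts[-2] if len(parts) > 1 else None
    # A leaf named "<parent> INDET" marks objects that cannot be sorted further.
    if parent is not None and leaf == f"{parent} INDET":
        return parent, "INDET"
    # Otherwise the annotator simply stopped at this category.
    return leaf, "STETIT"


print(interpret("Crustacea > Copepod"))
print(interpret("Crustacea > Copepod > Copepod INDET"))
```

Both paths resolve to the taxon "Copepod", so a query that includes sub-categories returns them together, while the sign remains recoverable for annotation workflows.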