gbif / doc-freshwater-data-publishing-guide

https://doi.org/10.35035/doc-sw3k-w725

occurrenceStatus #31

Open DeboraArlt opened 3 days ago

DeboraArlt commented 3 days ago

Feedback on https://docs.gbif-uat.org/freshwater-data-publishing-guide/en/#occurrence-datasets, Table 5, column "Comments":

The current text is misleading: "... absent should only be used if it is a true absence, e.g. effort was put into trying to detect the species and it was not detected. For example, if using targeted sampling to estimate species range, true absences can be identifed here, or if a species was previous noted at this location but was not there at the time of the sampling (potentially indicating species loss), then please indicate "absent" here."

The issue is the use of "true absence": in most cases, the true absence of a species (or organism) cannot be identified. A survey can only look for a species and either detect or not detect it. It is the detection status that can be recorded in the data, not a true absence. About true absence we can only make probabilistic statements, after inference based on an analysis of the detection data.
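To illustrate what such a probabilistic statement can look like, here is a minimal sketch using Bayes' rule. It is only one simple way to do the inference, and it rests on assumptions that are not part of the guide: a prior probability `psi` that the species occupies the site, a per-visit detection probability `p`, independent visits, and no false detections. The function name and the example values are hypothetical.

```python
def prob_present_given_no_detection(psi: float, p: float, n_visits: int) -> float:
    """Probability that the species is present at a site even though
    n_visits surveys all failed to detect it (Bayes' rule).

    psi      -- assumed prior probability that the site is occupied
    p        -- assumed probability of detecting the species on one visit,
                given that it is present
    n_visits -- number of surveys, all without a detection
    """
    if not (0.0 <= psi <= 1.0 and 0.0 <= p <= 1.0) or n_visits < 0:
        raise ValueError("psi and p must be in [0, 1]; n_visits must be >= 0")
    missed = psi * (1.0 - p) ** n_visits  # present, but missed on every visit
    not_there = 1.0 - psi                 # site not occupied at all
    if missed + not_there == 0.0:
        raise ValueError("non-detection is impossible under these inputs")
    return missed / (missed + not_there)


# Illustrative values only: more visits without a detection lower the
# probability of presence, but it never reaches exactly zero while p < 1.
for n in (1, 3, 5):
    print(n, round(prob_present_given_no_detection(psi=0.5, p=0.5, n_visits=n), 3))
```

With these made-up inputs, a single non-detection still leaves a substantial probability that the species is present, which is why a recorded "absent" is better read as "not detected".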

The underlying issue is the ambiguous definition of occurrenceStatus, "Statement about the presence or absence of a taxon at a location", since it is not clear what is meant by "absence of a taxon at a location". I think this should be more clearly defined as "absence of a detection of a taxon at a location". Often, as in the comment in Table 5, it is used misleadingly and potentially wrongly interpreted as the taxon not being there (while it was only not detected).

We are a small working group working on definitions for concepts related to "absences".

My suggestion would be to reformulate the comment as: "... the value 'absent' should be used here to record that the sampling did not detect the species, i.e. effort was put into trying to detect the species and it was not detected. For example, if using targeted sampling to estimate species range, non-detections can be identified here and used to estimate species range using a chosen model for inference, or if a species was previously noted at this location but was not there at the time of the sampling (potentially indicating species loss), then please indicate "absent" here."
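As a rough sketch of how the reformulated comment could translate into data, the following maps a survey's detection outcome onto a record. The keys are existing Darwin Core terms (scientificName, eventDate, occurrenceStatus, samplingProtocol, samplingEffort, occurrenceRemarks); the function name, the wording of the remark, and the example values are hypothetical and not taken from the guide.

```python
def to_occurrence_record(taxon: str, event_date: str, detected: bool,
                         protocol: str, effort: str) -> dict:
    """Build one occurrence record from a survey outcome.

    'absent' is written only to say that the survey did not detect the
    taxon; the protocol and effort are kept so that later inference can
    judge how informative that non-detection is.
    """
    return {
        "scientificName": taxon,
        "eventDate": event_date,
        "occurrenceStatus": "present" if detected else "absent",
        "samplingProtocol": protocol,
        "samplingEffort": effort,
        # Hypothetical remark making the non-detection meaning explicit.
        "occurrenceRemarks": "" if detected
        else "Not detected during this survey; absence is not confirmed.",
    }


# Made-up example values for illustration only.
record = to_occurrence_record(
    taxon="Salmo trutta",
    event_date="2023-06-15",
    detected=False,
    protocol="electrofishing",
    effort="2 passes, 100 m reach",
)
print(record["occurrenceStatus"])
```

Keeping samplingProtocol and samplingEffort next to each "absent" value is what lets a later analysis treat the record as a non-detection with known effort.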