peterdesmet closed this issue 5 years ago
Thanks @peterdesmet. I was already doing steps 1 to 5. About point 6:
Remove genera and above (i.e. no species info, is the case for RINSE)
If some checklists provide no species information, we start counting from the rank they do provide (genus in the case of RINSE, as no species info is provided). Unfortunately, I don't understand what you mean by "remove genera and above".
About 7th point:
Group by accepted taxon?? So we don’t count synonyms of same species.
I agree with you. Since we are working from a unified checklist (hypothetical at the moment), the problem of two synonyms of the same species coming from two different checklists with different distributions or descriptions will never occur. It will be tackled when unifying the checklists.
About 8:
Group by species?? So we don’t count infraspecific taxa
Yes, for sure. No `group_by` for subspecies.
But how do we tackle the case shown here below?
rank | taxonKey | scientificName | speciesKey | first_observed | last_observed | ... |
---|---|---|---|---|---|---|
SUBSPECIES | 111 | subspecies1 | 200 | 2006 | 2010 | ... |
SUBSPECIES | 112 | subspecies2 | 200 | 2008 | 2015 | ... |
My idea is to take the minimum of `first_observed` and the maximum of `last_observed` if the two periods overlap.
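The idea above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the function name `merge_periods` and the dict-based rows are hypothetical, and the thread does not say what to do with periods that do not overlap, so this sketch simply keeps them as separate periods (an assumption).

```python
from collections import defaultdict


def merge_periods(rows):
    """Collapse infraspecific rows to observation periods per speciesKey.

    Overlapping periods are merged into (min first_observed, max last_observed).
    ASSUMPTION: non-overlapping periods are kept separate, since the thread
    leaves that case open.
    """
    by_species = defaultdict(list)
    for row in rows:
        by_species[row["speciesKey"]].append(
            (row["first_observed"], row["last_observed"])
        )

    merged = {}
    for species_key, periods in by_species.items():
        periods.sort()  # order by first_observed
        result = [periods[0]]
        for first, last in periods[1:]:
            prev_first, prev_last = result[-1]
            if first <= prev_last:
                # Periods overlap: extend the previous one.
                result[-1] = (prev_first, max(prev_last, last))
            else:
                result.append((first, last))
        merged[species_key] = result
    return merged


# The two subspecies rows from the table above.
rows = [
    {"taxonKey": 111, "speciesKey": 200, "first_observed": 2006, "last_observed": 2010},
    {"taxonKey": 112, "speciesKey": 200, "first_observed": 2008, "last_observed": 2015},
]
merged = merge_periods(rows)
print(merged)
```

For the two rows in the table, species 200 ends up with a single period running from 2006 to 2015.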
About 3rd filter:
occurrenceStatus = present
@timadriaens and @qgroom: should we filter by `occurrenceStatus = present` as @peterdesmet wrote, or should we also consider `occurrenceStatus == doubtful`? For example, in the Manual of Alien Plants in Belgium there are 31 taxa with `occurrenceStatus == doubtful` from 28 different species:
species |
---|
Elatine alsinastrum |
Myricaria germanica |
Centaurea alba |
Pilosella brachiata |
Potentilla argentea x inclinata |
Prunus fruticans |
Linaria simplex |
Triticum monococcum |
Triticum turgidum |
Triticum aestivum |
Sporobolus virginicus |
Hornungia procumbens |
Chenopodium preissmannii |
Amsinckia intermedia |
Epilobium novae-civitatis |
Epilobium interjectum |
Picea abies |
Pinus sylvestris |
Pinus rigida |
Pinus pinaster |
Galium rubioides |
Verbascum interjectum |
Hemerocallis lilioasphodelus |
Symphoricarpos microphyllus x orbiculatus |
Cerastium arvense subsp. arvense x tomentosum |
Cherleria laricifolia |
Narcissus pseudonarcissus |
Narcissus incomparabilis |
Or do we relax this constraint even further and take into account all taxa with `occurrenceStatus != absent` (not `absent`)? Click here to learn more about the meaning of these terms. By the way, up to now I have encountered only taxa with status `present` or `doubtful`.
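To make the three options concrete, here is a minimal Python sketch of the filter with a switchable policy. The function name `filter_by_status`, the policy labels, and the placeholder taxa are hypothetical; only the `occurrenceStatus` field and its values `present`, `doubtful`, and `absent` come from the discussion.

```python
def filter_by_status(taxa, policy="present"):
    """Keep taxa whose occurrenceStatus satisfies the chosen policy.

    Policies (labels are hypothetical, chosen for this sketch):
      "present"             -> occurrenceStatus == "present"
      "present_or_doubtful" -> occurrenceStatus in {"present", "doubtful"}
      "not_absent"          -> occurrenceStatus != "absent"
    """
    policies = {
        "present": lambda status: status == "present",
        "present_or_doubtful": lambda status: status in {"present", "doubtful"},
        "not_absent": lambda status: status != "absent",
    }
    keep = policies[policy]
    return [taxon for taxon in taxa if keep(taxon["occurrenceStatus"])]


# Placeholder records, not real checklist data.
taxa = [
    {"scientificName": "taxon A", "occurrenceStatus": "present"},
    {"scientificName": "taxon B", "occurrenceStatus": "doubtful"},
    {"scientificName": "taxon C", "occurrenceStatus": "absent"},
    {"scientificName": "taxon D", "occurrenceStatus": "excluded"},
]

strict = filter_by_status(taxa, "present")
relaxed = filter_by_status(taxa, "present_or_doubtful")
most_relaxed = filter_by_status(taxa, "not_absent")
print(len(strict), len(relaxed), len(most_relaxed))
```

As long as the checklists only contain `present` and `doubtful`, the last two policies select the same taxa; they only diverge once some other status value (such as the placeholder `excluded` above) appears.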
Another consideration about 7th point:
There are 363 synonyms in the checklists Manual of Alien Plants, alien macroinvertebrates and non-native freshwater fishes, 353 of which are in the Manual of Alien Plants alone. After filtering for the IAS in Belgium, as explained above, 359 are left. It would not be a problem at all if their `accepted` (and corresponding `acceptedKey`) pointed to taxa not present in these checklists. However, there is a group of 27 taxa whose `acceptedKey` is present as `key`: this means that these species could be counted twice. It is a small group, but we should tackle this problem. The question is: should it be tackled while unifying the checklists or while building the indicators?
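The double-counting risk can be checked with a short Python sketch. It is only an illustration: `key` and `acceptedKey` are the fields named above, while the `taxonomicStatus` field, the function names, and the sample records are assumptions made for this example.

```python
def synonyms_with_accepted_in_checklist(taxa):
    """Return synonyms whose acceptedKey also appears as a key in the same data."""
    keys = {taxon["key"] for taxon in taxa}
    return [
        taxon
        for taxon in taxa
        # ASSUMPTION: synonyms are flagged via a taxonomicStatus field.
        if taxon["taxonomicStatus"] == "SYNONYM" and taxon["acceptedKey"] in keys
    ]


def count_accepted_taxa(taxa):
    """Count taxa after mapping every synonym to its accepted taxon."""
    accepted_keys = {
        taxon["acceptedKey"] if taxon["taxonomicStatus"] == "SYNONYM" else taxon["key"]
        for taxon in taxa
    }
    return len(accepted_keys)


# Placeholder records, not real checklist data.
taxa = [
    {"key": 1, "taxonomicStatus": "ACCEPTED", "acceptedKey": None},
    {"key": 2, "taxonomicStatus": "SYNONYM", "acceptedKey": 1},   # accepted taxon is listed too
    {"key": 3, "taxonomicStatus": "SYNONYM", "acceptedKey": 99},  # accepted taxon is not listed
]

duplicates = synonyms_with_accepted_in_checklist(taxa)
print([taxon["key"] for taxon in duplicates])
print(len(taxa), count_accepted_taxa(taxa))
```

In this toy data a naive row count gives 3 taxa, while grouping by accepted taxon gives 2, because the synonym with key 2 and its accepted taxon with key 1 collapse into one.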
Some of your doubtful species are certainly present, e.g. Pinus sylvestris and Narcissus pseudonarcissus, so perhaps it is doubtful that they are naturalized. For P. sylvestris the checklist says it is doubtful only for Brussels. He must mean it is doubtfully naturalized, because I'm 100% sure it is present.
Picea abies, Pinus rigida and Pinus pinaster are probably similar cases. They are planted, but it is doubtful if they are escaping.
For N. pseudonarcissus the checklist is only referring to Narcissus pseudonarcissus L. subsp. major, not N. pseudonarcissus in general. It is an important distinction, as N. pseudonarcissus is a native species, but this subspecies is not.
The rest are probably genuinely doubtful species. That is, they are doubtfully present.
@damianooldoni regarding synonyms for which the accepted taxon is ALSO in the checklist, we'll have to take the following approach in building the unified checklist:
verifiedKey
verifiedKey
Merging that information is no different than merging information for the same (accepted) taxon appearing on two checklists... we just haven't decided yet HOW we will merge that info. 😄
Based on developments of the unified checklist, we can be sure we group by `verificationKey`. We can then close this issue.
This issue describes how data will be filtered to get to the data frame described in #17. Some of these will be tackled in the unified checklist.
Filters: