Closed Calamanthus closed 1 month ago
Another, slightly different example: Gypsophila australis was identified by red as having 100% of its range in the current study area, and it is accepted by the GBIF taxonomic backbone, but Flora of Australia and ALA regard it as a synonym of Gypsophila tubulosa, which is identified in bdbsa as a weed. GBIF does not recognise G. tubulosa and assigns it to the family level, so it cannot be a straight taxonomy fix. It may be easiest to just attribute this to genus (which is accepted) and effectively remove it from the data, as I can think of lots of problems with other potential solutions. In this case it won't matter if the taxon is lost, as it is a weed, but I just hope there isn't a similar example for an indigenous species. I hate this stuff! I'll start compiling these examples into a fixes table.
I've added in a taxonomy_overrides argument to make_taxonomy. It follows a similar form to taxonomy_fixes but is implemented differently (via left_join) due to the string-to-find and the string-to-replace being in different columns. However, another tack-on fix at this point highlights the precarious nature of our current taxonomy workflow. I've changed the appropriate code in envPIA, but envClean::taxonomy_overrides currently only deals with the Hooded Plover issues. I'll leave this open until it has been tested more thoroughly.
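For illustration only, here is a minimal sketch of how a left_join-based override could work. It is not the actual make_taxonomy code: the column names (original_name, taxa, taxa_override) and the example rows are assumptions, and the replacement name for the Hooded Plover is just a placeholder for whatever the overrides table ends up holding.

```r
library(dplyr)

# Hypothetical lookup of original names and the taxa the backbone returned
lutaxa <- data.frame(
  original_name = c("Thinornis rubricollis", "Phalaropus lobatus", "Gypsophila australis"),
  taxa = c("Phalaropus lobatus", "Phalaropus lobatus", "Gypsophila australis"),
  stringsAsFactors = FALSE
)

# Hypothetical overrides: string to find and string to use are in different columns
taxonomy_overrides <- data.frame(
  original_name = c("Thinornis rubricollis", "Gypsophila australis"),
  taxa_override = c("Thinornis cucullatus", "Gypsophila"), # illustrative values
  stringsAsFactors = FALSE
)

# Join on the original name, then prefer the override wherever one exists
fixed <- lutaxa %>%
  left_join(taxonomy_overrides, by = "original_name") %>%
  mutate(taxa = coalesce(taxa_override, taxa)) %>%
  select(-taxa_override)

print(fixed)
```

Keying the join on the original name rather than the returned taxa means genuine Phalaropus lobatus records are left alone while the misallocated Hooded Plover records are redirected.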
Great, thanks. I'll add the other cases above. Where is envClean::taxonomy_overrides? I can't see it under ~/packages/envClean...
It's in the taxonomy_fixes.R file (in data-raw).
Ok, I was thinking it was its own script
Just looking at the code... if this is just changing the taxa in the lutaxa result, then the best key will presumably no longer be relevant to the updated taxa. Do we want an override field in lutaxa to flag that the GBIF taxonomy has been changed and the best key is no longer relevant?
The link to the taxonomic hierarchy to use (in taxa$taxonomy) is via the taxa column. The key wasn't being used. I've now removed the key from lutaxa output. I've tried to think of a way to make these fixes earlier so that the correct other attributes (e.g. the status, matchtype and rank) come through, but haven't been able to work out how to implement it. So yes, perhaps worth flagging that those fields may be incorrect in the lutaxa results if a 'fix' has been made.
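As a rough sketch of what such a flag might look like (nothing here is implemented in envClean; the override column and all example values are hypothetical), the overridden rows could be marked and their stale attributes blanked so they are not mistaken for attributes of the replacement taxa:

```r
library(dplyr)

# Hypothetical lutaxa result after overrides have been applied
lutaxa <- data.frame(
  original_name = c("Thinornis rubricollis", "Phalaropus lobatus"),
  taxa = c("Thinornis cucullatus", "Phalaropus lobatus"),
  status = c("SYNONYM", "ACCEPTED"),   # illustrative values only
  matchtype = c("EXACT", "EXACT"),
  rank = c("SPECIES", "SPECIES"),
  stringsAsFactors = FALSE
)

# Names that were changed by an override (hypothetical)
overridden_names <- c("Thinornis rubricollis")

flagged <- lutaxa %>%
  # Flag rows where the backbone result was replaced
  mutate(override = original_name %in% overridden_names) %>%
  # Optionally blank attributes that still describe the replaced match
  mutate(across(
    all_of(c("status", "matchtype", "rank")),
    ~ if_else(override, NA_character_, .x)
  ))

print(flagged)
```

Blanking is only one option. Keeping the original values alongside the flag would preserve a record of what the backbone returned.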
Closing as this has moved on and is implemented (differently) in the current version of make_taxonomy that calls galah::search_taxa instead of rgbif::name_backbone_checklist
I went to close this last week for the same reason, but left it open, as the overrides for galah are still not working properly. I will start a new issue for that when I have a chance.
We need some sort of override to deal with cases where the gbif taxonomic backbone is allocating the wrong taxonomy.
This includes the Hooded Plover (old name) being allocated to Red-necked Phalarope:
rgbif::name_backbone_checklist(c("Thinornis rubricollis","Phalaropus lobatus"))
A tibble: 2 × 25
usageKey acceptedUsageKey scientificName canonicalName rank status confidence matchType kingdom phylum order family genus species kingdomKey phylumKey