ropensci / taxadb

:package: Taxonomic Database
https://docs.ropensci.org/taxadb

`clean_names()` does not remove 'spp.' epithet #120

Closed mattiaghilardi closed 4 months ago

mattiaghilardi commented 5 months ago

Hello,

clean_names() removes some species epithet designations, such as sp., sp1, and sps, but not spp., which indicates several species and which I commonly find when working with fish survey data.

taxadb::clean_names(c("Homo sp.", "Homo sp", "Homo sp1", "Homo sps", "Homo spp.", "Homo spp"))
#> [1] "homo"     "homo"     "homo"     "homo"     "homo spp" "homo spp"

Created on 2024-04-10 with reprex v2.1.0

Would you consider adding this feature?

This modification to the internal drop_sp.() should do it:

drop_sp. <- function(x){
  # drop: cladophora sp2, cladophora sp., cladophora sps, cladophora sp
  x <- stringi::stri_replace_all_regex(x, "\\ssp[s\\d\\.]?$", "")
  # drop: cladophora spp, cladophora spp.
  stringi::stri_replace_all_regex(x, "\\sspp[\\.]?$", "")
}
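
For comparison, the two substitutions above could probably be folded into a single pattern. The following is only a sketch in base R, not the package's implementation: the name drop_sp_combined is hypothetical, and it uses sub() with perl = TRUE instead of stringi so that it runs without extra packages. It should agree with the two-step version when a name carries a single trailing designation; a name ending in something like "spp sp." would be stripped once here but twice by the chained calls.

```r
# Hypothetical helper (not part of taxadb): drop one trailing
# "sp", "sp.", "sps", "sp<digit>", "spp" or "spp." designation.
drop_sp_combined <- function(x) {
  # \\s           : whitespace before the designation
  # spp\\.?       : "spp" with an optional full stop
  # sp[s\\d.]?    : "sp" optionally followed by "s", one digit, or "."
  # $             : only at the end of the name
  sub("\\s(?:spp\\.?|sp[s\\d.]?)$", "", x, perl = TRUE)
}

x <- c("Homo sp.", "Homo sp", "Homo sp1", "Homo sps", "Homo spp.", "Homo spp")
print(drop_sp_combined(x))
#> [1] "Homo" "Homo" "Homo" "Homo" "Homo" "Homo"

# A full binomial is left untouched because "sapiens" does not match.
print(drop_sp_combined("Homo sapiens"))
#> [1] "Homo sapiens"
```

Keeping two separate calls, as proposed above, is arguably easier to read and to extend; the single pattern mainly saves one pass over the vector.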
cboettig commented 4 months ago

Thanks, nice catch. That's a great idea. A PR would be welcome.