ropensci / rfishbase

R interface to the fishbase.org database
https://docs.ropensci.org/rfishbase

`left_join` instead of `right_join` to avoid reordering of `validate_names()` output #205

Closed: oharac closed this 3 years ago

oharac commented 3 years ago

Description

In `validate_names()` (in synonyms.R), I replaced `right_join(tmp, by = 'synonym')` with `left_join(x = tmp, y = ., by = 'synonym')` so that the output keeps the order of the input species list. I also added the `dplyr::` prefix to some of the function calls and changed `unique()` to `distinct()`, which I believe is more appropriate for data frames in the tidyverse.
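The ordering effect can be sketched on toy data, independent of FishBase. This is only an illustration: `input` and `lookup` are hypothetical stand-ins for `tmp` and the `synonyms()` result, and the exact row order returned by `right_join()` may depend on the installed dplyr version.

```r
library(dplyr)

# Hypothetical toy data, not FishBase output.
# `input` plays the role of `tmp`; `lookup` plays the role of the synonyms() result.
input  <- data.frame(synonym = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
lookup <- data.frame(synonym = c("a", "b", "d"),
                     Species = c("A", "B", "D"),
                     stringsAsFactors = FALSE)

# Old pattern: the lookup is the left-hand table of the join, so the
# unmatched input row ("c") is not guaranteed to stay in its input position.
right <- lookup %>%
  right_join(input, by = "synonym") %>%
  pull(Species)

# New pattern: `input` is the left-hand table, so the output rows follow
# the input order and the unmatched "c" yields NA in place.
left <- lookup %>%
  left_join(x = input, y = ., by = "synonym") %>%
  pull(Species)

print(data.frame(input = input$synonym, right_join = right, left_join = left))
```

Because `.` is passed explicitly as `y`, the pipe does not also insert the left-hand side as the first argument, which is what lets `tmp` take the `x` position in the actual change.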

Related Issue

Fixes #204

Example

library(rfishbase)

spp <- c("Acanthogobius lactipes",
         "Acanthogobius luridus",
         "Acanthogobius stigmothonus",
         "Acantholabrus palloni",
         "Acanthopsetta nadeshnyi",
         "Acanthurus achilles")

### old version
val_names <- validate_names(spp)
data.frame(spp, val_names)
#>                          spp               val_names
#> 1     Acanthogobius lactipes  Acanthogobius lactipes
#> 2      Acanthogobius luridus   Acanthogobius luridus
#> 3 Acanthogobius stigmothonus   Acantholabrus palloni
#> 4      Acantholabrus palloni Acanthopsetta nadeshnyi
#> 5    Acanthopsetta nadeshnyi     Acanthurus achilles
#> 6        Acanthurus achilles                    <NA>

### redefine function:
validate_names <- function(species_list,
                           server = getOption("FISHBASE_API", "fishbase"),
                           version = rfishbase:::get_latest_release(),
                           db = rfishbase:::default_db(),
                           ...){
  rx <- "^[sS]ynonym$|^accepted name$"
  tmp <- data.frame(synonym  = species_list, stringsAsFactors = FALSE)
  synonyms(species_list, server = server, version = version, db = db) %>%
    dplyr::collect() %>%
    dplyr::mutate(Species = ifelse(grepl(rx, Status), Species, NA)) %>%
    # group by input taxon, remove NAs
    dplyr::group_by(synonym) %>%
    dplyr::filter(!is.na(Species)) %>%
    dplyr::ungroup() %>%
    dplyr::select(synonym, Species) %>% 
    dplyr::distinct() %>%
    # left_join to tmp to preserve species order from tmp
    dplyr::left_join(x = tmp, y = ., by = "synonym") %>% 
    dplyr::pull(Species)
}
### new version
val_names <- validate_names(spp)
data.frame(spp, val_names)
#>                          spp               val_names
#> 1     Acanthogobius lactipes  Acanthogobius lactipes
#> 2      Acanthogobius luridus   Acanthogobius luridus
#> 3 Acanthogobius stigmothonus                    <NA>
#> 4      Acantholabrus palloni   Acantholabrus palloni
#> 5    Acanthopsetta nadeshnyi Acanthopsetta nadeshnyi
#> 6        Acanthurus achilles     Acanthurus achilles

Created on 2021-04-07 by the reprex package (v1.0.0)

cboettig commented 3 years ago

:pray: Thanks!