ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

Fail gracefully if species scientific name contains "/" #810

Closed (olliewearn closed this issue 4 years ago)

olliewearn commented 4 years ago

Hi, here in Indochina we commonly use the scientific name "Muntiacus rooseveltorum/truongsonensis" for a muntjac species of uncertain taxonomic status (possibly 1+ species, we don't yet know). M. rooseveltorum and M. truongsonensis are listed separately on the IUCN Red List, but this is widely thought to be an incomplete or erroneous representation of reality.

I imagine there might be other cases of this shorthand using a forward slash for other species or species clusters of uncertain taxonomic status.

In any case, using a forward slash in a scientific name raises: Error: Not Found (HTTP 404).

Can scientific names written like this be made to fail safely?

Thanks Ollie

Session Info

```r
- Session info ---------------------------------------------------------------
 setting  value
 version  R version 3.6.1 (2019-07-05)
 os       Windows 10 x64
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United Kingdom.1252
 ctype    English_United Kingdom.1252
 tz       Asia/Bangkok
 date     2020-03-26

- Packages -------------------------------------------------------------------
 package     * version  date       lib source
 abind         1.4-5    2016-07-21 [1] CRAN (R 3.6.0)
 ape           5.3      2019-03-17 [1] CRAN (R 3.6.1)
 assertthat    0.2.1    2019-03-21 [1] CRAN (R 3.6.1)
 backports     1.1.4    2019-04-10 [1] CRAN (R 3.6.0)
 bold          0.9.0    2019-06-27 [1] CRAN (R 3.6.1)
 callr         3.3.2    2019-09-22 [1] CRAN (R 3.6.1)
 camtrapR    * 1.2.4    2020-03-08 [1] Github (jniedballa/camtrapR@91aea37)
 cli           1.1.0    2019-03-19 [1] CRAN (R 3.6.1)
 codetools     0.2-16   2018-12-24 [1] CRAN (R 3.6.1)
 colorspace    1.4-1    2019-03-18 [1] CRAN (R 3.6.1)
 crayon        1.3.4    2017-09-16 [1] CRAN (R 3.6.1)
 crul          0.8.4    2019-08-02 [1] CRAN (R 3.6.1)
 curl          4.0      2019-07-22 [1] CRAN (R 3.6.1)
 data.table    1.12.2   2019-04-07 [1] CRAN (R 3.6.1)
 desc          1.2.0    2018-05-01 [1] CRAN (R 3.6.1)
 devtools    * 2.2.1    2019-09-24 [1] CRAN (R 3.6.2)
 digest        0.6.20   2019-07-04 [1] CRAN (R 3.6.1)
 dplyr         0.8.3    2019-07-04 [1] CRAN (R 3.6.1)
 DT            0.9      2019-09-17 [1] CRAN (R 3.6.1)
 editData    * 0.1.2    2017-10-07 [1] CRAN (R 3.6.3)
 ellipsis      0.3.0    2019-09-20 [1] CRAN (R 3.6.1)
 fastmap       1.0.1    2019-10-08 [1] CRAN (R 3.6.1)
 foreach       1.4.7    2019-07-27 [1] CRAN (R 3.6.1)
 fs            1.3.1    2019-05-06 [1] CRAN (R 3.6.1)
 ggplot2     * 3.2.1    2019-08-10 [1] CRAN (R 3.6.1)
 glue          1.3.1    2019-03-12 [1] CRAN (R 3.6.1)
 gtable        0.3.0    2019-03-25 [1] CRAN (R 3.6.1)
 htmltools     0.4.0    2019-10-04 [1] CRAN (R 3.6.1)
 htmlwidgets   1.5.1    2019-10-08 [1] CRAN (R 3.6.1)
 httpcode      0.2.0    2016-11-14 [1] CRAN (R 3.6.0)
 httpuv        1.5.2    2019-09-11 [1] CRAN (R 3.6.1)
 iNEXT       * 2.0.19   2020-01-21 [1] Github (JohnsonHsieh/iNEXT@8831b2b)
 iterators     1.0.12   2019-07-26 [1] CRAN (R 3.6.1)
 jsonlite      1.6      2018-12-07 [1] CRAN (R 3.6.1)
 later         1.0.0    2019-10-04 [1] CRAN (R 3.6.1)
 lattice       0.20-38  2018-11-04 [1] CRAN (R 3.6.1)
 lazyeval      0.2.2    2019-03-15 [1] CRAN (R 3.6.1)
 magrittr      1.5      2014-11-22 [1] CRAN (R 3.6.1)
 MASS          7.3-51.4 2019-03-31 [1] CRAN (R 3.6.1)
 Matrix        1.2-17   2019-03-22 [1] CRAN (R 3.6.1)
 memoise       1.1.0    2017-04-21 [1] CRAN (R 3.6.1)
 mgcv          1.8-28   2019-03-21 [1] CRAN (R 3.6.1)
 mime          0.7      2019-06-11 [1] CRAN (R 3.6.0)
 miniUI        0.1.1.1  2018-05-18 [1] CRAN (R 3.6.3)
 munsell       0.5.0    2018-06-12 [1] CRAN (R 3.6.1)
 nlme          3.1-140  2019-05-12 [1] CRAN (R 3.6.1)
 overlap       0.3.2    2018-05-03 [1] CRAN (R 3.6.1)
 pillar        1.4.2    2019-06-29 [1] CRAN (R 3.6.1)
 pkgbuild      1.0.6    2019-10-09 [1] CRAN (R 3.6.1)
 pkgconfig     2.0.2    2018-08-16 [1] CRAN (R 3.6.1)
 pkgload       1.0.2    2018-10-29 [1] CRAN (R 3.6.1)
 plyr          1.8.4    2016-06-08 [1] CRAN (R 3.6.1)
 prettyunits   1.0.2    2015-07-13 [1] CRAN (R 3.6.1)
 processx      3.4.1    2019-07-18 [1] CRAN (R 3.6.1)
 promises      1.1.0    2019-10-04 [1] CRAN (R 3.6.1)
 ps            1.3.0    2018-12-21 [1] CRAN (R 3.6.1)
 purrr         0.3.2    2019-03-15 [1] CRAN (R 3.6.1)
 R6            2.4.0    2019-02-14 [1] CRAN (R 3.6.1)
 raster        3.0-7    2019-09-24 [1] CRAN (R 3.6.1)
 Rcpp          1.0.2    2019-07-25 [1] CRAN (R 3.6.1)
 remotes       2.1.0    2019-06-24 [1] CRAN (R 3.6.1)
 reshape     * 0.8.8    2018-10-23 [1] CRAN (R 3.6.1)
 reshape2      1.4.3    2017-12-11 [1] CRAN (R 3.6.1)
 rgdal       * 1.4-4    2019-05-29 [1] CRAN (R 3.6.1)
 rlang         0.4.0    2019-06-25 [1] CRAN (R 3.6.1)
 rprojroot     1.3-2    2018-01-03 [1] CRAN (R 3.6.1)
 rredlist      0.6.0    2020-01-28 [1] CRAN (R 3.6.3)
 rstudioapi    0.10     2019-03-19 [1] CRAN (R 3.6.1)
 scales        1.0.0    2018-08-09 [1] CRAN (R 3.6.1)
 secr          3.2.1    2019-06-03 [1] CRAN (R 3.6.1)
 sessioninfo   1.1.1    2018-11-05 [1] CRAN (R 3.6.1)
 shiny         1.4.0    2019-10-10 [1] CRAN (R 3.6.1)
 sp          * 1.3-1    2018-06-05 [1] CRAN (R 3.6.1)
 stringi       1.4.3    2019-03-12 [1] CRAN (R 3.6.0)
 stringr       1.4.0    2019-02-10 [1] CRAN (R 3.6.1)
 taxize      * 0.9.92   2020-02-12 [1] CRAN (R 3.6.3)
 testthat      2.2.1    2019-07-25 [1] CRAN (R 3.6.1)
 tibble        2.1.3    2019-06-06 [1] CRAN (R 3.6.1)
 tidyselect    0.2.5    2018-10-11 [1] CRAN (R 3.6.1)
 triebeard     0.3.0    2016-08-04 [1] CRAN (R 3.6.1)
 urltools      1.7.3    2019-04-14 [1] CRAN (R 3.6.1)
 usethis     * 1.5.1    2019-07-04 [1] CRAN (R 3.6.1)
 withr         2.1.2    2018-03-15 [1] CRAN (R 3.6.1)
 xml2          1.2.2    2019-08-09 [1] CRAN (R 3.6.1)
 xtable        1.8-4    2019-04-21 [1] CRAN (R 3.6.1)
 zoo           1.8-6    2019-05-28 [1] CRAN (R 3.6.1)

[1] C:/Program Files/R/R-3.6.1/library
```

sckott commented 4 years ago

Thanks @olliewearn. Can you share your actual session info, though? The session info above seems to be empty. Also, please share which taxize function or functions you are talking about.

olliewearn commented 4 years ago

Ah, sorry. I've edited my comment above to include the session info.

I got the error when I ran this: iucn_summary("Muntiacus rooseveltorum/truongsonensis")

Thanks
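
Since M. rooseveltorum and M. truongsonensis are listed separately on the IUCN Red List, one possible workaround is to expand the slash shorthand into separate binomials before querying. The sketch below uses base R only; `expand_slash_name()` is a hypothetical helper, not part of taxize, and whether querying the two names separately is taxonomically appropriate is a judgment call for the user.

```r
# Hypothetical helper: expand "Genus epithetA/epithetB" into one binomial
# per epithet. Names that do not match this pattern are returned unchanged.
expand_slash_name <- function(name) {
  parts <- strsplit(trimws(name), "\\s+")[[1]]
  # Only handle the simple two-word case with a slash in the second word
  if (length(parts) != 2 || !grepl("/", parts[2], fixed = TRUE)) {
    return(name)
  }
  epithets <- strsplit(parts[2], "/", fixed = TRUE)[[1]]
  paste(parts[1], epithets)
}

candidates <- expand_slash_name("Muntiacus rooseveltorum/truongsonensis")
print(candidates)
```

Each element of `candidates` could then be passed to the lookup function on its own.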

sckott commented 4 years ago

Thanks, I looked into it. In a way, the error you got is, to my mind, the correct one: the taxon was not found after searching IUCN. If I go to the IUCN Red List website and search for that taxon name with the slash, there are no results. Thus, it makes sense that we get no results here as well.

You can always "escape" characters like the slash, e.g.,

x <- "Muntiacus rooseveltorum/truongsonensis"
iucn_summary(curl::curl_escape(x))

This avoids the 404 error but returns no results.
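
To illustrate why escaping changes the outcome, here is a minimal sketch of what happens to a request URL when a name with a slash is used as a path segment. It uses base R's `utils::URLencode()` instead of `curl::curl_escape()` so that it runs without extra packages; both percent-encode the slash. The base URL and the `build_path()` helper are hypothetical, and the idea that the name ends up in the URL path is an assumption, not something confirmed in this thread.

```r
# Hypothetical helper: build a request URL with the taxon name as the
# last path segment.
build_path <- function(name, escape = TRUE) {
  base <- "https://example.org/api/species"  # placeholder, not the real IUCN endpoint
  if (escape) {
    # reserved = TRUE also encodes "/" (as %2F), not only spaces
    name <- utils::URLencode(name, reserved = TRUE)
  }
  paste(base, name, sep = "/")
}

x <- "Muntiacus rooseveltorum/truongsonensis"

# Unescaped: the slash reads as a path separator, adding an extra segment
raw_url <- build_path(x, escape = FALSE)

# Escaped: the whole name stays inside a single path segment
escaped_url <- build_path(x, escape = TRUE)
print(escaped_url)
```

Under that assumption, the unescaped form would point at a route that does not exist, which is consistent with a 404, while the escaped form is a valid lookup for a name that simply has no match.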

olliewearn commented 4 years ago

I should've explained the context a bit better. I'm running iucn_summary over a list of 1000+ species (I imagine these kinds of bulk queries are common). I would've hoped that iucn_summary() would fail gracefully (i.e. give a Warning, not an Error) in any cases where it doesn't get a hit. It does this for species without slashes in the name, but doesn't for names with slashes.

> iucn_summary("Doesn't exist")
==  1 queries  ===============

Retrieving data for taxon 'Doesn't exist'

x  Not Found:  Doesn't exist
==  Results  =================

* Total: 1 
* Found: 0 
* Not Found: 1
$`Doesn't exist`
$`Doesn't exist`$status
[1] NA

$`Doesn't exist`$history
[1] NA

$`Doesn't exist`$distr
[1] NA

$`Doesn't exist`$trend
[1] NA

attr(,"class")
[1] "iucn_summary"
Warning message:
In iucn_summary.character("Doesn't exist") :
  taxa 'Doesn't exist' not found!
 Returning NAs!
> iucn_summary("Doesn't exist/exist")
==  1 queries  ===============

Retrieving data for taxon 'Doesn't exist/exist'

Error: Not Found (HTTP 404)
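
For bulk queries, the behaviour asked for here (a warning and NAs instead of a hard stop) can be approximated on the caller's side with `tryCatch()`. This is a minimal sketch, not how taxize handles it internally: `safe_lookup()` and `fake_lookup()` are hypothetical names, and `fake_lookup()` only stands in for a remote call so the example runs offline.

```r
# Hypothetical wrapper: call a lookup function once per name and turn any
# error into a warning plus an NA result, so one bad name cannot stop the batch.
safe_lookup <- function(names, lookup) {
  out <- lapply(names, function(nm) {
    tryCatch(
      lookup(nm),
      error = function(e) {
        warning(
          sprintf("taxon '%s' failed: %s; returning NA", nm, conditionMessage(e)),
          call. = FALSE
        )
        NA
      }
    )
  })
  stats::setNames(out, names)
}

# Stand-in for a remote call (an assumption for illustration): it errors on
# names containing "/", loosely imitating the HTTP 404 shown above.
fake_lookup <- function(nm) {
  if (grepl("/", nm, fixed = TRUE)) stop("Not Found (HTTP 404)")
  list(name = nm, found = FALSE)
}

res <- safe_lookup(c("Doesn't exist", "Doesn't exist/exist"), fake_lookup)
str(res)
```

In a real session the `lookup` argument could be something like `function(nm) iucn_summary(nm)`, which needs network access and an IUCN API key.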

sckott commented 4 years ago

@olliewearn it should work now; reinstall and try again.