ropensci / taxa

taxonomic classes for R
https://docs.ropensci.org/taxa

Assign taxon_rank based on n_supertaxa #199

Open brendanf opened 5 years ago

brendanf commented 5 years ago

I have a delimited classification with set ranks, of the form [rootrank];[kingdom];[phylum];[class];[order];[family];[genus], e.g., Root;Fungi;Ascomycota;Eurotiomycetes;Eurotiales;Trichocomaceae;Hamigera. I can parse this into a Taxonomy using extract_tax_data(), but I'm not sure if/how I can assign the ranks. It would be nice if I could supply c('rootrank', 'kingdom', 'phylum', 'class', 'order', 'family', 'genus') as an argument to extract_tax_data(), but there doesn't seem to be any relevant argument. taxon_ranks() <- doesn't seem to be defined either.

I've accomplished my goal here (using parse_tax_data() to simplify the reprex), but it seems needlessly complex for what must be a fairly common operation:

``` r
library(taxa)
sessionInfo()
#> R version 3.4.1 (2017-06-30)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /home/brendan/miniconda3/envs/oueme-dev/lib/R/lib/libRblas.so
#> LAPACK: /home/brendan/miniconda3/envs/oueme-dev/lib/R/lib/libRlapack.so
#>
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] taxa_0.3.2
#>
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.0       rstudioapi_0.9.0 knitr_1.21       magrittr_1.5
#>  [5] tidyselect_0.2.5 R6_2.4.0         rlang_0.3.1      stringr_1.4.0
#>  [9] highr_0.7        dplyr_0.8.0.1    tools_3.4.1      xfun_0.5
#> [13] htmltools_0.3.6  yaml_2.2.0       digest_0.6.18    assertthat_0.2.0
#> [17] tibble_2.0.1     crayon_1.3.4     purrr_0.3.1      glue_1.3.0
#> [21] evaluate_0.13    rmarkdown_1.11   stringi_1.3.1    compiler_3.4.1
#> [25] pillar_1.3.1     jsonlite_1.6     pkgconfig_2.0.2

taxdata <- c("Root;Fungi;Basidiomycota;Agaricomycetes;Hymenochaetales;Hymenochaetaceae;Fuscoporia",
             "Root;Fungi;Basidiomycota;Microbotryomycetes;Microbotryales;Microbotryaceae;Microbotryum",
             "Root;Fungi;Ascomycota;Dothideomycetes;Botryosphaeriales;Botryosphaeriaceae;Microdiplodia")
tax <- parse_tax_data(taxdata, class_sep = ";")
ranks <- lapply(c("rootrank", "kingdom", "phylum", "class", "order", "family", "genus"),
                taxon_rank)
rank_idx <- tax$n_supertaxa() + 1
for (i in seq_along(rank_idx)) {
  tax$taxa[[i]]$rank <- ranks[[rank_idx[i]]]
}
tax$taxa
#> $b
#>
#>   name: Root
#>   rank: rootrank
#>   id: none
#>   authority: none
#>
#> $c
#>
#>   name: Fungi
#>   rank: kingdom
#>   id: none
#>   authority: none
#>
#> $d
#>
#>   name: Basidiomycota
#>   rank: phylum
#>   id: none
#>   authority: none
#>
#> $e
#>
#>   name: Ascomycota
#>   rank: phylum
#>   id: none
#>   authority: none
#>
#> $f
#>
#>   name: Agaricomycetes
#>   rank: class
#>   id: none
#>   authority: none
#>
#> $g
#>
#>   name: Microbotryomycetes
#>   rank: class
#>   id: none
#>   authority: none
#>
#> $h
#>
#>   name: Dothideomycetes
#>   rank: class
#>   id: none
#>   authority: none
#>
#> $i
#>
#>   name: Hymenochaetales
#>   rank: order
#>   id: none
#>   authority: none
#>
#> $j
#>
#>   name: Microbotryales
#>   rank: order
#>   id: none
#>   authority: none
#>
#> $k
#>
#>   name: Botryosphaeriales
#>   rank: order
#>   id: none
#>   authority: none
#>
#> $l
#>
#>   name: Hymenochaetaceae
#>   rank: family
#>   id: none
#>   authority: none
#>
#> $m
#>
#>   name: Microbotryaceae
#>   rank: family
#>   id: none
#>   authority: none
#>
#> $n
#>
#>   name: Botryosphaeriaceae
#>   rank: family
#>   id: none
#>   authority: none
#>
#> $o
#>
#>   name: Fuscoporia
#>   rank: genus
#>   id: none
#>   authority: none
#>
#> $p
#>
#>   name: Microbotryum
#>   rank: genus
#>   id: none
#>   authority: none
#>
#> $q
#>
#>   name: Microdiplodia
#>   rank: genus
#>   id: none
#>   authority: none
```

Created on 2019-04-15 by the [reprex package](https://reprex.tidyverse.org) (v0.2.1)

zachary-foster commented 5 years ago

Hello @brendanf.

Adding an argument to assign the ranks when parsing using extract_tax_data is a good idea. I will look into that. Thanks!

Regarding taxon_ranks() <-: we are working on adding getters/setters for things like taxon ranks, but that work is part of a larger rewrite of some of the code, so those additions are not in the master branch yet. If you want to try out what we are working on, you can install the eval branch like so:

devtools::install_github("ropensci/taxa@eval")

Then, this works:

library(taxa)
taxdata <-
  c("Root;Fungi;Basidiomycota;Agaricomycetes;Hymenochaetales;Hymenochaetaceae;Fuscoporia",
    "Root;Fungi;Basidiomycota;Microbotryomycetes;Microbotryales;Microbotryaceae;Microbotryum",
    "Root;Fungi;Ascomycota;Dothideomycetes;Botryosphaeriales;Botryosphaeriaceae;Microdiplodia")
tax <- parse_tax_data(taxdata, class_sep = ";")
my_ranks <- c("rootrank", "kingdom", "phylum", "class", "order", "family", "genus")
taxon_ranks(tax) <- my_ranks[n_supertaxa(tax) + 1]
taxon_ranks(tax)
#>          b          c          d          e          f          g 
#> "rootrank"  "kingdom"   "phylum"   "phylum"    "class"    "class" 
#>          h          i          j          k          l          m 
#>    "class"    "order"    "order"    "order"   "family"   "family" 
#>          n          o          p          q 
#>   "family"    "genus"    "genus"    "genus"

Created on 2019-04-15 by the reprex package (v0.2.1)
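
For readers who want to check the indexing idea without installing the eval branch, here is a minimal base-R sketch of the same depth-to-rank mapping. It does not use any taxa objects. The names `lineages`, `depth`, and `rank_by_taxon` are made up for this illustration, and the sketch assumes that a taxon name never recurs at a different depth or under a different parent, which a real taxonomy object would handle properly.

```r
# Base-R sketch only: illustrates why indexing the rank vector by
# (number of supertaxa + 1) works. No taxa package calls are used.
taxdata <- c(
  "Root;Fungi;Basidiomycota;Agaricomycetes;Hymenochaetales;Hymenochaetaceae;Fuscoporia",
  "Root;Fungi;Basidiomycota;Microbotryomycetes;Microbotryales;Microbotryaceae;Microbotryum",
  "Root;Fungi;Ascomycota;Dothideomycetes;Botryosphaeriales;Botryosphaeriaceae;Microdiplodia"
)
my_ranks <- c("rootrank", "kingdom", "phylum", "class", "order", "family", "genus")

# Split each classification into its taxon names, root first.
lineages <- strsplit(taxdata, ";", fixed = TRUE)

# Rank-by-depth only makes sense if no lineage is deeper than the rank vector.
stopifnot(all(lengths(lineages) <= length(my_ranks)))

# A name's 1-based position in its lineage equals its number of
# supertaxa plus one, so it can index my_ranks directly.
depth <- unlist(lapply(lineages, seq_along))

# One row per distinct (name, rank) pair (assumption: names are unique per depth).
rank_by_taxon <- unique(data.frame(
  name = unlist(lineages),
  rank = my_ranks[depth],
  stringsAsFactors = FALSE
))
rank_by_taxon
```

The `stopifnot()` guard matters because a lineage with more levels than `my_ranks` would silently receive `NA` ranks; lineages that skip an intermediate level would also be mislabeled, so this approach suits only classifications with a fixed set of ranks like the one in this issue.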