ropensci / taxizedb

Tools for Working with Taxonomic SQL Databases

Downstream intermediates argument? #53

Closed sagesteppe closed 3 years ago

sagesteppe commented 3 years ago

Hi all, first off thanks for the tremendous work on both taxize and taxizedb. This is not technically an issue but an ask about an important function.

I was wondering whether the 'intermediates' argument could be incorporated into the downstream function in taxizedb, or whether I can direct my taxize queries to the local SQLite database. This function is very useful, but quite slow on NCBI after having to incorporate some kind of Sys.sleep method (I still need this despite my Entrez key).

I am working on trying to assign all genera to intermediate clades for a few large families, and am having a heck of a time trying to create a nice dataframe of relationships.

sckott commented 3 years ago

thanks for the question.

Yes, NCBI rate limiting is a pain, and web requests made through taxize can take a while depending on the query.

@arendsee wrote the ncbi downstream code. I don't see a way to get intermediate results with that code, but maybe he'll drop in here to comment ...

In the meantime you could do something custom, since it is so fast in taxizedb:

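# Assumes the local NCBI database is already available (e.g. via db_download_ncbi())
library(taxizedb)
downstream_inter <- function(id, downto_inter, downto) {
  # Step 1: get the intermediate taxa (e.g. subgenera) below the starting taxon
  temp <- downstream(id, downto=downto_inter)
  # Step 2: for each intermediate taxon, get its descendants at the target rank,
  # and name each result by the intermediate taxon's name
  stats::setNames(lapply(temp[[1]]$childtaxa_id, function(w) {
    downstream(w, downto=downto)[[1]]
  }), temp[[1]]$childtaxa_name)
}
id <- 28641 # genus=Bombus
downstream_inter(id, downto_inter = "subgenus", downto = "species")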

This returns a named list with one element per downto_inter taxon (named by taxon name), where each element is a data.frame of that taxon's children at the downto rank.
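Since the goal was a single data frame of relationships, here is a rough sketch of how a named list like that could be stacked into one table. `flatten_downstream` and the `intermediate_name` column are hypothetical names, not part of taxizedb, and the mock input uses placeholder IDs and names rather than real NCBI records; only the `childtaxa_id` and `childtaxa_name` columns are taken from the code above.

```r
# Hypothetical helper (not part of taxizedb): stack a named list of
# data.frames into one data.frame, storing each list name in a new column.
flatten_downstream <- function(x, parent_col = "intermediate_name") {
  # Drop entries that are not data.frames or that have no rows
  x <- Filter(function(df) is.data.frame(df) && nrow(df) > 0, x)
  if (length(x) == 0) return(data.frame())
  rows <- Map(function(df, nm) {
    # One-column data.frame repeating the intermediate taxon name
    parent <- data.frame(rep(nm, nrow(df)), stringsAsFactors = FALSE)
    cbind(stats::setNames(parent, parent_col), df)
  }, x, names(x))
  out <- do.call(rbind, unname(rows))
  rownames(out) <- NULL
  out
}

# Mock input shaped like the output of downstream_inter();
# the IDs and names below are placeholders, not real taxa.
mock <- list(
  SubgenusA = data.frame(childtaxa_id = c("1", "2"),
                         childtaxa_name = c("Genus a", "Genus b"),
                         stringsAsFactors = FALSE),
  SubgenusB = data.frame(childtaxa_id = "3",
                         childtaxa_name = "Genus c",
                         stringsAsFactors = FALSE)
)
relationships <- flatten_downstream(mock)
```

With real data the call would presumably be `flatten_downstream(downstream_inter(id, downto_inter = "subgenus", downto = "species"))`. Note that this only covers taxa that sit under some intermediate taxon; anything attached directly to the starting taxon would not appear.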

sagesteppe commented 3 years ago

Thanks Scott,

I think this will work well enough for my purposes. I believe, as you allude to somewhere in the documentation, that taxa of uncertain placement bring some complications wherever they are found.

sckott commented 3 years ago

> I think this will work well enough for my purposes

great, closing now, but we can reopen if it doesn't work for you