ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

class2tree memory allocate error #840

Closed TaehyungKwon closed 4 years ago

TaehyungKwon commented 4 years ago

Dear all,

I'm currently having trouble using the taxize class2tree function to generate a taxonomy tree. I installed taxize using conda, and it's running under R 3.6.3. The class2tree function keeps expanding its allocated memory (up to ~100 GB) and finally ends with a memory allocation error.

Error in order(full_rank_name_df$index) : Failed to allocate working memory for xtmp. Requested 1119790 * 8 bytes

R Script I used is as follows (species.vec contains 218 species names):

uids.species = get_uid(species.vec)
df = data.frame(uids.species)
class = classification(df$ids, db = "ncbi")
class.dedup = unique(class)
taxtree = class2tree(class.dedup)
Error: cannot allocate vector of size 8.5 Mb

Could you kindly guide me how to solve this issue? Thank you.

sckott commented 4 years ago

thanks for opening the issue @TaehyungKwon - @trvinh can you have a look when you get a chance?

trvinh commented 4 years ago

Hi @TaehyungKwon , can you provide me the list of your taxa? I checked your code with my 296 taxa but couldn't reproduce the problem.

> taxtree = class2tree(class.dedup)
Removed species without classification.
> taxtree

Phylogenetic tree with 296 tips and 198 internal nodes.

Tip labels:
  Acanthamoeba castellanii, Acanthascus dawsoni, Acaryochloris marina, Acaryochloris marina MBIC11017, Acidomyces sp. 'richmondensis', Acinetobacter baylyi, ...
Node labels:
  cellular organisms, Archaea, Eukaryota, Bacteria, TACK group, Chlamydia, ...

Unrooted; includes branch lengths.
> 

Best, Vinh

TaehyungKwon commented 4 years ago

Hi @trvinh,

Yes, thank you. Here's the link to my deduplicated species list "class.dedup": https://drive.google.com/file/d/14yX2Hy9jlLIEz8weHMsOYvJ7lw43QPxW/view?usp=sharing

I'm sorry, I forgot to mention that class2tree worked on my machine at first. I suspected a compatibility issue with other R packages, so I tried a clean install in a new conda env with r-essentials and r-taxize. That returned a missing stringi.so error, which I solved by installing r-stringi. Even in the new env, my script keeps giving the same memory allocation error, along with a long running time.

Thanks again, and have a nice day!

TaehyungKwon commented 4 years ago

I tried on other machines (R 3.6.3 on macOS and R 4.0.2 on Windows), but it gives me the same error every time.

trvinh commented 4 years ago

@TaehyungKwon some of your species have the new taxonomy ranks (see #835 #838), which cause problems for class2tree. When I tested your class.dedup, it couldn't finish even with only the first 30 taxa, but it ran without any problem with the new class2tree. So please reinstall the package with remotes::install_github("ropensci/taxize") and try again. It worked for me :)

> taxtree <- class2tree(class.dedup)
> taxtree

Phylogenetic tree with 166 tips and 89 internal nodes.

Tip labels:
  Nannochloropsis oculata, Prorocentrum minimum, Macrocystis pyrifera, Nannochloropsis limnetica, Nannochloropsis oceanica, Messastrum gracile, ...
Node labels:
  Eukaryota, Rhodophyta, Haptophyta, Cryptophyceae, Sar, Viridiplantae, ...

Unrooted; includes branch lengths.
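
For anyone following along, the reinstall-and-retry step could be sketched roughly as below. This is only a minimal sketch: `rebuild_tree_dev` is a hypothetical helper name, not part of taxize, and it assumes `class.dedup` was already built as in the original report. If taxize is already loaded in the session, restarting R after the install is probably needed before the new version is picked up.

```r
# Hypothetical helper (not part of taxize): install the GitHub version of
# taxize, then rebuild the tree from an existing classification list.
rebuild_tree_dev <- function(class.dedup, install = TRUE) {
  if (install) {
    # remotes provides install_github(); install it first if it is missing.
    if (!requireNamespace("remotes", quietly = TRUE)) {
      install.packages("remotes")
    }
    remotes::install_github("ropensci/taxize")
  }
  # Report which taxize version is currently installed.
  message("taxize version: ", as.character(utils::packageVersion("taxize")))
  taxize::class2tree(class.dedup)
}

# Usage, with class.dedup built as in the original report:
# taxtree <- rebuild_tree_dev(class.dedup)
```

Passing `install = FALSE` skips the reinstall and only reruns class2tree, which may be handy once the GitHub version is already in place.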

TaehyungKwon commented 4 years ago

It worked! Thanks!