ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

class2tree issue #873

Closed alemi055 closed 3 years ago

alemi055 commented 3 years ago

Hi, I have the same issue as mentioned here: https://github.com/ropensci/taxize/issues/838. I have a large number of taxIDs (> 11,000), but I keep getting this error after classification: `Error in if (currentIndex <= tmpEnv[[subList[i - 1]]]) { : argument is of length zero Calls: -> taxonomy_table_creator -> rank_indexing Execution halted`

I tried reinstalling with remotes::install_github("ropensci/taxize"), but no luck. I'm not sure what to do. With a small number of taxIDs, it usually works. Thanks!

trvinh commented 3 years ago

Hi @alemi055, the problem is not whether you use a small or large number of taxIDs. I think that one (or several) of the 11,000 taxIDs has an "unusual" taxonomy hierarchy. If that is OK, could you send me the list of those IDs (either as a downloadable link or directly to my email tran@bio.uni-frankfurt.de)? Best, Vinh
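A quick way to look for such entries before calling class2tree is sketched below. It assumes the classification() result is a named list in which each resolved ID is a data frame with name, rank and id columns and each unresolved ID is NA, and that unranked levels are labelled "no rank". The list here is a hand-made mock with invented names and IDs, not real NCBI data, and the variable names are only illustrative.

```r
# Mock of a classification() result (invented values, for illustration only).
cl <- list(
  id_a = data.frame(
    name = c("Class A", "Genus A", "Species A"),
    rank = c("class", "genus", "species"),
    id   = c("10", "11", "12"),
    stringsAsFactors = FALSE
  ),
  id_missing = NA,  # assumption: an ID that could not be resolved comes back as NA
  id_b = data.frame(
    name = c("Group B1", "Group B2", "Species B"),
    rank = c("no rank", "no rank", "species"),
    id   = c("20", "21", "22"),
    stringsAsFactors = FALSE
  )
)

# Separate resolved hierarchies from unresolved IDs.
is_resolved <- vapply(cl, is.data.frame, logical(1))
missing_ids <- names(cl)[!is_resolved]
cl_valid    <- cl[is_resolved]

# Per-hierarchy summary: number of levels and number of unranked levels.
hierarchy_summary <- data.frame(
  id         = names(cl_valid),
  n_levels   = vapply(cl_valid, nrow, integer(1)),
  n_unranked = vapply(cl_valid, function(x) sum(x$rank == "no rank"), integer(1)),
  row.names  = NULL,
  stringsAsFactors = FALSE
)

print(missing_ids)
print(hierarchy_summary)
```

Hierarchies with very few levels or many unranked levels are candidates to test separately; whether they are the ones that trigger the error is not established by this check.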

trvinh commented 3 years ago

Hi @alemi055, I have fixed the function. It worked with more than 20 random taxon sets of between 100 and 5000 species. For your full set of >11,000 taxa (note that some of them do not have a valid NCBI taxonomy ID), the rank indexing and taxonomy table creation steps worked, but the last step of the class2tree function (clustering the taxonomy hierarchies to create the phylo tree) could not finish when I tested it; it just took ... forever. Maybe my computer is not powerful enough to work with such a large number of taxa :D Btw, do you really need the phylo tree for more than 11,000 taxa? @sckott, do you have any idea how to make the clustering (as.phylo.hclust(hclust(taxdis, ...))) work for such a large dataset? Best, Vinh
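For a rough sense of scale: hclust() takes a pairwise dissimilarity object, which holds n * (n - 1) / 2 values for n taxa. The arithmetic below is a back-of-the-envelope sketch that assumes 8 bytes per stored value and ignores any working copies; dist_size_mb is a hypothetical helper name, and the figures say nothing about running time.

```r
# Approximate size of a pairwise dissimilarity object for n taxa,
# assuming one 8-byte double per pair (an assumption, not a measurement).
dist_size_mb <- function(n) {
  n_pairs <- n * (n - 1) / 2
  n_pairs * 8 / 1e6
}

n_pairs_11000 <- 11000 * (11000 - 1) / 2  # 60,494,500 pairs

dist_size_mb(5000)   # about 100 MB
dist_size_mb(11000)  # about 484 MB
```

Going from 5000 to 11,000 taxa roughly multiplies the number of pairs by five, so the clustering step grows much faster than the number of taxa.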

alemi055 commented 3 years ago

Hi Vinh, Thank you for your reply! We are actually modifying our analysis and don’t need to create trees with as many sequences. We figured that this would take too many computational resources (as your attempts showed).

Should I reinstall taxize from github, now that you have modified the “class2tree” function?

Thank you, Audrée

From: Vinh Tran @.> Sent: 15 May 2021 13:42 To: ropensci/taxize @.> Cc: Audrée Lemieux @.>; Mention @.> Subject: Re: [ropensci/taxize] class2tree issue (#873)


trvinh commented 3 years ago

Hi, you still need to wait for @sckott to accept my pull request; then you can reinstall it from GitHub. I don't think it will take long. Best, Vinh

Percud commented 3 years ago

Hi @trvinh,

I have tried to install the to-be-committed version from your branch. However, class2tree still fails for particular taxa combinations:

taxa <- c("Oryctolagus cuniculus", "Galeopterus variegatus", "Paramormyrops kingsleyae", "Sinocyclocheilus rhinocerous")
cl <- classification(taxa, db='ncbi')
class2tree(cl)

Error in if (currentIndex <= tmpEnv[[subList[i - 1]]]) { :

argument is of length zero

class2tree(cl[1:3])

Phylogenetic tree with 3 tips and 2 internal nodes.

...

class2tree(cl[2:4])

Phylogenetic tree with 3 tips and 2 internal nodes.

...
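When a full set fails but some of its subsets succeed, as in the report above, one generic way to narrow things down is to drop one element at a time for as long as the failure persists. The sketch below is not part of taxize; shrink_failing and the fails predicate are hypothetical names, and the predicate is mocked so the example runs without network access.

```r
# Greedily shrink a failing input: remove one element at a time
# as long as the remaining items still trigger the failure.
shrink_failing <- function(items, fails) {
  stopifnot(fails(items))
  repeat {
    shrunk <- FALSE
    for (i in seq_along(items)) {
      candidate <- items[-i]
      if (length(candidate) > 0 && fails(candidate)) {
        items <- candidate
        shrunk <- TRUE
        break
      }
    }
    if (!shrunk) return(items)
  }
}

# Mock predicate: "fails" only when both A and D are present.
mock_fails <- function(x) all(c("A", "D") %in% x)

minimal <- shrink_failing(c("A", "B", "C", "D"), mock_fails)
print(minimal)

# With taxize, the predicate could instead wrap class2tree() and match only
# the specific error text, so that unrelated errors (for example from inputs
# that are too small) are not counted as the same failure (an assumption):
# fails <- function(x) tryCatch(
#   { class2tree(x); FALSE },
#   error = function(e) grepl("argument is of length zero", conditionMessage(e))
# )
```

The result is a subset from which no single element can be removed without the failure disappearing, which is usually small enough to inspect by hand.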

trvinh commented 3 years ago

Hi @Percud, may I ask where you got the "to-be-committed version"? I've just tested with your taxa, and it worked fine:

> taxa <- c("Oryctolagus cuniculus", "Galeopterus variegatus", "Paramormyrops kingsleyae", "Sinocyclocheilus rhinocerous")
> cl <- classification(taxa, db='ncbi')
No ENTREZ API key provided
 Get one via taxize::use_entrez()
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
══  4 queries  ═══════════════

Retrieving data for taxon 'Oryctolagus cuniculus'

✓  Found:  Oryctolagus+cuniculus

Retrieving data for taxon 'Galeopterus variegatus'

✓  Found:  Galeopterus+variegatus

Retrieving data for taxon 'Paramormyrops kingsleyae'

✓  Found:  Paramormyrops+kingsleyae

Retrieving data for taxon 'Sinocyclocheilus rhinocerous'

✓  Found:  Sinocyclocheilus+rhinocerous
══  Results  ═════════════════

• Total: 4 
• Found: 4 
• Not Found: 0
No ENTREZ API key provided
 Get one via taxize::use_entrez()
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
> tree <- class2tree(cl)
Get all ranks and their taxIDs
Align taxonomy hierarchies...
Taxonomy alignment done!
Calculate distance matrix
Add node labels
> plot(tree)

[image]

Best, Vinh

Percud commented 3 years ago

> Hi @Percud, may I ask where you got the "to-be-committed version"? I've just tested with your taxa, and it worked fine

Hi @trvinh, I installed it from your GitHub (remotes::install_github("trvinh/taxize")). You are right: after resuming the R session it worked with the example provided and also with my full set of 859 taxa. Thank you for your answer and please excuse the false alarm. Best, Riccardo