Closed alemi055 closed 3 years ago
Hi @alemi055, the problem is not whether you use a small or large number of taxIDs. I think that one (or some) of the 11,000 taxIDs has an "unusual" taxonomy hierarchy. If it is OK, could you send me the list of those IDs (either as a downloadable link or directly to my email, tran@bio.uni-frankfurt.de)? Best, Vinh
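One way to narrow down which IDs trigger such a failure is to run the failing step on chunks of the list and record which chunks raise an error. The following is only a minimal base R sketch: `find_failing_chunks` and `toy_step` are hypothetical names, and `toy_step` stands in for whatever call actually fails on the real data.

```r
# Hypothetical helper: run `step` on consecutive chunks of `ids` and
# return the chunks for which it raised an error.
find_failing_chunks <- function(ids, chunk_size, step) {
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  failed <- vapply(chunks, function(chunk) {
    tryCatch({
      step(chunk)
      FALSE                      # step finished without error
    }, error = function(e) TRUE) # step raised an error
  }, logical(1))
  unname(chunks[failed])
}

# Stand-in for the real step (assumption): fails when a chunk contains "bad_id".
toy_step <- function(chunk) {
  if (any(chunk == "bad_id")) stop("argument is of length zero")
  invisible(TRUE)
}

ids <- c("id1", "id2", "id3", "bad_id", "id5", "id6")
bad_chunks <- find_failing_chunks(ids, chunk_size = 3, step = toy_step)
print(bad_chunks)
```

Repeating this with smaller chunks on the failing ones narrows the search. A failure that depends on a particular combination of taxa, rather than on a single ID, would not necessarily be isolated this way.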
Hi @alemi055,
I have fixed the function. It worked with more than 20 random taxon sets of between 100 and 5000 species. For your full set of >11,000 taxa (just to mention, some of them do not have a valid NCBI taxonomy ID), the rank indexing and taxonomy table creation steps worked, but the last step of the class2tree function (clustering the taxonomy hierarchies to create the phylo tree) could not finish when I tested it; it just took ... forever. Maybe my computer is not powerful enough to work with such a large number of taxa :D By the way, do you really need the phylo tree for more than 11,000 taxa?
@sckott do you have any idea how to make the clustering (as.phylo.hclust(hclust(taxdis, ...))) work for such large data?
Best,
Vinh
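To illustrate what the clustering step involves, here is a small sketch under stated assumptions. The taxonomy table is toy data, and the distance used (fraction of ranks at which two taxa differ) is a simple stand-in, not necessarily the distance that class2tree computes. Only `hclust()` from base R and, optionally, `ape::as.phylo()` are used. The last lines show how the number of pairwise distances grows with the number of taxa.

```r
# Toy taxonomy table (assumption): one row per taxon, one column per rank.
tax <- matrix(
  c("Lagomorpha",        "Mammalia",
    "Dermoptera",        "Mammalia",
    "Osteoglossiformes", "Actinopteri",
    "Cypriniformes",     "Actinopteri"),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Oryctolagus", "Galeopterus", "Paramormyrops", "Sinocyclocheilus"),
    c("order", "class")
  )
)

# Stand-in distance: fraction of ranks at which two taxa differ.
n <- nrow(tax)
d <- matrix(0, n, n, dimnames = list(rownames(tax), rownames(tax)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- mean(tax[i, ] != tax[j, ])
  }
}
taxdis <- as.dist(d)

# Hierarchical clustering on the distance object (base R).
hc <- hclust(taxdis)

# If the ape package is installed, the dendrogram can be converted to a phylo tree.
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::as.phylo(hc)
}

# A distance object for n taxa stores n * (n - 1) / 2 pairwise values.
n_pairs <- function(n) n * (n - 1) / 2
n_pairs(11000)  # about 60.5 million pairwise distances
```

With 8 bytes per value, the distance object alone for 11,000 taxa takes roughly 480 MB, before any clustering work starts, which is one plausible reason the step is slow on a desktop machine.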
Hi Vinh, Thank you for your reply! We are actually modifying our analysis and no longer need to create trees with that many sequences. We figured that this would take too many computational resources (as your tests showed).
Should I reinstall taxize from GitHub, now that you have modified the "class2tree" function?
Thank you, Audrée
Hi, you still need to wait for @sckott to accept my pull request; then you can reinstall it from GitHub. I think it will not take long. Best, Vinh
Hi @trvinh,
I have tried to install the to-be-committed version from your branch. However, class2tree still fails for particular taxa combinations:
taxa <- c("Oryctolagus cuniculus", "Galeopterus variegatus", "Paramormyrops kingsleyae", "Sinocyclocheilus rhinocerous")
cl <- classification(taxa, db='ncbi')
class2tree(cl)
class2tree(cl[1:3])
class2tree(cl[2:4])
Hi @Percud, may I ask where you got the "to-be-committed version"? I've just tested with your taxa, and it worked fine:
> taxa <- c("Oryctolagus cuniculus", "Galeopterus variegatus", "Paramormyrops kingsleyae", "Sinocyclocheilus rhinocerous")
> cl <- classification(taxa, db='ncbi')
No ENTREZ API key provided
Get one via taxize::use_entrez()
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
══ 4 queries ═══════════════
Retrieving data for taxon 'Oryctolagus cuniculus'
✓ Found: Oryctolagus+cuniculus
Retrieving data for taxon 'Galeopterus variegatus'
✓ Found: Galeopterus+variegatus
Retrieving data for taxon 'Paramormyrops kingsleyae'
✓ Found: Paramormyrops+kingsleyae
Retrieving data for taxon 'Sinocyclocheilus rhinocerous'
✓ Found: Sinocyclocheilus+rhinocerous
══ Results ═════════════════
• Total: 4
• Found: 4
• Not Found: 0
No ENTREZ API key provided
Get one via taxize::use_entrez()
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
> tree <- class2tree(cl)
Get all ranks and their taxIDs
Align taxonomy hierarchies...
Taxonomy alignment done!
Calculate distance matrix
Add node labels
> plot(tree)
Best, Vinh
Hi @trvinh, I installed it from your GitHub (remotes::install_github("trvinh/taxize")). You are right: after restarting the R session it worked with the example provided and also with my full set of 859 taxa. Thank you for your answer and please excuse the false alarm. Best, Riccardo
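A reinstalled package generally only takes effect in a fresh session when the old version is already loaded. As a small hedged sketch, the check below uses base R's `loadedNamespaces()` to tell whether a restart is likely needed; `needs_restart` is a hypothetical helper name, and the install call is left commented out because it needs network access.

```r
# Hypothetical helper: TRUE if the package's namespace is already loaded
# in the current session.
needs_restart <- function(pkg) pkg %in% loadedNamespaces()

# remotes::install_github("trvinh/taxize")  # install call quoted above

if (needs_restart("taxize")) {
  message("taxize is already loaded; restart R so the new version is used.")
}
```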
Hi, I have the same issue as mentioned here: https://github.com/ropensci/taxize/issues/838. I have a large number of taxIDs (> 11,000), but I keep getting this error after classification:
`Error in if (currentIndex <= tmpEnv[[subList[i - 1]]]) { : argument is of length zero Calls: -> taxonomy_table_creator -> rank_indexing Execution halted`
I tried reinstalling with remotes::install_github("ropensci/taxize"), but no luck. I'm not sure what to do. With a small number of taxIDs, it usually works. Thanks!
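For readers unfamiliar with this message: in R, "argument is of length zero" is raised when the condition of an `if` has length zero, which happens for example when a lookup in an environment returns `NULL` for a missing key. The sketch below reproduces that general mechanism in base R. The variable names mirror the error message, but the keys and values are made up, and this is not a reconstruction of the actual taxize code.

```r
# An environment used as a key-value store (toy content, assumption).
tmpEnv <- new.env()
assign("genus", 2, envir = tmpEnv)
currentIndex <- 1

# Looking up a key that was never stored returns NULL ...
val <- tmpEnv[["family"]]

# ... and comparing against NULL yields a zero-length logical vector.
cmp <- currentIndex <= val

# Using that as an `if` condition raises the error; capture its message.
res <- tryCatch(
  if (cmp) "smaller" else "larger",
  error = function(e) conditionMessage(e)
)
print(res)

# A defensive variant checks for the missing value first.
safe <- if (!is.null(val) && currentIndex <= val) "smaller" else "missing or larger"
print(safe)
```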