PatoUru closed this issue 3 years ago
Hi @PatoUru,
I'm afraid you're getting this error because the tree-newick-gtdb file contains names that start with numeric characters. Can you please double-check that?
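One way to double-check this is a quick script along the following lines. It is only a rough sketch: it assumes a plain, unquoted Newick string, the function names are made up for illustration, and the sample tree in the usage note is hypothetical. It lists leaf names and internal node labels separately, since either kind of label can begin with a digit.

```python
import re

# Characters that cannot be part of an unquoted Newick label.
_LABEL = r"[^\s(),:;]+"


def leaf_names(newick: str) -> list[str]:
    """Leaf labels are the ones that directly follow '(' or ','."""
    return re.findall(r"[(,]\s*(" + _LABEL + ")", newick)


def internal_labels(newick: str) -> list[str]:
    """Internal node labels (often support values) directly follow ')'."""
    return re.findall(r"\)\s*(" + _LABEL + ")", newick)


def starting_with_digit(labels: list[str]) -> list[str]:
    """Keep only the labels whose first character is a digit."""
    return [label for label in labels if label[0].isdigit()]


if __name__ == "__main__":
    # Hypothetical tree, not the one from this thread.
    sample = "((A1:0.1,2B:0.2)0.95:0.3,C3:0.4);"
    print("leaves starting with a digit:", starting_with_digit(leaf_names(sample)))
    print("internal labels starting with a digit:", starting_with_digit(internal_labels(sample)))
```

To check a real file, read its contents into a string and pass that string to the same functions. Quoted labels (names wrapped in single quotes) are not handled by this sketch.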
But apart from that, you're providing a tree file that describes your genomes to replace the tree that is meant to describe your gene clusters. The -t flag adds a tree to replace the dendrogram in the center, which is identified with number 1 in this figure:
But I think what you want is to have the tree you have for your genomes appear as number 3 in the figure. Am I getting this right? In that case, you need to add this tree as a layers-order, which is explained here:
https://merenlab.org/2017/12/11/additional-data-tables/
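As a rough sketch of what that could look like in practice, the script below wraps a Newick file into a TAB-delimited layer-orders file. The three column names (item_name, data_type, data_value) and the import command in the comment follow my recollection of the post linked above, so please verify them against that page. The function name, the file names, and the order name gtdb_phylogeny are placeholders made up for this example.

```python
def write_layer_order_file(newick_path: str, out_path: str,
                           order_name: str = "gtdb_phylogeny") -> None:
    """Wrap a Newick tree into a TAB-delimited layer-orders file.

    Assumed layout (check the linked post): one header row, then one row
    per order with its name, its type ('newick'), and the tree itself.
    """
    with open(newick_path) as handle:
        # Join multi-line Newick files into a single line.
        newick = "".join(line.strip() for line in handle)

    with open(out_path, "w") as out:
        out.write("\t".join(["item_name", "data_type", "data_value"]) + "\n")
        out.write("\t".join([order_name, "newick", newick]) + "\n")


if __name__ == "__main__":
    # Placeholder file names; adjust to your own paths.
    write_layer_order_file("tree-newick-gtdb", "layer-orders.txt")
    # The resulting file would then be imported into the pan database with
    # something like the following (assumed from the linked post, please verify):
    #   anvi-import-misc-data layer-orders.txt \
    #       -p Pangenome2021/Pangenome2021-PAN.db \
    #       --target-data-table layer_orders
```

For the tree to be usable as a layer order, the leaf names in it presumably need to match the genome names in the pangenome exactly.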
Best,
Thank you very much for your answer!! tree-newick-gtdb: ((((((AMX14:0.05576653,AMX68:0.04860886)1.0000:0.10404042,(AMX15:0.05534578,AMX39:0.08002258)1.0000:0.07772235)1.0000:0.04868159,AMX55:0.13302416)1.0000:0.11794780,(MO118:0.31001016,MO16:0.17927527)1.0000:0.06426011)1.0000:0.17372416,(AMX9:0.25369015,(AMX57:0.16045843,RH21:0.24289284)0.8000:0.04991906)1.0000:0.20251593)1.0000:0.05813438,MO53:0.40098369,((AMX47:0.53311365,RH52:0.36555309)1.0000:0.08328510,((AMX56:0.29694858,MO66:0.28519523)0.6200:0.04982267,(RH38:0.18289250,RH43:0.16415077)1.0000:0.19762233)1.0000:0.05518750)1.0000:0.07183112);
What I would like is the tree that is meant to describe the gene clusters in my pangenome (number 1), I thought this was achieved by introducing the tree (tree-newick-gtdb)! :/ Without tree-newick-gtdb:
What would be the option to get number 1? Apologies for my ignorance
Hi @PatoUru, this helps. Thank you.
You don't have the tree shown as (1) in your pan because of this:
You have way too many gene clusters to generate that tree efficiently, so anvi'o skips trying to do that altogether.
But as far as I can see from your pangenome, these genomes are extremely distant from each other, as evidenced by the near-absent core and the very small number of single-copy core genes:
So I am not sure why you are using pangenomics to characterize the gene pool of these genomes. But if you re-run the anvi-pan-genome command, this time with the flag --min-occurrence 2, singletons will go away (so 43,531 gene clusters will likely reduce to fewer than 20,000) and you will get a tree in the middle this time.
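To make the effect of a minimum-occurrence threshold concrete, here is a toy illustration in Python. It is not anvi'o's implementation, and the gene cluster and genome assignments below are invented for the example: a gene cluster is kept only if it contains genes from at least the given number of genomes, so clusters found in a single genome (singletons) drop out.

```python
def filter_by_min_occurrence(gene_clusters: dict[str, set[str]],
                             min_occurrence: int = 2) -> dict[str, set[str]]:
    """Keep gene clusters that occur in at least `min_occurrence` genomes."""
    return {
        name: genomes
        for name, genomes in gene_clusters.items()
        if len(genomes) >= min_occurrence
    }


if __name__ == "__main__":
    # Invented example: which genomes contribute genes to each cluster.
    clusters = {
        "GC_0001": {"AMX14", "AMX68", "MO53"},  # shared by three genomes
        "GC_0002": {"RH21"},                    # singleton
        "GC_0003": {"MO16", "MO118"},           # shared by two genomes
        "GC_0004": {"AMX9"},                    # singleton
    }
    kept = filter_by_min_occurrence(clusters, min_occurrence=2)
    print(sorted(kept))  # the two singletons are gone
```

Note that singletons removed this way are no longer available for downstream analysis, which matters if ecosystem-specific singletons are part of the question.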
In the pangenome I included 17 MAGs from 3 different ecosystems: AMX, RH and MO. All MAGs (or almost all) belong to the same class. My idea was to determine whether the MAGs separate according to ecosystem, and also to determine whether there are ecosystem-specific functions (singletons). The ANI value is less than 90 between MAGs, so they seem to be very different! Would you recommend another analysis? Maybe I'm going in the wrong direction!
I think this would be a good question to ask on the anvi'o Slack channel to collect opinions on the best way to compare these genomes. If you clarify your question, I'm sure you will hear some opinions.
I would probably go for a functional pangenome (which is described in the pangenomics tutorial) or compare their metabolic potential (which is explained here, but will only be available in anvi'o v7: https://merenlab.org/software/anvio/help/main/programs/anvi-estimate-metabolism/).
Hi all, I ran the following command line: $ anvi-display-pan -p Pangenome2021/Pangenome2021-PAN.db -g C-GENOMES.db -t tree-newick-gtdb
I got the following error
I performed the protein alignment of the 17 MAGs in GTDB-Tk, and then the phylogenetic tree was constructed in MEGA10 (exported in Newick format). Do you have any suggestions? Thank you very much in advance!! Pat