merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

anvi-gen-phylogenomic-tree #690

Closed Arkadiy-Garber closed 6 years ago

Arkadiy-Garber commented 6 years ago

Hi Meren,

anvi-self-test --version
Anvi'o version ...............................: 3
Profile DB version ...........................: 20
Contigs DB version ...........................: 9
Pan DB version ...............................: 5
Samples information DB version ...............: 2
Genome data storage version ..................: 1
Auxiliary data storage version ...............: 4
Anvi'server users data storage version .......: 1

I am using macOS, and I believe I installed anvi'o using Homebrew.

I am having trouble with "anvi-gen-phylogenomic-tree". I generated a fasta file of concatenated single-copy proteins (Campbell et al) using the command:

anvi-get-sequences-for-hmm-hits --external-genomes external-genomes.txt \
                                -o concatenated-proteins.fa \
                                --hmm-source Campbell_et_al \
                                --return-best-hit \
                                --get-aa-sequences \
                                --concatenate

This resulted in an MSA FASTA file. I am now trying to create a phylogenetic tree using:

anvi-gen-phylogenomic-tree --fasta-file concatenated-proteins.fa -o PhyloTree.txt

But this is resulting in:

Input aligment file path .....................: /Users/arkadiygarber/Desktop/combined_bins-renamed/concatenated-proteins.fa
Output file path .............................: /Users/arkadiygarber/Desktop/combined_bins-renamed/tree
Alignment names ..............................: JDF1362A_Bin_28_fixed, JDF1362B_Bin_20_fixed, JDF1362A_Bin_29_fixed, JDF1362B_Bin_14_fixed, JDF1362A_Bin_10_fixed, JDF1362A_Bin_21_fixed, JDF1362A_Bin_14_fixed, JDF1362A_Bin_27_fixed, JDF1362B_Bin_17_fixed, JDF1362A_Bin_8_fixed, JDF1362A_Bin_22_fixed, JDF1362A_Bin_5_fixed, JDF1362B_Bin_30_fixed, JDF1362B_Bin_16_fixed, JDF1362B_Bin_26_fixed, JDF1362A_Bin_7_fixed, JDF1362A_Bin_1_fixed, JDF1362A_Bin_12_fixed, JDF1362B_Bin_21_fixed, JDF1362A_Bin_2_fixed, JDF1362A_Bin_18_fixed, JDF1362A_Bin_15_fixed, JDF1362B_Bin_25_fixed, JDF1362A_Bin_3_fixed, JDF1362A_Bin_6_fixed, JDF1362A_Bin_13_fixed, JDF1362B_Bin_28_fixed, JDF1362B_Bin_31_fixed, JDF1362A_Bin_9_fixed, JDF1362A_Bin_25_fixed, JDF1362B_Bin_27_fixed, JDF1362B_Bin_12_fixed, JDF1362B_Bin_22_fixed, JDF1362B_Bin_19_fixed, JDF1362A_Bin_16_fixed, JDF1362B_Bin_23_fixed, JDF1362B_Bin_18_fixed, JDF1362A_Bin_17_fixed, JDF1362B_Bin_13_fixed
Alignment sequence length ....................: 64,972
Version ......................................: FastTree Version 2.1.10 SSE3
Alignment ....................................: standard input
Info .........................................: Amino acid distances: BLOSUM45 Joins: balanced Support: SH-like 1000
Search .......................................: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits ......................................: 1.00*sqrtN close=default refresh=0.80
ML Model .....................................: Jones-Taylor-Thorton, CAT approximation with 20 rate categories
Info .........................................: Ignored unknown character X (seen 16350 times)

File/Path Error: Your tree doesn't seem to be properly formatted. Here is what ETE had to say
about this: 'Unexisting tree file or Malformed newick tree structure.'. Pity :/
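
Since ETE reports either a missing file or a malformed Newick structure, a quick independent look at the output file can help tell the two cases apart. The following is only a minimal sketch using the Python standard library: the function names are hypothetical, it is not how anvi'o or ETE validate trees, and it would miscount parentheses that appear inside quoted labels.

```python
import os


def check_newick_text(text):
    """Return a list of problems found in a Newick string (empty if none)."""
    stripped = text.strip()
    if not stripped:
        return ["file is empty"]

    problems = []
    if not stripped.endswith(";"):
        problems.append("missing terminating semicolon")

    # Track nesting depth to detect unbalanced parentheses.
    depth = 0
    for ch in stripped:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("closing parenthesis without a matching opening one")
                break
    if depth > 0:
        problems.append("unclosed opening parenthesis")
    return problems


def check_newick_file(path):
    """Run the text checks on a file, reporting a missing file separately."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    with open(path) as handle:
        return check_newick_text(handle.read())
```

Calling `check_newick_file()` on the path shown under "Output file path" in the log above returns an empty list when none of these basic problems are found. An empty list does not guarantee that ETE will accept the tree.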

The help page states that the input FASTA file should be "Concatenated aligment files exported using anvi-export-pc-aligments", which, I believe, is part of the pangenomic workflow. However, I would essentially like to de-replicate some bins from separate assemblies using single-copy genes.

Any thoughts on what might be causing this error?
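
One way to rule the input in or out is to confirm that every sequence in concatenated-proteins.fa has the same length and to see how the X characters reported by FastTree are spread across the bins. The sketch below uses only the Python standard library. The function names are hypothetical, and it is a sanity check under those assumptions, not a diagnosis of the error.

```python
def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}, using the first word of each header."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def summarize_alignment(records):
    """Report the sequence count, distinct lengths, and per-sequence X counts."""
    lengths = {len(seq) for seq in records.values()}
    return {
        "num_sequences": len(records),
        "lengths": sorted(lengths),
        # A multiple sequence alignment should have exactly one distinct length.
        "is_aligned": len(lengths) == 1,
        "x_counts": {n: seq.upper().count("X") for n, seq in records.items()},
    }


# Example usage (file name taken from the command above):
# with open("concatenated-proteins.fa") as handle:
#     print(summarize_alignment(read_fasta(handle)))
```

If `is_aligned` is false, or a few bins account for most of the X characters, the FASTA file itself deserves a closer look before the tree-building step.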

Thanks and sorry if this is caused by some error on my end. I am having RAxML make me a tree from these concatenated proteins as a back-up, though I would like to utilize Anvio's awesome interactive mode to view it.

Arkadiy

meren commented 6 years ago

We hope this is fixed in v5 :)

Arkadiy-Garber commented 6 years ago

Thank you, Meren. Looking forward to playing around with the updated version! Your dedication to high quality in bioinformatics is much appreciated :)