biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

[e] build_gene_tree crashed #47

Closed Hocnonsense closed 4 years ago

Hocnonsense commented 4 years ago

My command line:

phylophlan_write_default_configs.sh
phylophlan \
    -i ../../F-06-MAG/03_modify/7_final/ \
    -d phylophlan --diversity high -f supertree_aa.cfg \
    --genome_extension .fa \
    --maas ~/Software/anaconda3/envs/phylophlan/lib/python3.9/site-packages/phylophlan/phylophlan_substitution_models/phylophlan.tsv \
    --verbose

stderr:


[e] Command '['~/Software/anaconda3/envs/phylophlan/bin/FastTree', '-quiet', '-pseudo', '-spr', '4', '-mlacc', '2', '-slownni', '-fastest', '-no2nd', '-mlnni', '4', '-lg', '-out', '~/Work/2020-09-MgAffect/Analyze/phylophlan/7_final_phylophlan/tmp/gene_tree1/p0197.tre', '7_final_phylophlan/tmp/sub/p0197.aln']' returned non-zero exit status 1.

[e] error while building gene tree
    command_line: ~/Software/anaconda3/envs/phylophlan/bin/FastTree -quiet -pseudo -spr 4 -mlacc 2 -slownni -fastest -no2nd -mlnni 4 -lg -out 7_final_phylophlan/tmp/gene_tree1/p0197.tre 7_final_phylophlan/tmp/sub/p0197.aln
           stdin: None
          stdout: None
             env: {'CONDA_SHLVL': '3', 'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.m4a=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.oga=01;36:*.opus=01;36:*.spx=01;36:*.xspf=01;36:', 'CONDA_EXE': '~/Software/anaconda3/bin/conda', 'SSH_CONNECTION': '192.168.137.1 63214 192.168.137.128 22', '_': '~/Software/anaconda3/envs/phylophlan/bin/phylophlan', 'LANG': 'en_US.UTF-8', 'HISTCONTROL': 'ignoredups', 'HOSTNAME': 'localhost.localdomain', 'OLDPWD': '../../', 'COLORTERM': 'truecolor', 'CONDA_PREFIX': '~/Software/anaconda3/envs/phylophlan', 'DOTNET_ROOT': '/usr/lib64/dotnet', '_CE_M': '', 'XDG_SESSION_ID': '2', 
'DOTNET_BUNDLE_EXTRACT_BASE_DIR': '~/.cache/dotnet_bundle_extract', 'USER': 'hwrn', 'CONDA_PREFIX_1': '~/Software/anaconda3', 'CONDA_PREFIX_2': '~/Software/anaconda3/envs/mylib', 'CONDA_PYTHON_EXE': '~/Software/anaconda3/bin/python', 'VSCODE_GIT_ASKPASS_NODE': '~/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node', 'TERM_PROGRAM': 'vscode', 'SSH_CLIENT': '192.168.137.1 63214 22', 'TERM_PROGRAM_VERSION': '1.50.1', 'TMUX': '/tmp/tmux-1000/default,129144,0', 'XDG_DATA_DIRS': '~/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share', '_CE_CONDA': '', 'VSCODE_IPC_HOOK_CLI': '/run/user/1000/vscode-ipc-f567ba04-d1be-40de-9dbe-205a254be7c3.sock', 'CONDA_PROMPT_MODIFIER': '(phylophlan) ', 'MAIL': '/var/spool/mail/hwrn', 'VSCODE_GIT_ASKPASS_MAIN': '~/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/extensions/git/dist/askpass-main.js', 'SHELL': '/bin/bash', 'TERM': 'screen', 'TMUX_PANE': '%0', 'SHLVL': '4', 'VSCODE_GIT_IPC_HANDLE': '/run/user/1000/vscode-git-723a0c4359.sock', 'LOGNAME': 'hwrn', 'DBUS_SESSION_BUS_ADDRESS': 'unix:path=/run/user/1000/bus', 'GIT_ASKPASS': '~/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/extensions/git/dist/askpass.sh', 'XDG_RUNTIME_DIR': '/run/user/1000', 'PATH': '~/Software/anaconda3/envs/phylophlan/bin:~/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/bin:~/Software/anaconda3/condabin:~/.local/bin:~/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:~/.dotnet/tools', 'CONDA_DEFAULT_ENV': 'phylophlan', 'HISTSIZE': '1000', 'LESSOPEN': '||/usr/bin/lesspipe.sh %s'}

[e] Command '['~/Software/anaconda3/envs/phylophlan/bin/FastTree', '-quiet', '-pseudo', '-spr', '4', '-mlacc', '2', '-slownni', '-fastest', '-no2nd', '-mlnni', '4', '-lg', '-out', '7_final_phylophlan/tmp/gene_tree1/p0197.tre', '7_final_phylophlan/tmp/sub/p0197.aln']' returned non-zero exit status 1.

[e] error while building gene tree
    {'program_name': '~/Software/anaconda3/envs/phylophlan/bin/FastTree', 'params': '-quiet -pseudo -spr 4 -mlacc 2 -slownni -fastest -no2nd -mlnni 4 -lg', 'output': '-out', 'command_line': '#program_name# #params# #output# #input#'}
    PROTCATLG
    7_final_phylophlan/tmp/sub/p0197.aln
    7_final_phylophlan/tmp/gene_tree1
    p0197.tre

[e] Command '['~/Software/anaconda3/envs/phylophlan/bin/FastTree', '-quiet', '-pseudo', '-spr', '4', '-mlacc', '2', '-slownni', '-fastest', '-no2nd', '-mlnni', '4', '-lg', '-out', '7_final_phylophlan/tmp/gene_tree1/p0197.tre', '7_final_phylophlan/tmp/sub/p0197.aln']' returned non-zero exit status 1.

[e] build_gene_tree crashed

Then I tried

FastTree -quiet -pseudo -spr 4 -mlacc 2 -slownni -fastest -no2nd -mlnni 4 -lg -out 7_final_phylophlan/tmp/gene_tree1/p0197.tre 7_final_phylophlan/tmp/sub/p0197.aln

stderr:

Non-unique name 'maxbin2_107.032' in the alignment

In 7_final_phylophlan/tmp/sub/p0197.aln (first 20 lines):

>76_sub
GTRLKMIFYLMMSGIPPGLAAEKDR
>55_sub22
SPERHEGHHLGHAPVPAGL------
>maxbin2_107.027_sub
GTRLKMIYYLMLAGIPPGIAAEKDR
>82_sub
GTRLKMIYYLMLAGIPPGLAAEKDR
>maxbin2_107.013_sub
GTRLKMIYYLMLAGIPPGLAAEKDR
>maxbin2_107.028_sub
GTRLKMVYYLMLAGIPPGLAAEKDR
>maxbin2_107.022
GARLKMIYYLMLAGIPPGLAAEKDR
>maxbin2_107.032
GTRLKMIYYLMMADIPPGLAAEKDR
>maxbin2_107.032
GARVREGHHLGHPTLAHYLRARHDR
>maxbin2_107.026
GTRLKMAYYLMLAGIPPGLAAEKDR
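
FastTree's message means that two sequences in the alignment share the same header. A quick way to list every repeated header is sketched below. The function name `duplicate_fasta_names` is hypothetical (it is not part of PhyloPhlAn or FastTree), and the sketch compares whole header lines, which may not match exactly how FastTree parses names.

```shell
# List FASTA headers that occur more than once in a file.
# Prints one duplicated name per line; prints nothing if all names are unique.
duplicate_fasta_names() {
    # keep only header lines, group identical ones, report the repeated ones,
    # then strip the leading '>'
    grep '^>' "$1" | sort | uniq -d | sed 's/^>//'
}

# Example call, using the path from the report above:
# duplicate_fasta_names 7_final_phylophlan/tmp/sub/p0197.aln
```

On the excerpt shown above, this would report maxbin2_107.032, the name FastTree complains about.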

What should I do to solve this error? Also, is there a way to know which files are generated by which external software command?

Thanks for your program, and thanks for your advice!

fasnicar commented 4 years ago

Hi, thanks for reporting this. Can you please provide the ls of the input folder ../../F-06-MAG/03_modify/7_final/?

Many thanks, Francesco

Hocnonsense commented 4 years ago

Sure

$ ls -l ../../F-06-MAG/03_modify/7_final/
total 387108
-rw-rw-rw- 1 clsxx clsxx 4969949 Sep 22 15:58 17.fa
-rw-rw-rw- 1 clsxx clsxx 4574899 Sep 22 15:58 4.fa
-rw-rw-rw- 1 clsxx clsxx 6726168 Sep 26 17:18 55_sub22.fa
-rw-rw-rw- 1 clsxx clsxx 3588359 Sep 22 15:58 76_sub.fa
-rw-rw-rw- 1 clsxx clsxx 7843351 Sep 22 16:03 79_sub1.fa
-rw-rw-rw- 1 clsxx clsxx 3594839 Sep 22 15:58 82_sub.fa
-rw-rw-rw- 1 clsxx clsxx 4215534 Sep 22 16:03 91_sub1.fa
-rw-rw-rw- 1 clsxx clsxx 3207749 Sep 22 15:58 maxbin2_107.005.fa
-rw-rw-rw- 1 clsxx clsxx 4620946 Sep 22 15:58 maxbin2_107.013_sub.fa
-rw-rw-rw- 1 clsxx clsxx 3821485 Sep 22 15:57 maxbin2_107.022.fa
-rw-rw-rw- 1 clsxx clsxx 5871658 Sep 22 15:58 maxbin2_107.025.fa
-rw-rw-rw- 1 clsxx clsxx 4785490 Sep 22 15:58 maxbin2_107.026.fa
-rw-rw-rw- 1 clsxx clsxx 2820982 Sep 22 15:58 maxbin2_107.027_sub.fa
-rw-rw-rw- 1 clsxx clsxx 3753804 Sep 22 15:58 maxbin2_107.028_sub.fa
-rw-rw-rw- 1 clsxx clsxx 5922310 Sep 22 15:58 maxbin2_107.031.fa
-rw-rw-rw- 1 clsxx clsxx 2627202 Sep 22 15:58 maxbin2_107.032.fa
-rw-rw-rw- 1 clsxx clsxx 3339858 Sep 22 15:57 maxbin2_107.034.fa
-rw-rw-rw- 1 clsxx clsxx 2995234 Sep 22 15:58 maxbin2_107.037.fa
-rw-rw-rw- 1 clsxx clsxx 4174044 Sep 22 15:58 maxbin2_107.039.fa
-rw-rw-rw- 1 clsxx clsxx 4347604 Sep 22 15:58 maxbin2_107.041.fa
-rw-rw-rw- 1 clsxx clsxx 5140145 Sep 22 15:58 maxbin2_107.044_sub.fa
-rw-rw-rw- 1 clsxx clsxx 4192306 Sep 22 15:58 maxbin2_107.045.fa
-rw-rw-rw- 1 clsxx clsxx 6774352 Sep 22 15:58 maxbin2_107.046.fa
-rw-rw-rw- 1 clsxx clsxx 2701591 Sep 22 16:03 maxbin2_107.050_sub1.fa
-rw-rw-rw- 1 clsxx clsxx 7787833 Sep 22 15:58 maxbin2_107.053.fa
-rw-rw-rw- 1 clsxx clsxx 3968547 Sep 22 15:58 maxbin2_107.059.fa
-rw-rw-rw- 1 clsxx clsxx 2961056 Sep 22 15:58 maxbin2_107.064.fa
-rw-rw-rw- 1 clsxx clsxx 4050255 Sep 22 15:58 maxbin2_107.065_sub.fa
-rw-rw-rw- 1 clsxx clsxx 2832060 Sep 22 15:58 maxbin2_107.066.fa
-rw-rw-rw- 1 clsxx clsxx 3229901 Sep 26 20:21 maxbin2_107.068_sub111.fa
-rw-rw-rw- 1 clsxx clsxx 5625707 Sep 22 15:58 maxbin2_107.079.fa
-rw-rw-rw- 1 clsxx clsxx 6417312 Sep 22 15:58 maxbin2_107.081.fa
-rw-rw-rw- 1 clsxx clsxx 5030038 Sep 22 15:58 maxbin2_107.083.fa
-rw-rw-rw- 1 clsxx clsxx 5419043 Sep 22 15:58 maxbin2_107.086.fa
-rw-rw-rw- 1 clsxx clsxx 4474221 Sep 22 15:58 maxbin2_107.088.fa
-rw-rw-rw- 1 clsxx clsxx 6967323 Sep 22 15:58 maxbin2_107.090.fa
-rw-rw-rw- 1 clsxx clsxx 6306084 Sep 22 15:58 maxbin2_107.091.fa
-rw-rw-rw- 1 clsxx clsxx 4945034 Sep 22 15:58 maxbin2_107.092.fa
-rw-rw-rw- 1 clsxx clsxx 4079006 Sep 22 15:58 maxbin2_107.093_sub.fa
-rw-rw-rw- 1 clsxx clsxx 3956726 Sep 22 15:58 maxbin2_107.097.fa
-rw-rw-rw- 1 clsxx clsxx 3174916 Sep 22 15:57 maxbin2_107.104_sub.fa
-rw-rw-rw- 1 clsxx clsxx 2735013 Sep 22 15:57 maxbin2_107.113_sub.fa
-rw-rw-rw- 1 clsxx clsxx 5600278 Sep 26 20:21 maxbin2_107.115_sub31.fa
-rw-rw-rw- 1 clsxx clsxx 5439882 Sep 22 15:58 maxbin2_107.116.fa
-rw-rw-rw- 1 clsxx clsxx 4313883 Sep 22 15:58 maxbin2_107.118.fa
-rw-rw-rw- 1 clsxx clsxx 4449332 Sep 22 15:57 maxbin2_107.120.fa
-rw-rw-rw- 1 clsxx clsxx 5251748 Sep 22 15:58 maxbin2_107.122.fa
-rw-rw-rw- 1 clsxx clsxx 3147121 Sep 26 17:18 maxbin2_107.124_sub11.fa
-rw-rw-rw- 1 clsxx clsxx 2542677 Sep 22 15:58 maxbin2_107.126_sub.fa
-rw-rw-rw- 1 clsxx clsxx 4552473 Sep 22 15:57 maxbin2_107.127.fa
-rw-rw-rw- 1 clsxx clsxx 6726549 Sep 22 15:57 metabat2_60_60.103.fa
-rw-rw-rw- 1 clsxx clsxx 4250290 Sep 22 15:58 metabat2_60_60.120.fa
-rw-rw-rw- 1 clsxx clsxx 7551529 Sep 22 15:58 metabat2_60_60.151.fa
-rw-rw-rw- 1 clsxx clsxx 5187822 Sep 22 15:58 metabat2_60_60.161.fa
-rw-rw-rw- 1 clsxx clsxx 6682918 Sep 22 15:57 metabat2_60_60.34.fa
-rw-rw-rw- 1 clsxx clsxx 4119739 Sep 22 15:58 metabat2_60_60.41.fa
-rw-rw-rw- 1 clsxx clsxx 2682118 Sep 22 15:58 metabat2_60_60.43.fa
-rw-rw-rw- 1 clsxx clsxx 5119635 Sep 22 15:58 metabat2_60_60.45.fa
-rw-rw-rw- 1 clsxx clsxx 6553814 Sep 22 15:58 metabat2_60_60.73.fa
-rw-rw-rw- 1 clsxx clsxx 3014942 Sep 22 15:58 metabat2_60_60.97.fa
-rw-rw-rw- 1 clsxx clsxx 5161534 Sep 22 15:58 metabat2_60_75.102_sub.fa
-rw-rw-rw- 1 clsxx clsxx 3275083 Sep 22 15:58 metabat2_60_75.112.fa
-rw-rw-rw- 1 clsxx clsxx 5077128 Sep 22 15:57 metabat2_60_75.24.fa
-rw-rw-rw- 1 clsxx clsxx 6127938 Sep 22 15:58 metabat2_60_75.77.fa
-rw-rw-rw- 1 clsxx clsxx 3205613 Sep 22 15:58 metabat2_60_75.86.fa
-rw-rw-rw- 1 clsxx clsxx 3947157 Sep 22 15:57 metabat2_60_75.99.fa
-rw-rw-rw- 1 clsxx clsxx 5109839 Sep 22 15:58 metabat2_60_90.10.fa
-rw-rw-rw- 1 clsxx clsxx 3792995 Sep 22 15:58 metabat2_60_90.114.fa
-rw-rw-rw- 1 clsxx clsxx 3170997 Sep 22 15:58 metabat2_60_90.117.fa
-rw-rw-rw- 1 clsxx clsxx 4130257 Sep 22 15:57 metabat2_60_90.14.fa
-rw-rw-rw- 1 clsxx clsxx 2949459 Sep 22 15:58 metabat2_60_90.154.fa
-rw-rw-rw- 1 clsxx clsxx 3073912 Sep 22 15:58 metabat2_60_90.162.fa
-rw-rw-rw- 1 clsxx clsxx 2933491 Sep 22 15:57 metabat2_60_90.16.fa
-rw-rw-rw- 1 clsxx clsxx 4019431 Sep 22 15:58 metabat2_60_90.17.fa
-rw-rw-rw- 1 clsxx clsxx 5494189 Sep 22 15:57 metabat2_60_90.4.fa
-rw-rw-rw- 1 clsxx clsxx 3426629 Sep 22 15:58 metabat2_60_90.54.fa
-rw-rw-rw- 1 clsxx clsxx 4705852 Sep 22 15:58 metabat2_60_90.68.fa
-rw-rw-rw- 1 clsxx clsxx 4686060 Sep 22 15:58 metabat2_60_90.73.fa
-rw-rw-rw- 1 clsxx clsxx 3705816 Sep 22 15:57 metabat2_60_90.90.fa
-rw-rw-rw- 1 clsxx clsxx 1857224 Sep 22 15:58 metabat2_75_60.68.fa
-rw-rw-rw- 1 clsxx clsxx 4968525 Sep 22 15:57 metabat2_75_90.107.fa
-rw-rw-rw- 1 clsxx clsxx 4190041 Sep 22 15:58 metabat2_75_90.121_sub.fa
-rw-rw-rw- 1 clsxx clsxx 3210737 Sep 22 15:58 metabat2_75_90.150.fa
-rw-rw-rw- 1 clsxx clsxx 5440898 Sep 22 15:58 metabat2_90_60.124.fa
-rw-rw-rw- 1 clsxx clsxx 6539457 Sep 22 15:58 metabat2_90_75.111.fa
-rw-rw-rw- 1 clsxx clsxx 4713207 Sep 22 15:57 metabat2_90_90.14.fa
-rw-rw-rw- 1 clsxx clsxx 3786599 Sep 22 15:58 metabat2_90_90.175.fa
-rw-rw-rw- 1 clsxx clsxx 3190771 Sep 22 15:58 metabat2_90_90.197.fa
-rw-rw-rw- 1 clsxx clsxx 2558451 Sep 22 15:57 metabat2_90_90.5.fa

These .fa files are MAGs binned from a metagenome (reads -> assembly -> mapping and binning with MetaBAT and DAS Tool; all MAGs were quality-checked with CheckM at completeness >= 70% and contamination <= 10%). I think the error may be caused by multiple marker genes occurring in a single .fa file.

fasnicar commented 4 years ago

Many thanks! Multiple contigs in the same FASTA file should not be a problem. Can you please report the PhyloPhlAn version? Also, if you can provide me with the input folder, I can do some more debugging on my side.

Many thanks, Francesco

Hocnonsense commented 4 years ago

Sure. I tried a smaller subset of 6 bins, but the same error occurs. My command:

phylophlan \
    -i test \
    -d phylophlan --diversity high -f supertree_aa.cfg \
    --genome_extension .fa \
    --nproc 5 \
    --maas /home/hwrn/Software/anaconda3/envs/phylophlan/lib/python3.9/site-packages/phylophlan/phylophlan_substitution_models/phylophlan.tsv \
    --verbose \
    > test1.log 2>&1

The zipped file can be downloaded here

Thanks!

fasnicar commented 4 years ago

Hi and thanks! I did some debugging and fixed the gene tree pipeline (available with the latest commit 804648deced5ede3d0a9286a20d93a9df5850a02). The package in Bioconda is not yet available, so if you could get PhyloPhlAn directly from the repository and test with your larger set of inputs, that would be great.

Many thanks, Francesco
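
Until the Bioconda package is updated, the fix has to be installed from source. The following is only a sketch under assumptions: that the repository is hosted at github.com/biobakery/phylophlan, that a checkout can be installed with pip, and that the target environment is already activated. The function name `install_phylophlan_at_commit` and its optional directory argument are hypothetical.

```shell
# Clone PhyloPhlAn, check out a given commit, and install it into the
# currently active Python environment.
# Usage: install_phylophlan_at_commit <commit> [target_dir]
install_phylophlan_at_commit() {
    commit="$1"
    dest="${2:-phylophlan}"   # clone directory, defaults to ./phylophlan
    git clone https://github.com/biobakery/phylophlan.git "$dest" &&
        # run the remaining steps in a subshell so the caller's working
        # directory is left unchanged
        ( cd "$dest" &&
          git checkout "$commit" &&
          python -m pip install . )
}

# Example call with the commit named above:
# install_phylophlan_at_commit 804648deced5ede3d0a9286a20d93a9df5850a02
```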

Hocnonsense commented 4 years ago

Many thanks, it ran perfectly!