Dee-chen / Tree2gd

GNU General Public License v3.0

Running error in test #5

Open lfp-a opened 1 year ago

lfp-a commented 1 year ago

2022-10-26 20:22:43: INFO ALL step1 blastp has done.
2022-10-26 20:22:43: INFO Start sorting results..
2022-10-26 20:22:43: INFO pep all2all diamond done.
2022-10-26 20:22:43: INFO The Step2 output dir :/data/wanglab/liufangpu/test/Tree2gd_test_out/step2.MCL does not exist,will create it.
2022-10-26 20:22:43: INFO Writing step2 script files...
2022-10-26 20:22:43: INFO Start runing step2 MCL: /data/wanglab/liufangpu/test/Tree2gd_test_out/step2.MCL/step2.sh..
2022-10-26 20:22:44: INFO step2 mcl has done.
2022-10-26 20:22:44: INFO The Step3 output dir :/data/wanglab/liufangpu/test/Tree2gd_test_out/step3.dollop does not exist,will create it.
2022-10-26 20:22:44: INFO Writing step3 script files...
2022-10-26 20:22:44: INFO Start runing step3 dollop: /data/wanglab/liufangpu/test/Tree2gd_test_out/step3.dollop/step3.sh..
2022-10-26 20:22:44: INFO step3 dollop has done.
2022-10-26 20:22:44: INFO The Step4 output dir :/data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD does not exist,will create it.
2022-10-26 20:22:44: INFO Writing step4 script files...
2022-10-26 20:22:44: INFO Reading mcl output file
2022-10-26 20:22:44: INFO gene_idmap read done.
2022-10-26 20:22:44: INFO clusters with at least 4 taxa read
2022-10-26 20:22:44: INFO Reading the fasta file /data/wanglab/liufangpu/test/Tree2gd_test_out/step1.blastp/all_sample.pep.faa
2022-10-26 20:22:44: INFO FASTA file selection has been completed ,according to MCL results.
2022-10-26 20:22:44: INFO Start runing step4 fa2tree.
2022-10-26 20:22:50: INFO step4 fa2tree has done.
2022-10-26 20:22:50: INFO All sequences have completed the construction of the tree.
2022-10-26 20:22:50: INFO Start WGD calculation.
2022-10-26 20:22:50: INFO step4 Tree2GD has done.
2022-10-26 20:22:50: INFO The Step5 output dir :/data/wanglab/liufangpu/test/Tree2gd_test_out/step5.KaKs does not exist,will create it.
2022-10-26 20:22:50: INFO Writing step5 script files...
2022-10-26 20:22:50: INFO Start KaKs calculation.
2022-10-26 20:22:50: INFO Readfile
2022-10-26 20:22:50: INFO Make dir
2022-10-26 20:22:50: INFO Start gene pairs kaks calculation.
2022-10-26 20:22:50: INFO Gene pairs kaks calculation finish.
2022-10-26 20:22:50: INFO Start summarize the Ks results of each species.
2022-10-26 20:22:50: INFO ALL step5 kaks has done.
2022-10-26 20:22:50: INFO The Step6 output dir :/data/wanglab/liufangpu/test/Tree2gd_test_out/step6.plot_summary does not exist,will create it.
2022-10-26 20:22:51: INFO Start step6 summary plot...
2022-10-26 20:22:51: INFO R plot done.
2022-10-26 20:22:51: INFO Start html plot ...
Traceback (most recent call last):
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/bin/Tree2gd", line 8, in <module>
    sys.exit(main())
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.8/site-packages/tree2gd_main.py", line 294, in main
    run_plot(step6out,args,cf)
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.8/site-packages/tree2gd/plot.py", line 62, in run_plot
    file_names = os.listdir("html_plot_in")
FileNotFoundError: [Errno 2] No such file or directory: 'html_plot_in'

I encountered this error when using the test data. The test command is:

Tree2gd_test -t 4 --config config.ini

I don't know how to solve this problem. I need your help. Thank you

Dee-chen commented 1 year ago

Thank you for your feedback. Judging from the error, it is most likely caused by the R packages needed for plotting not being installed. Tree2gd automatically downloads and installs several R dependency packages in step 6 the first time the test is run. To fix this, first make sure your server can connect to the Internet. Then run the result-summary R script manually (according to your log, it should be "/data/wanglab/liufangpu/test/Tree2gd_test_out/step6.plot_summary/Tree2GD_draw.R") to check whether any dependency packages failed to install or whether other errors occur. If it is a package dependency problem, you can try switching download mirrors or installing the packages yourself. If there are other errors, please report them back to me.
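For anyone following along, here is a minimal Python sketch of the "run the R script manually and look at the errors" step. It is not part of Tree2gd; the helper name `run_and_report` is made up for illustration, and it assumes `Rscript` is on your PATH (if it is not, `subprocess.run` raises `FileNotFoundError`).

```python
import subprocess


def run_and_report(cmd):
    """Run a command and return (exit_code, stderr_text).

    A non-zero exit code is returned rather than raised, so the caller
    can inspect the error text (e.g. a failed R package installation).
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


# Hypothetical usage; replace the path with the Tree2GD_draw.R in your
# own step6.plot_summary directory:
#
#   code, err = run_and_report(["Rscript", "Tree2GD_draw.R"])
#   if code != 0:
#       print(err)
```

A non-zero exit code together with the captured stderr should show whether the failure is a missing package or something else.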

lfp-a commented 1 year ago

Thank you very much for your help. I ran Tree2GD_draw.R and it told me that I don't have BiocManager installed. I ran it again and it gave me an error like this:

testing if installed package can be loaded from temporary location
testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path

The downloaded source packages are in ‘/tmp/RtmpdNydhT/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
ggtree v3.4.4 For help: https://yulab-smu.top/treedata-book/

If you use the ggtree package suite in published research, please cite the appropriate paper(s):

Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution. 2017, 8(1):28-36. doi:10.1111/2041-210X.12628

LG Wang, TTY Lam, S Xu, Z Dai, L Zhou, T Feng, P Guo, CW Dunn, BR Jones, T Bradley, H Zhu, Y Guan, Y Jiang, G Yu. treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Molecular Biology and Evolution. 2020, 37(2):599-603. doi: 10.1093/molbev/msz240

Guangchuang Yu. Using ggtree to visualize data on tree-like structures. Current Protocols in Bioinformatics. 2020, 69:e96. doi:10.1002/cpbi.96

Error in $<-.data.frame(*tmp*, describe, value = " %") : replacement has 1 row, data has 0
Calls: $<- -> $<-.data.frame
Execution halted

After checking the output directory, I found that the error is in step 4: the Tree2gd_test_out/step4.WGD/iqtree_out directory is empty. When I ran step4.sh, the following error appeared:

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle This software is donated to the public domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

ERROR Cannot open '/data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/all_fa/*.fa' errno=2

Invalid "-B" option.
--species=: 2
--bp=: 50
--sub_bp=: 0
--split_tree=: false
--quick_file=:
--parser_file=:
--paml=: false
--omega=:
--genome=:
--isoform=:
--save_tree=: true
--deepvar=: 1
--root=: MAX_MIX
load tree list: /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/tree.list size=0
load idmap: /data/wanglab/liufangpu/test/Tree2gd_test_out/step1.blastp/all_sample2fa.list
827 genes loaded for 7 species

phyto_5 (index=12, phy=0) Cpapaya (index=0, phy=0) phyto_4 (index=11, phy=0) phyto_1 (index=5, phy=0) | phyto_0 (index=3, phy=0) | | Athaliana (index=1, phy=0) | | Alyrata (index=2, phy=0) | Crubella (index=4, phy=0) phyto_3 (index=10, phy=0) Esalsugineum (index=6, phy=0) phyto_2 (index=9, phy=0) Brapa (index=7, phy=0) Tparvula (index=8, phy=0) outgroup: Cpapaya loading 0 trees set index for 0 trees identifying gene duplications validate and output files -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//summary.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//low_copy_orthologs.info /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//low_copy_orthologs.quick -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.gene_pairs.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//ortholog.gene_pairs.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.single_gd_pattern.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.single_gd_lineage.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.ancestral_gd_retention.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.recent_gd_retention.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//gd.median_ks.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//GDtype_stat.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//summarytable.txt -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//Phtree.nwk -> rooted trees -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//clusters.ancestors.txt -> isoform_candidates -> /data/wanglab/liufangpu/test/Tree2gd_test_out/step4.WGD/Tree2GD_out//sister_clades.txt

Thank you again for your help
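Since the pipeline log reported every step as "has done" even though iqtree_out was empty, a quick check of the intermediate directories can help locate the first step that really failed. The following is only a sketch; `empty_or_missing` is a hypothetical helper, not a Tree2gd function, and the directory names are the ones that appear in the logs above.

```python
from pathlib import Path


def empty_or_missing(out_root, subdirs):
    """Return the subdirs of out_root that do not exist or contain no entries."""
    problems = []
    for name in subdirs:
        d = Path(out_root) / name
        # A directory counts as a problem if it is absent or has no files in it.
        if not d.is_dir() or not any(d.iterdir()):
            problems.append(name)
    return problems


# Hypothetical usage against a test run:
#
#   empty_or_missing(
#       "Tree2gd_test_out",
#       ["step4.WGD/all_fa", "step4.WGD/muscle_out", "step4.WGD/iqtree_out"],
#   )
```

The first directory in the returned list points at the earliest stage whose output never got written.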

Dee-chen commented 1 year ago

After checking and testing: in the latest version, V1.0.41, I mistakenly packaged IQ-TREE 1 instead of the IQ-TREE 2 that Tree2gd uses, which caused the errors with the -B and -b parameters. I'm very sorry about that. I have now uploaded the correct IQ-TREE version (https://github.com/Dee-chen/Tree2gd/tree/master/tree2gd/software/iqtree). You can download it and replace /[YOUR_PATH]/python3.9/site-packages/tree2gd/software/iqtree to solve the problem quickly.

I will fix the bugs I've collected so far and upload a complete new version of Tree2gd in a few days. After that, you will be able to use Tree2gd normally by updating through PyPI.

Again, I'm sorry for the trouble this caused with your Tree2gd analysis.
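To confirm which IQ-TREE ended up in the software directory after replacing the binary, one option is to read the major version from the banner line IQ-TREE prints at startup. This is a rough sketch: the helper name `iqtree_major_version` is hypothetical, and the regular expression assumes a banner of the form "IQ-TREE multicore version 2.1.2 ...", as seen in the log later in this thread.

```python
import re


def iqtree_major_version(banner):
    """Extract the major version number from an IQ-TREE banner line, or None."""
    match = re.search(r"IQ-TREE\b.*?\bversion\s+(\d+)\.", banner)
    return int(match.group(1)) if match else None


# Banner text copied from the log in this thread.
banner = (
    "IQ-TREE multicore version 2.1.2 COVID-edition "
    "for Linux 64-bit built Oct 22 2020"
)
print(iqtree_major_version(banner))
```

A result of 1 would suggest the old binary is still the one being picked up.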

lfp-a commented 1 year ago

Thank you very much for your help. I have replaced the iqtree file, but a new problem has arisen:

`2022-10-31 16:18:32: INFO Writing step2 script files...
2022-10-31 16:18:32: INFO Start runing step2 MCL: /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step2.MCL/step2.sh..
2022-10-31 16:18:32: INFO step2 mcl has done.
2022-10-31 16:18:32: INFO The Step3 output dir :/data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step3.dollop does not exist,will create it.
2022-10-31 16:18:32: INFO Writing step3 script files...
2022-10-31 16:18:32: INFO Start runing step3 dollop: /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step3.dollop/step3.sh..
2022-10-31 16:18:32: INFO step3 dollop has done.
2022-10-31 16:18:32: INFO The Step4 output dir :/data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD does not exist,will create it.
2022-10-31 16:18:32: INFO Writing step4 script files...
2022-10-31 16:18:32: INFO Reading mcl output file
2022-10-31 16:18:32: INFO gene_idmap read done.
2022-10-31 16:18:32: INFO clusters with at least 4 taxa read
2022-10-31 16:18:32: INFO Reading the fasta file /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step1.blastp/all_sample.pep.faa
2022-10-31 16:18:32: INFO FASTA file selection has been completed ,according to MCL results.
2022-10-31 16:18:32: INFO Start runing step4 fa2tree.
2022-10-31 16:20:02: INFO step4 fa2tree has done.
2022-10-31 16:20:02: INFO All sequences have completed the construction of the tree.
2022-10-31 16:20:02: INFO Start WGD calculation.
2022-10-31 16:20:02: INFO step4 Tree2GD has done.
2022-10-31 16:20:02: INFO The Step5 output dir :/data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step5.KaKs does not exist,will create it.
2022-10-31 16:20:02: INFO Writing step5 script files...
2022-10-31 16:20:02: INFO Start KaKs calculation.
2022-10-31 16:20:02: INFO Readfile
2022-10-31 16:20:02: INFO Make dir
2022-10-31 16:20:02: INFO Start gene pairs kaks calculation.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/kaks.py", line 84, in run_ks
    sub_sh(args[0],args[1],args[2],args[3],args[4],args[5],args[6])
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/kaks.py", line 123, in sub_sh
    Fasta2AXT(pair[2]+"-"+pair[3]+".filted.cds_aln",pair[2]+"-"+pair[3]+".cds_aln.axt")
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/kaks.py", line 137, in Fasta2AXT
    for s in read_fasta_file(input):
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/seq.py", line 63, in read_fasta_file
    fl = open(filename,"r")
FileNotFoundError: [Errno 2] No such file or directory: 'Carub.0004s0710.1.p-Carub.0004s2151.1.p.filted.cds_aln'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/bin/Tree2gd", line 33, in <module>
    sys.exit(load_entry_point('Tree2gd==1.0.38', 'console_scripts', 'Tree2gd')())
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd_main.py", line 268, in main
    run_kaks(sp_list,step1out,args,cf,step4out,step5out,gene_pairs_idmap)
  File "/home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/kaks.py", line 63, in run_kaks
    sh_pool.map(run_ks,arg_list)
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/data/wanglab/liufangpu/miniconda3/envs/tree2gd/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value

This looks like a step 4 problem, so I ran step 4 ("step4.sh") on its own, and it gave the following error:

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle This software is donated to the public domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

ERROR Cannot open '/data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/all_fa/*.fa' errno=2

IQ-TREE multicore version 2.1.2 COVID-edition for Linux 64-bit built Oct 22 2020
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

Host: DongLab (AVX512, FMA3, 503 GB RAM)
Command: /home/liufangpu/.local/lib/python3.9/site-packages/tree2gd/software/iqtree -B 1000 -m JTT+G4 -s /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/muscle_out/.aln -pre /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/iqtree_out/.aln.iqtree
Seed: 635505 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Oct 31 16:25:21 2022
Kernel: AVX+FMA - 1 threads (160 CPU cores detected)

HINT: Use -nt option to specify number of threads because your CPU has 160 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.

Reading alignment file /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/muscle_out/.aln ...
ERROR: File not found /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/muscle_out/.aln
--species=: 2
--bp=: 50
--sub_bp=: 0
--split_tree=: false
--quick_file=:
--parser_file=:
--paml=: false
--omega=:
--genome=:
--isoform=:
--save_tree=: true
--deepvar=: 1
--root=: MAX_MIX
load tree list: /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step4.WGD/tree.list size=99
load idmap: /data/wanglab/liufangpu/soft/Tree2gd-master/Tree2gd_test_out/step1.blastp/all_sample2fa.list
827 genes loaded for 7 species

phyto_5 (index=12, phy=0) Cpapaya (index=0, phy=0) phyto_4 (index=11, phy=0) phyto_1 (index=5, phy=0) | phyto_0 (index=3, phy=0) | | Athaliana (index=1, phy=0) | | Alyrata (index=2, phy=0) | Crubella (index=4, phy=0) phyto_3 (index=10, phy=0) Esalsugineum (index=6, phy=0) phyto_2 (index=9, phy=0) Brapa (index=7, phy=0) Tparvula (index=8, phy=0) outgroup: Cpapaya loading 99 trees The file cluster10.iqtree.contree does not exist! `

Dee-chen commented 1 year ago

Sorry for the late reply. The error you reported is most likely due to trimAl failing to install. As of last week's V1.0.43 update, a pre-compiled trimAl is bundled with the software. Please update and try again.

PS. Because a large number of gene pairs go through the KaKs calculation in this step, Tree2gd runs it in parallel across multiple workers. As a result, the loop-based step4.sh script is no longer valid and cannot be run on its own. Thank you for pointing this out. I will continue to improve this part in the next version.
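Because the parallel KaKs step only surfaces the first missing file from inside a worker, it can be useful to list all missing alignment files up front. Below is a minimal sketch, not Tree2gd code: `missing_alignments` is a hypothetical helper, and the `<geneA>-<geneB>.filted.cds_aln` naming is taken from the traceback above.

```python
from pathlib import Path


def missing_alignments(pairs, workdir="."):
    """Return expected '<a>-<b>.filted.cds_aln' names that are absent from workdir."""
    missing = []
    for gene_a, gene_b in pairs:
        # File name pattern as it appears in the kaks.py traceback.
        name = f"{gene_a}-{gene_b}.filted.cds_aln"
        if not (Path(workdir) / name).is_file():
            missing.append(name)
    return missing


# Hypothetical usage with the gene pair from the traceback:
#
#   missing_alignments([("Carub.0004s0710.1.p", "Carub.0004s2151.1.p")])
```

If every pair shows up as missing, the filtering tool probably never ran at all, which would fit a failed trimAl installation.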

ardy20 commented 1 year ago

Dear All

Could you please update this thread with how to solve the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'html_plot_in'

zuodabin commented 1 year ago

Thanks for your software. I encountered the same problem. I'm looking forward to your new version!