Dee-chen / Tree2gd

GNU General Public License v3.0

ERROR at Step4 #7

Status: Closed (PRTGRWL closed 1 year ago)

PRTGRWL commented 1 year ago

I am using Tree2GD (a pipeline for WGD), version 1.0.41. The test data run fine, but with my experimental data I am hitting errors at step 2 and step 4. Because of a segmentation fault, the step 2 output files are not generated and are therefore not available for processing at step 4.

Error at step 4:

Traceback (most recent call last):
  File "/home/preeti.agarwal/.local/bin/Tree2gd", line 8, in <module>
    sys.exit(main())
  File "/home/preeti.agarwal/.local/lib/python3.8/site-packages/tree2gd_main.py", line 236, in main
    run_tree2gd(step1out,args,cf,step2out,step4out)
  File "/home/preeti.agarwal/.local/lib/python3.8/site-packages/tree2gd/wgd.py", line 118, in run_tree2gd
    fa_list=mcl2fasta(int(minimal_taxa),os.sep.join([step4out,'all_fa/']),step2out,step1out,args)
  File "/home/preeti.agarwal/.local/lib/python3.8/site-packages/tree2gd/wgd.py", line 24, in mcl2fasta
    with open(os.sep.join([step2out,'allmcl.out.OGs.group']),"rU") as infile:
FileNotFoundError: [Errno 2] No such file or directory: '/home/preeti.agarwal/monkey_pox/monkey_pox_ssr/tools_used/tree2gd/TREE_final_run/output_dir/step2.MCL/allmcl.out.OGs.group'

I then checked the step 2 log file and found a segmentation fault:

Error:

11:40:39: INFO The Step2 output dir :/lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL does not exist,will create it.
11:40:39: INFO Writing step2 script files...
11:40:39: INFO Start runing step2 MCL: /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/step2.sh..
11:41:45: DEBUG step2.sh stdout:
11:41:45: DEBUG step2.sh stderr: /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/step2.sh: line 1: 47979 Segmentation fault /home/preeti.agarwal/.local/lib/python3.8/site-packages/tree2gd/software/PhyloMCL -in /lustre/preeti.agarwal/again_t2g_run/output_dir/step1.blastp/all_blastp.out -threads 30 -length /lustre/preeti.agarwal/again_t2g_run/output_dir/step1.blastp/all_sample.faa.length -species /lustre/preeti.agarwal/again_t2g_run/output_dir/step1.blastp/all_sample2fa.list -tree /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/phymcl.input.tree -out /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/allmcl.out
11:41:45: INFO step2 mcl has done.
11:41:45: DEBUG step2.sh stdout:
11:41:45: DEBUG step2.sh stderr: mcl file : /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/allmcl.out.OGs.group
The file /lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/allmcl.out.OGs.group does not exist

Dee-chen commented 1 year ago

Hello, sorry for the delay in replying. Judging from your error output, the failure comes from the gene family clustering with PhyloMCL in step 2. I recommend running step2.sh separately and checking PhyloMCL's log files to narrow down the problem. In my experience, this problem is often caused by PhyloMCL's strict requirements for gene names in the FASTA files: special characters in the names prevent it from recognizing the sequences correctly.
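As a rough way to look for such gene names, the sketch below scans FASTA header lines and reports identifiers containing anything other than letters, digits, underscore, dot, or hyphen. That "safe" character set is only an assumption for illustration; this thread does not state which characters PhyloMCL actually rejects. The function name find_suspect_ids is hypothetical and not part of Tree2GD.

```python
import re

# ASSUMPTION: identifiers made only of letters, digits, '_', '.', and '-'
# are treated as safe. The character set PhyloMCL really accepts is not
# documented in this thread, so adjust the pattern as needed.
SAFE_ID = re.compile(r"^[A-Za-z0-9_.\-]+$")


def find_suspect_ids(fasta_lines):
    """Return (line_number, identifier) pairs for headers with unusual characters.

    `fasta_lines` is any iterable of text lines, e.g. an open file handle.
    """
    suspects = []
    for line_number, line in enumerate(fasta_lines, start=1):
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The identifier is the text between '>' and the first whitespace.
        fields = line[1:].split()
        seq_id = fields[0] if fields else ""
        if not SAFE_ID.match(seq_id):
            suspects.append((line_number, seq_id))
    return suspects
```

It could be applied to each input protein file, for example with `with open("species1.faa") as handle: print(find_suspect_ids(handle))`, where the file name is a placeholder. An empty list only means no header fell outside the assumed character set, not that PhyloMCL will accept the file.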

PRTGRWL commented 1 year ago

Hi, I will try running the second step separately and will post an update.
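For anyone repeating this check, here is a minimal sketch of running the step script on its own and testing whether it really succeeded. It assumes bash is available; the helper name run_step is hypothetical and not part of Tree2GD. It relies on the shell convention that a command killed by signal 11 (SIGSEGV) yields exit status 128 + 11 = 139, which bash passes on when that command is the last one in the script.

```python
import os
import subprocess


def run_step(script_path, expected_output):
    """Run one step script with bash and summarize how it ended."""
    result = subprocess.run(
        ["bash", script_path], capture_output=True, text=True
    )
    # 139 = 128 + 11: the last command in the script died from SIGSEGV.
    # -11 is what subprocess reports if bash itself was killed by SIGSEGV.
    crashed = result.returncode in (139, -11)
    return {
        "returncode": result.returncode,
        "segfault": crashed,
        # The step only counts as done if its output file was written.
        "output_exists": os.path.isfile(expected_output),
        "stderr": result.stderr,
    }
```

With the paths from the log above, the call would be `run_step("/lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/step2.sh", "/lustre/preeti.agarwal/again_t2g_run/output_dir/step2.MCL/allmcl.out.OGs.group")`. Checking both the exit status and the output file is useful here because the log reports "step2 mcl has done" even though PhyloMCL crashed.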