Dee-chen / Tree2gd

GNU General Public License v3.0
34 stars, 7 forks

running error during step5.KaKs #4

Open myBioFun opened 2 years ago

myBioFun commented 2 years ago

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/software/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/software/miniconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/software/miniconda3/lib/python3.7/site-packages/tree2gd/kaks.py", line 82, in run_ks
    sub_sh(args[0],args[1],args[2],args[3],args[4],args[5],args[6])
  File "/software/miniconda3/lib/python3.7/site-packages/tree2gd/kaks.py", line 119, in sub_sh
    axt2oneline((pair[2]+"-"+pair[3]+".cds_aln.axt"),(pair[2]+"-"+pair[3]+".one-line"))
  File "/software/miniconda3/lib/python3.7/site-packages/tree2gd/kaks.py", line 152, in axt2oneline
    mydict[header].append(line.strip())
UnboundLocalError: local variable 'header' referenced before assignment
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/software/miniconda3/bin/Tree2gd", line 33, in <module>
    sys.exit(load_entry_point('Tree2gd==1.0.38', 'console_scripts', 'Tree2gd')())
  File "/software/miniconda3/lib/python3.7/site-packages/tree2gd_main.py", line 267, in main
    run_kaks(sp_list,step1out,args,cf,step4out,step5out,gene_pairs_idmap)
  File "/software/miniconda3/lib/python3.7/site-packages/tree2gd/kaks.py", line 63, in run_kaks
    sh_pool.map(run_ks,arg_list)
  File "/software/miniconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/software/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
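
For readers unfamiliar with this error: an UnboundLocalError of this kind usually means a local variable is assigned only inside a conditional branch that was never taken before the variable was used. The thread does not show the source of axt2oneline, so the following is only a minimal sketch of the general pattern. The function names, the '>' header convention, and the input format are all assumptions made for illustration, not Tree2gd's actual code.

```python
def parse_naive(lines):
    """Hypothetical parser: 'header' is bound only after a '>' line is seen."""
    records = {}
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        else:
            # Raises UnboundLocalError if no header line came first.
            records[header].append(line.strip())
    return records


def parse_guarded(lines):
    """Same hypothetical parser, but with an explicit check and a clear error."""
    records = {}
    header = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is None:
            raise ValueError(f"sequence data before any header: {line.strip()!r}")
        else:
            records[header].append(line.strip())
    return records
```

Under these assumptions, an input file that is empty, truncated, or missing its first header line would trigger the error in the naive version, which may be why an upstream conversion failure surfaces only at this later step.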

ewcro commented 1 year ago

I also get a multiprocessing.pool.RemoteTraceback error at the same step:

"""
multiprocessing.pool.RemoteTraceback:
Traceback (most recent call last):
  File "/home/ewcro/anaconda3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/ewcro/anaconda3/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd/kaks.py", line 84, in run_ks
    sub_sh(args[0],args[1],args[2],args[3],args[4],args[5],args[6])
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd/kaks.py", line 123, in sub_sh
    Fasta2AXT(pair[2]+"-"+pair[3]+".filted.cds_aln",pair[2]+"-"+pair[3]+".cds_aln.axt")
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd/kaks.py", line 137, in Fasta2AXT
    for s in read_fasta_file(input):
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd/seq.py", line 63, in read_fasta_file
    fl = open(filename,"r")
FileNotFoundError: [Errno 2] No such file or directory: 'evm.model.supercontig_85.105-evm.model.supercontig_85.106.filted.cds_aln'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ewcro/anaconda3/bin/Tree2gd", line 8, in <module>
    sys.exit(main())
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd_main.py", line 268, in main
    run_kaks(sp_list,step1out,args,cf,step4out,step5out,gene_pairs_idmap)
  File "/home/ewcro/anaconda3/lib/python3.9/site-packages/tree2gd/kaks.py", line 63, in run_kaks
    sh_pool.map(run_ks,arg_list)
  File "/home/ewcro/anaconda3/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/ewcro/anaconda3/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
FileNotFoundError: [Errno 2] No such file or directory: 'evm.model.supercontig_85.105-evm.model.supercontig_85.106.filted.cds_aln'

How can I solve this?

Dee-chen commented 1 year ago

@ewcro Based on my earlier experience with the problem reported by myBioFun, these errors generally occur because the conversion fails when some of the input gene CDS sequences do not fully correspond to their protein sequences. This problem has largely been fixed in version 1.0.41 (updated on PyPI and just packaged on GitHub).

So I suggest trying two approaches:

  1. Update to the latest version of Tree2gd and rerun step 5 on its own. The new version skips and records the faulty gene pairs, so that KaKs is still calculated for the remaining gene pairs.
  2. Manually check the CDS and protein sequences of the failing gene pair (files 'evm.model.supercontig_85.105-evm.model.supercontig_85.106.filted.cds' and 'evm.model.supercontig_85.105-evm.model.supercontig_85.106.filted.pep') and verify that the CDS translates to the protein. In my experience, inconsistencies are often caused by stop codons.
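
The manual check in point 2 can be approximated with a short script. This is a minimal sketch, assuming the standard genetic code (NCBI translation table 1) and that the CDS and protein have already been read into plain strings. The function names translate and check_pair are hypothetical and are not part of Tree2gd.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS into protein; unknown or partial codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3))


def check_pair(cds, pep):
    """Return a list of problems found; an empty list means CDS and protein agree."""
    problems = []
    if len(cds) % 3 != 0:
        problems.append(f"CDS length {len(cds)} is not a multiple of 3")
    protein = translate(cds)
    # A single terminal stop codon is expected, so ignore it for the comparison.
    body = protein[:-1] if protein.endswith("*") else protein
    if "*" in body:
        problems.append(f"internal stop codon at residue {body.index('*') + 1}")
    if body != pep.rstrip("*").upper():
        problems.append("translated CDS differs from the supplied protein")
    return problems
```

For example, check_pair("ATGACTGAGTGA", "MTE") returns an empty list, while a CDS with a stop codon in the middle is reported with the position of the first internal stop.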

Thank you. If that still doesn't fix it, please reply again and I will look for other possible causes.

ewcro commented 1 year ago

Hello, thank you for the quick response. I already have version 1.0.41. I got this error with the Tree2gd_test example. Shouldn't the test example be free of inconsistencies? I can't find the filted files; where can I find them? Here are the files without "filted" in their names:

(base) cat evm.model.supercontig_85.105-evm.model.supercontig_85.106.cds

evm.model.supercontig_85.105 ATGACAATGGTTGATGTGCAAAACTGCACTATACTCTTCTTGGCTTGGCTAGCCTCCATTGCCTTGTTTCGAACCATCTGGACCAAGCTGAAGGCGAGGGCTCAACTTCCACCTGGTCCGCGACCCCTACCCATCATCGGAAACCAACATCTCCTCCGCCCAATAGCTCATCAAGCTCTTCACAAACTATCCCTAAACTATGGACCCTTGATGTGTCTCTTCTTTGGCTCAAAACCTTGTGTTGTTGTATCCTCTCCGGAAATGGCTCAAGAAGTTCTTAAAACCCATGAAACTCTTTTCTTGAACCGACCAAACATTGCCAACATCGACTACCTTACATATGGTTCGTCTGATTTAACAAGGGCACCTTATGGACCTTACTGGAAGTTCATGAAAAAGATTTGCATGTCTGAGCTTCTCAACGGTCAAACAGTAGAGCAGTTTCGACCAATTAGACAGGAGGAAGTGAACCGGTTTCTGCAGCAAATATTGAGTAAAGCCAAAGCAGGTGAGACATTTGATGTAAGAACAGGGATTGTGAGGCTGACAAGTAATGTGATTTCAAGGATGGCTTTGACACATAGGTCTTGGTTTGGCAATTACAAGACTGATGAGATGAGGCAATTAGTTGGAGAGATGAATGACCTTGTTGGAAAAGGTAGTTTATTAGATTTGATTTGGTTTCTTAAGAATCTGGATTTGCAGGGATTGCGTAAAAGACTCAAGAATGCTCGTGATCAGTACAACAATGTGATGGAAGGTATTATAAAAGAGTGTGAAGAGGTGAAGAGAAAAAGGAAGGAGTCCAGTAACGGAAATAACCCAAGAAAAGATGTACTTGAAAGCTTGCTTGATATTTATGAAGATTCGAGCTCAGAGATTAAACTGACAAGAGAAAATGTTAAGGCCTTAATAATGAACTTGCTTGGGGCAGGAACAGACTCAATTGCCTTTGCAATAGAATGGGCATTTTCAGAACTGATCAACAATCCAGAAGTGATGGAAAAGGCAAGAAAAGAGATTGATTCTGTAGTTGGGAAAACAAGAATTGTAGAGGAAACAGACATTCCTAACCTTCCCTACATTCAAGCAATAGTAAAAGAATCATTAAGGCTGCACCCCACTGGTCCCCTGTTTAGCAGAGAATCAAGCGAGGAATGCACCATCAAATGCTACAAAATTCCAGCAAAAACCAAGCTCATTGTTAATATATGGTCAATTAATAGAGACCCAAACCACTGGGAAAACCCACTGGAGTTTAGGCCAGAGAGATTTATAAATGAAGAATGGAATGAAAAGAAACAGTTCATGGACGTGAGAGGACAGTGTTTTAGTCTATTGCCTTTTGGGGCTGGAAGAAGAAGCTGTCCCGGCTCATTTCTAGCATTACAGGTTATGATGACGACCCTTGGTGCAATGGTTCAGTGCTTTGAATGGAAGGTTAATGGGGATGGTGAGAATGAGACTGTTGACATGGAAGAGGCACCTGGATTATCACTTAAGAGGGTTCACCCCTTAATCTGTGTTCCCGTTGCAAGGCTCAGTCCAATCCCCTTAGTGTGTAGTCAGGCAAACCCAATGTGA evm.model.supercontig_85.106 
ATGACTGAGCTTCTTAATGGTCGAATAGTAGATCAATTTCGGCCAATTAGACGGGTGGAGATGAACCAGTTTCTGCGGCTGATATTAAACAAAGCTAAAGCAGGTGAGGCATTTGATGTTGGAGCAGTGATTGCCAGGCTGACAAATAATATAATTTCAAGGATGGCTTCGACACAGAGGTGTTCTTGCACCGATGACAAGGCTGATAAGATAAGGAAATTGGTTGGAGAGCTGACTGATCTTGTTGGGAAATGTAGTTTATTAGATTTAATCTGGTTTGTTAAGAATCTGGATCTGCAGGGACTCAAGAATGCTCGTGATAGGTACGACAACATGATGGAAGAAATTATAAAGGAGCATGAAGAGGTAAAGAGAAAAAGCAAGGAGTCTGGCGACGGAAATCGCCCAACAAAAGATATACATGAAAGCTTGCTTGACATTTATGAAGAAGAGAGCTCAGAAATTAAACTGACAAGAGAAAATATTAAGGCCCTTATTATGAACATATTTGGGGCTGGGACAGACTCAACTTCCATCACAATCGAACGGGCATTTTCCGAGCTAATTAACAACCCCAAAGTGATGGAGAAGGCAAGAAAAGACATTGATTCTGTTGTTGGCAAAAGCAGAGTTGTAGAGGAAACAGATATTCCTAACCTTCCTTACATTCAGGCAATAATTCCAGCAAAAACTAAGCTCATTCTTAATGTGTGGTCACTTGGGAGAGACCCAAACCACTGGGAAAACCCACTGGAGTTTAGGCCAGAGAGATTTATAAATGAAGGATGGAATGAAAAGAAGCAGTTTATGGATGTGAGAGGACAGCATTTTGGTCTCTTGCCATTTGGGACTGGAAGAAGAAGCTGTCCGGGCTCGTTACTGGCATTACAAGTTATTATGACTACCCTTGCTACAATGATTCATTGCTTTGAGCGGAAGGTTAATGGGGGTGGTGAGAATGAGACTGTTGACATGGAAGAAGCACCTGGATTATCTCTTAAGAGGATCCGTCCATTGGTCTGTGCCCCAGTTGCAAGGCTCTATCCAATCCCCTCCGTTTTGAGTCGGCCAAACCCATTGTGA (base) cat evm.model.supercontig_85.105-evm.model.supercontig_85.106.pep evm.model.supercontig_85.105 MTMVDVQNCTILFLAWLASIALFRTIWTKLKARAQLPPGPRPLPIIGNQHLLRPIAHQALHKLSLNYGPLMCLFFGSKPCVVVSSPEMAQEVLKTHETLFLNRPNIANIDYLTYGSSDLTRAPYGPYWKFMKKICMSELLNGQTVEQFRPIRQEEVNRFLQQILSKAKAGETFDVRTGIVRLTSNVISRMALTHRSWFGNYKTDEMRQLVGEMNDLVGKGSLLDLIWFLKNLDLQGLRKRLKNARDQYNNVMEGIIKECEEVKRKRKESSNGNNPRKDVLESLLDIYEDSSSEIKLTRENVKALIMNLLGAGTDSIAFAIEWAFSELINNPEVMEKARKEIDSVVGKTRIVEETDIPNLPYIQAIVKESLRLHPTGPLFSRESSEECTIKCYKIPAKTKLIVNIWSINRDPNHWENPLEFRPERFINEEWNEKKQFMDVRGQCFSLLPFGAGRRSCPGSFLALQVMMTTLGAMVQCFEWKVNGDGENETVDMEEAPGLSLKRVHPLICVPVARLSPIPLVCSQANPM evm.model.supercontig_85.106 
MTELLNGRIVDQFRPIRRVEMNQFLRLILNKAKAGEAFDVGAVIARLTNNIISRMASTQRCSCTDDKADKIRKLVGELTDLVGKCSLLDLIWFVKNLDLQGLKNARDRYDNMMEEIIKEHEEVKRKSKESGDGNRPTKDIHESLLDIYEEESSEIKLTRENIKALIMNIFGAGTDSTSITIERAFSELINNPKVMEKARKDIDSVVGKSRVVEETDIPNLPYIQAIIPAKTKLILNVWSLGRDPNHWENPLEFRPERFINEGWNEKKQFMDVRGQHFGLLPFGTGRRSCPGSLLALQVIMTTLATMIHCFERKVNGGGENETVDMEEAPGLSLKRIRPLVCAPVARLYPIPSVLSRPNPL

Dee-chen commented 1 year ago

Hello, after checking, the problem you reported appears to be caused by trimAl not being installed. In v1.0.41 of Tree2gd, I added a step that uses trimAl to filter the sequence alignment results. Unlike the other software, a precompiled trimAl is not yet included in the Tree2gd installation package. For now, you can install trimAl with conda, or download the precompiled version I just uploaded and add it to your PATH (https://github.com/Dee-chen/Tree2gd/tree/master/tree2gd/software/trimal). After that, it should work normally.
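
A quick way to confirm that trimAl is visible before rerunning step 5 is to look it up on PATH. This is a minimal sketch. It assumes the executable is named trimal, which is the usual name of the trimAl binary but may differ on some installations, and find_tool is a hypothetical helper, not a Tree2gd function.

```python
import shutil


def find_tool(name="trimal"):
    """Return the absolute path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("trimal not found on PATH; install it or add its directory to PATH")
    else:
        print(f"trimal found at {path}")
```

If the lookup returns None in the same shell or job environment that runs Tree2gd, the missing .filted.cds_aln file reported above would be consistent with the trimAl filtering step never having run.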

I'll fix the bugs I've collected so far in the near future and upload a complete new version of Tree2gd in a few days. After that, you will be able to use Tree2gd normally by updating through PyPI.

Again, I'm sorry for causing you trouble with Tree2gd analysis.

Dee-chen commented 1 year ago

Hello, the problem you encountered has been fixed in v1.0.43. You can update via PyPI and then use Tree2gd normally.
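
To confirm which version is actually installed in the active environment, the package metadata can be queried directly. This is a minimal sketch, assuming Python 3.8 or later (importlib.metadata is not available in 3.7) and that the distribution is registered under the name Tree2gd, as in the pip command used elsewhere in this thread. The helper names are hypothetical.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(package="Tree2gd"):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.0.43' into (1, 0, 43) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("Tree2gd is not installed in this environment")
    elif version_tuple(found) < version_tuple("1.0.43"):
        print(f"Tree2gd {found} is older than 1.0.43; consider upgrading")
    else:
        print(f"Tree2gd {found} is installed")
```

Comparing tuples rather than raw strings avoids mistakes such as treating "1.0.9" as newer than "1.0.43". The simple parser assumes purely numeric, dot-separated versions.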

ewcro commented 1 year ago

Thank you!


sjfleck commented 1 year ago

@Dee-chen I'm having the same issue. I should have version 1.0.43, but I'm still getting this error. I installed Tree2GD using:

module load anaconda3
pip3 install Tree2gd

and I'm running the same command twice: once with the --synteny option and once without. Both runs failed at step 4 (probably because of memory, or because I reached the file-count limit on the partition I was working on). I resubmitted the jobs with --step 456. They completed step 4 but then hit the error reported earlier by myBioFun:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/full/path/to/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/full/path/to/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/full/path/to/kaks.py", line 85, in run_ks
    sub_sh(args[0],args[1],args[2],args[3],args[4],args[5],args[6],args[7])
  File "/full/path/to/kaks.py", line 126, in sub_sh
    axt2oneline((pair[2]+"-"+pair[3]+".cds_aln.axt"),(pair[2]+"-"+pair[3]+".one-line"))
  File "/full/path/to/kaks.py", line 159, in axt2oneline
    mydict[header].append(line.strip())
UnboundLocalError: local variable 'header' referenced before assignment
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/full/path/to/Tree2gd", line 8, in <module>
    sys.exit(main())
  File "/full/path/to/tree2gd_main.py", line 271, in main
    run_kaks(sp_list,step1out,args,cf,step4out,step5out,gene_pairs_idmap)
  File "/full/path/to/kaks.py", line 64, in run_kaks
    sh_pool.map(run_ks,arg_list)
  File "/full/path/to/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/full/path/to/pool.py", line 771, in get
    raise self._value
UnboundLocalError: local variable 'header' referenced before assignment

Any help would be greatly appreciated. Thank you

andrzej-grz commented 1 year ago

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/full/path/to/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/full/path/to/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/full/path/to/kaks.py", line 85, in run_ks
    sub_sh(args[0],args[1],args[2],args[3],args[4],args[5],args[6],args[7])
  File "/full/path/to/kaks.py", line 126, in sub_sh
    axt2oneline((pair[2]+"-"+pair[3]+".cds_aln.axt"),(pair[2]+"-"+pair[3]+".one-line"))
  File "/full/path/to/kaks.py", line 159, in axt2oneline
    mydict[header].append(line.strip())
UnboundLocalError: local variable 'header' referenced before assignment
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/full/path/to/Tree2gd", line 8, in <module>
    sys.exit(main())
  File "/full/path/to/tree2gd_main.py", line 271, in main
    run_kaks(sp_list,step1out,args,cf,step4out,step5out,gene_pairs_idmap)
  File "/full/path/to/kaks.py", line 64, in run_kaks
    sh_pool.map(run_ks,arg_list)
  File "/full/path/to/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/full/path/to/pool.py", line 771, in get
    raise self._value
UnboundLocalError: local variable 'header' referenced before assignment

I'm seeing the same error, and I have the latest version.

sjfleck commented 1 year ago

@andrzej-grz if you end up finding a solution, please post it here. I'd still like to use this program in the future.