arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. This package is deprecated in favor of https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

NewickError: Unexisting tree file or Malformed newick tree structure. #14

Closed asher-616 closed 5 years ago

asher-616 commented 5 years ago

I installed wgd with all prerequisites. After preparing the mcl file, I ran the following command, but got an error somewhere downstream:

[ashermoshe@login-0-0 ~/dorothee]$ wgd ksd schlosseri.mcl Botryllus_schlosseri.fas

I would appreciate your help in understanding how to approach it.

2019-02-27 17:18:23: INFO
2019-02-27 17:18:23: INFO       codeml found
2019-02-27 17:18:23: INFO       MUSCLE v3.7 by Robert C. Edgar
2019-02-27 17:18:23: INFO
2019-02-27 17:18:23: WARNING    Output directory exists, will possibly overwrite
2019-02-27 17:18:24: INFO       Translating CDS file
100% (65587 of 65587) |##################################################################################################################| Elapsed Time: 0:00:17 Time:  0:00:17
2019-02-27 17:18:42: WARNING    There were 0 warnings during translation
2019-02-27 17:18:42: INFO       Started whole paranome Ks analysis
2019-02-27 17:18:42: WARNING    Filtered out the 5 largest gene families because n*(n-1)/2 > `max_pairwise`
2019-02-27 17:18:42: WARNING    If you want to analyse these large families anyhow, please raise the `max_pairwise` parameter.
2019-02-27 17:18:42: INFO       Started analysis in parallel (n_threads = 4)
2019-02-27 17:18:42: INFO       Performing analysis on gene family GF_000006
2019-02-27 17:18:43: INFO       Performing analysis on gene family GF_000007
2019-02-27 17:18:43: INFO       Performing analysis on gene family GF_000008
2019-02-27 17:18:43: INFO       Performing analysis on gene family GF_000009
2019-02-27 17:18:49: INFO       Performing analysis on gene family GF_000010
2019-02-27 17:19:12: INFO       Performing analysis on gene family GF_000011
2019-02-27 17:19:13: INFO       Performing analysis on gene family GF_000012
2019-02-27 17:21:05: INFO       Performing analysis on gene family GF_000013
2019-02-27 17:21:57: INFO       Performing analysis on gene family GF_000014
2019-02-27 17:22:08: INFO       Performing analysis on gene family GF_000015
2019-02-27 17:22:35: INFO       Performing analysis on gene family GF_000016
2019-02-27 17:23:03: INFO       Performing analysis on gene family GF_000017
2019-02-27 17:23:22: INFO       Performing analysis on gene family GF_000018
2019-02-27 17:23:50: INFO       Performing analysis on gene family GF_000019
2019-02-27 17:24:02: INFO       Performing analysis on gene family GF_000020
2019-02-27 17:26:30: INFO       Performing analysis on gene family GF_000021
2019-02-27 17:26:35: INFO       Performing analysis on gene family GF_000022
2019-02-27 17:26:40: INFO       Performing analysis on gene family GF_000023
2019-02-27 17:26:42: INFO       Performing analysis on gene family GF_000024
2019-02-27 17:26:50: INFO       Performing analysis on gene family GF_000025
2019-02-27 17:26:59: INFO       Performing analysis on gene family GF_000026
2019-02-27 17:39:17: INFO       Performing analysis on gene family GF_000027
2019-02-27 17:39:17: INFO       Performing analysis on gene family GF_000028
2019-02-27 17:40:25: INFO       Performing analysis on gene family GF_000029
2019-02-27 17:40:52: INFO       Performing analysis on gene family GF_000030
2019-02-27 17:41:04: INFO       Performing analysis on gene family GF_000031
2019-02-27 17:41:09: INFO       Performing analysis on gene family GF_000032
2019-02-27 17:48:02: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN77472_c4_g3::TRINITY_DN77472_c4_g3_i12::g.41868::m.41868 - Botryllus_schlosseri_TRINITY_DN77472_c4_g3::TRINITY_DN77472_c4_g3_i7::g.41841::m.41841!
2019-02-27 17:48:02: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82389_c1_g1::TRINITY_DN82389_c1_g1_i1::g.79015::m.79015 - Botryllus_schlosseri_TRINITY_DN82597_c2_g1::TRINITY_DN82597_c2_g1_i1::g.145711::m.145711!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82579_c2_g2::TRINITY_DN82579_c2_g2_i2::g.147899::m.147899 - Botryllus_schlosseri_TRINITY_DN82597_c2_g1::TRINITY_DN82597_c2_g1_i1::g.145711::m.145711!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82579_c2_g2::TRINITY_DN82579_c2_g2_i2::g.147899::m.147899 - Botryllus_schlosseri_TRINITY_DN82389_c1_g1::TRINITY_DN82389_c1_g1_i1::g.79015::m.79015!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN79293_c3_g3::TRINITY_DN79293_c3_g3_i1::g.306985::m.306985 - Botryllus_schlosseri_TRINITY_DN70319_c5_g1::TRINITY_DN70319_c5_g1_i6::g.86980::m.86980!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80015_c3_g1::TRINITY_DN80015_c3_g1_i11::g.69428::m.69428 - Botryllus_schlosseri_TRINITY_DN83038_c1_g1::TRINITY_DN83038_c1_g1_i3::g.203559::m.203559!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN72881_c1_g2::TRINITY_DN72881_c1_g2_i9::g.188509::m.188509 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179 - Botryllus_schlosseri_TRINITY_DN83038_c1_g1::TRINITY_DN83038_c1_g1_i3::g.203559::m.203559!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179 - Botryllus_schlosseri_TRINITY_DN80015_c3_g1::TRINITY_DN80015_c3_g1_i11::g.69428::m.69428!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65697_c0_g1::TRINITY_DN65697_c0_g1_i1::g.293174::m.293174 - Botryllus_schlosseri_TRINITY_DN83038_c1_g1::TRINITY_DN83038_c1_g1_i3::g.203559::m.203559!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65697_c0_g1::TRINITY_DN65697_c0_g1_i1::g.293174::m.293174 - Botryllus_schlosseri_TRINITY_DN80015_c3_g1::TRINITY_DN80015_c3_g1_i11::g.69428::m.69428!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65697_c0_g1::TRINITY_DN65697_c0_g1_i1::g.293174::m.293174 - Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i1::g.106604::m.106604 - Botryllus_schlosseri_TRINITY_DN79594_c0_g1::TRINITY_DN79594_c0_g1_i1::g.278650::m.278650!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i1::g.106604::m.106604 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i2::g.106605::m.106605 - Botryllus_schlosseri_TRINITY_DN79594_c0_g1::TRINITY_DN79594_c0_g1_i1::g.278650::m.278650!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i2::g.106605::m.106605 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i2::g.106605::m.106605 - Botryllus_schlosseri_TRINITY_DN65038_c0_g1::TRINITY_DN65038_c0_g1_i1::g.106604::m.106604!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83231_c0_g1::TRINITY_DN83231_c0_g1_i1::g.112228::m.112228 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83231_c0_g1::TRINITY_DN83231_c0_g1_i1::g.112228::m.112228 - Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN83038_c1_g1::TRINITY_DN83038_c1_g1_i3::g.203559::m.203559!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN80015_c3_g1::TRINITY_DN80015_c3_g1_i11::g.69428::m.69428!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN65697_c0_g1::TRINITY_DN65697_c0_g1_i1::g.293174::m.293174!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347 - Botryllus_schlosseri_TRINITY_DN83231_c0_g1::TRINITY_DN83231_c0_g1_i1::g.112228::m.112228!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83038_c0_g1::TRINITY_DN83038_c0_g1_i2::g.203556::m.203556 - Botryllus_schlosseri_TRINITY_DN76803_c3_g2::TRINITY_DN76803_c3_g2_i4::g.291848::m.291848!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83038_c0_g1::TRINITY_DN83038_c0_g1_i2::g.203556::m.203556 - Botryllus_schlosseri_TRINITY_DN82877_c1_g1::TRINITY_DN82877_c1_g1_i7::g.185179::m.185179!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83038_c0_g1::TRINITY_DN83038_c0_g1_i2::g.203556::m.203556 - Botryllus_schlosseri_TRINITY_DN83231_c0_g1::TRINITY_DN83231_c0_g1_i1::g.112228::m.112228!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN83038_c0_g1::TRINITY_DN83038_c0_g1_i2::g.203556::m.203556 - Botryllus_schlosseri_TRINITY_DN80868_c1_g1::TRINITY_DN80868_c1_g1_i17::g.220347::m.220347!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN81132_c0_g1::TRINITY_DN81132_c0_g1_i4::g.47004::m.47004 - Botryllus_schlosseri_TRINITY_DN77472_c4_g3::TRINITY_DN77472_c4_g3_i7::g.41841::m.41841!
2019-02-27 17:48:03: WARNING    No Ks value for Botryllus_schlosseri_TRINITY_DN81132_c0_g1::TRINITY_DN81132_c0_g1_i4::g.47004::m.47004 - Botryllus_schlosseri_TRINITY_DN77472_c4_g3::TRINITY_DN77472_c4_g3_i12::g.41868::m.41868!
2019-02-27 17:48:05: INFO       Performing analysis on gene family GF_000033
2019-02-27 17:51:44: INFO       Performing analysis on gene family GF_000034
2019-02-27 17:51:56: INFO       Performing analysis on gene family GF_000035
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 350, in __call__
    return self.func(*args, **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 303, in analyse_family
    results_dict, msa=msa_path_protein, method=method)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 98, in _weighting
    tree_path, pairwise_estimates['Ks'])
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/phy.py", line 123, in phylogenetic_tree_to_cluster_format
    t = Tree(tree)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/ete3/coretype/tree.py", line 211, in __init__
    quoted_names=quoted_node_names)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/ete3/parser/newick.py", line 249, in read_newick
    raise NewickError('Unexisting tree file or Malformed newick tree structure.')
ete3.parser.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 359, in __call__
    raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
NewickError                                        Wed Feb 27 18:10:41 2019
PID: 62472             Python 3.6.8: /share/apps/anaconda3-5.1.0/bin/python
...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function analyse_family>, ('GF_000012', {'Botryllus_schlosseri_TRINITY_DN47590_c0_g1::TRINITY_DN47590_c0_g1_i1::g.253656::m.253656': 'MGETKSTITNLYGHNNRAKRRTQLRELAIRTRTEKSHLIAGDFNGIDS...ARRQKDQLRSDEKEMQKVRNRVIQNQQYLSDYRKIKKKIWKEKAESLHK', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g1::TRINITY_DN50048_c0_g1_i1::g.248848::m.248848': 'GAVISIDFLKAYDSVDHSFLHNTLEEAGFGVKVRAFFKAIYQGGSAKV...SGMKGKIATPSYADDVTITLAKEEESTKALQIVAEFGKASGLQINRKKT', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g2::TRINITY_DN50048_c0_g2_i1::g.248849::m.248849': 'QKTTSQFARGIIKTIFKKGDKEDIRNYRPITILNVDYKIISKVITNRIQKVLPTITHRHQFINPPNTIGDLNLLLREVTSDMRERSRGA', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g3::TRINITY_DN50048_c0_g3_i1::g.248850::m.248850': 'AGDFNGIDDIELDRDPVNIRHDAADAKYSKRVMEVIGVTDAFRQVHGS...CNHMPCPFSDHGATTALVKLTDHRPRRPNTWKNNTKVYEMEAFETELEV', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g4::TRINITY_DN50048_c0_g4_i1::g.248851::m.248851': 'KTWADKATDLGKLQLEAKADAEEALGKRPHLLCEKIKVRRDAVSITAIKDATGKTTEDPEEIRETVEEFYQKLYSKRETDKCTANSFHRYQDAKLSTR', 'Botryllus_schlosseri_TRINITY_DN50200_c0_g1::TRINITY_DN50200_c0_g1_i1::g.61823::m.61823': 'GEDGLSSELYMVNLDLMKKELTEVYNEIYEAQGTTTSLGRAVLKIIHK...KNYRPISLLNSDYKILSKILTNRLKQALPSITHQHQHVNPPKTIGQINL', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g2::TRINITY_DN52423_c0_g2_i1::g.262149::m.262149': 'GKANISIGGKLGGNIRLGRGIKQGDPISMLLFTMATDPLLQRLNHDLD...DVNITLAHQADVNEALKIIQDFEEASALKLNKNKSKGITYHPKPPPGSK', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g3::TRINITY_DN52423_c0_g3_i1::g.262150::m.262150': 'PPPGSKNVLKWVQSMEVLGHVINRHPPNDHETWNGLANKAKDLMREIK...YVATLKKMPINTRRELETAVTELLFGKSMRPDYRKLIQRREAGGIGLVD', 'Botryllus_schlosseri_TRINITY_DN58171_c0_g1::TRINITY_DN58171_c0_g1_i1::g.196397::m.196397': 'LEYEIVEFLFGKGKRPEYKKLVQQEIAGGREVKDIPTITDIIFIKPAV...RTDHQLALTLGWLRERPINNSRPHTWQPRQHWAEMAKIMKEMEYKRDYI', 'Botryllus_schlosseri_TRINITY_DN61293_c1_g1::TRINITY_DN61293_c1_g1_i1::g.134659::m.134659': 
'AAFVMDLDGKIEYEKIIQARESGGLELVDIPTMTDLAFVKPALRYLQR...MKRYGMRKINNAIPHVFQPLQHWQETEKTMRSLGRQQQDIKSKRRERYR', ...}, {'Botryllus_schlosseri_TRINITY_DN10080_c0_g1::TRINITY_DN10080_c0_g1_i1::g.308497::m.308497': 'CTTAATTGTGCTTCATTTTTTAGCTCTCAAATGGCTCGATCAACGAAA...CATGGCGGAAGGCACGAAGGCCGTCGCCAAGCTCGCTGCCAGCAAATAA', 'Botryllus_schlosseri_TRINITY_DN10087_c0_g1::TRINITY_DN10087_c0_g1_i1::g.308496::m.308496': 'ATGGCGGTGGTGACACTGTTCTCGGTGGGGCATTCTAGGGAGGTGCGG...CTACCTCGACCCTCGGCTAGACCAACCATGGCCCCGCGTCGCGAAGGCC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g1::TRINITY_DN1013_c0_g1_i1::g.248880::m.248880': 'TGGGCAGTCCTGACTGGCACTATCAAGAATAACAGCAACATCCAATGC...GAAAAACTATGGGGACATTATTAACCCTGCTGATGGAATCTGTTTGACC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g2::TRINITY_DN1013_c0_g2_i1::g.248881::m.248881': 'CTGATTCTTCTCATCGGCTCTCTCTGCTTCACTCTGACCCACGGCCTT...TCTGAAGGCCTACGCGATTGCCAAGACGGCCAAAGACTCTTACAACTGG', 'Botryllus_schlosseri_TRINITY_DN10166_c0_g2::TRINITY_DN10166_c0_g2_i1::g.27418::m.27418': 'GTACATTTGAAGAAAATGTCTGAATTGATCCATCACGAAAAAGCTTTC...GGTCATCTCGAACAAAATGCACAGAACCATCGTGATCAGAAGAGACTAC', 'Botryllus_schlosseri_TRINITY_DN1017_c0_g1::TRINITY_DN1017_c0_g1_i1::g.248877::m.248877': 'TGCCATCTTCCGGCACTCATGGCCAAGATTGACGAGGTGCACCACTCG...CCAAGTTAAGACCTTGCCCACTGAGAAGCATACCCAGCCGGAGTGCTAG', 'Botryllus_schlosseri_TRINITY_DN101_c0_g1::TRINITY_DN101_c0_g1_i1::g.75736::m.75736': 'GAGCGATATCACGCTCGCTCGTTCAGCAAGCGCGAACGGCTGTATCGG...GAACCGGACCTTCGTCGATCTACCTGAGTTCATAGATGAAATGGATCCG', 'Botryllus_schlosseri_TRINITY_DN10203_c0_g1::TRINITY_DN10203_c0_g1_i1::g.234676::m.234676': 'GAAGGGCGGTTCGTGCCGCACCTGGTGGTCCGCATCGACGCCGGCACG...GGTCAACACAGATAAGGTATCGGCTACCACGGCCATCGAGGTCTTGCGC', 'Botryllus_schlosseri_TRINITY_DN10220_c0_g2::TRINITY_DN10220_c0_g2_i1::g.234678::m.234678': 'GATCATCGGTATGGACTACCCGTGGAAATCGAGATGACTACAATGGAC...TCGAGTGGTGCTAAATGCTGCTACTGATACATTTCGTGGTATACTAGAT', 'Botryllus_schlosseri_TRINITY_DN10249_c0_g1::TRINITY_DN10249_c0_g1_i1::g.234674::m.234674': 
'GAAGGACTCTCGTTGATCGAACGCCGCAACTGCATGGAGCTGTTCCGA...GGATCGTCTCGACAAGGTTTACGCCAGGCACAGCGACATGGTCCTGTTG', ...}, '/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862', 'codeml', False, 1, 100, 'fasttree', 'muscle', '/groups/pupko/ashermoshe/dorothee/wgd_ksd'), {})]
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function analyse_family>
        args = ('GF_000012', {'Botryllus_schlosseri_TRINITY_DN47590_c0_g1::TRINITY_DN47590_c0_g1_i1::g.253656::m.253656': 'MGETKSTITNLYGHNNRAKRRTQLRELAIRTRTEKSHLIAGDFNGIDS...ARRQKDQLRSDEKEMQKVRNRVIQNQQYLSDYRKIKKKIWKEKAESLHK', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g1::TRINITY_DN50048_c0_g1_i1::g.248848::m.248848': 'GAVISIDFLKAYDSVDHSFLHNTLEEAGFGVKVRAFFKAIYQGGSAKV...SGMKGKIATPSYADDVTITLAKEEESTKALQIVAEFGKASGLQINRKKT', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g2::TRINITY_DN50048_c0_g2_i1::g.248849::m.248849': 'QKTTSQFARGIIKTIFKKGDKEDIRNYRPITILNVDYKIISKVITNRIQKVLPTITHRHQFINPPNTIGDLNLLLREVTSDMRERSRGA', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g3::TRINITY_DN50048_c0_g3_i1::g.248850::m.248850': 'AGDFNGIDDIELDRDPVNIRHDAADAKYSKRVMEVIGVTDAFRQVHGS...CNHMPCPFSDHGATTALVKLTDHRPRRPNTWKNNTKVYEMEAFETELEV', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g4::TRINITY_DN50048_c0_g4_i1::g.248851::m.248851': 'KTWADKATDLGKLQLEAKADAEEALGKRPHLLCEKIKVRRDAVSITAIKDATGKTTEDPEEIRETVEEFYQKLYSKRETDKCTANSFHRYQDAKLSTR', 'Botryllus_schlosseri_TRINITY_DN50200_c0_g1::TRINITY_DN50200_c0_g1_i1::g.61823::m.61823': 'GEDGLSSELYMVNLDLMKKELTEVYNEIYEAQGTTTSLGRAVLKIIHK...KNYRPISLLNSDYKILSKILTNRLKQALPSITHQHQHVNPPKTIGQINL', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g2::TRINITY_DN52423_c0_g2_i1::g.262149::m.262149': 'GKANISIGGKLGGNIRLGRGIKQGDPISMLLFTMATDPLLQRLNHDLD...DVNITLAHQADVNEALKIIQDFEEASALKLNKNKSKGITYHPKPPPGSK', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g3::TRINITY_DN52423_c0_g3_i1::g.262150::m.262150': 'PPPGSKNVLKWVQSMEVLGHVINRHPPNDHETWNGLANKAKDLMREIK...YVATLKKMPINTRRELETAVTELLFGKSMRPDYRKLIQRREAGGIGLVD', 'Botryllus_schlosseri_TRINITY_DN58171_c0_g1::TRINITY_DN58171_c0_g1_i1::g.196397::m.196397': 'LEYEIVEFLFGKGKRPEYKKLVQQEIAGGREVKDIPTITDIIFIKPAV...RTDHQLALTLGWLRERPINNSRPHTWQPRQHWAEMAKIMKEMEYKRDYI', 'Botryllus_schlosseri_TRINITY_DN61293_c1_g1::TRINITY_DN61293_c1_g1_i1::g.134659::m.134659': 'AAFVMDLDGKIEYEKIIQARESGGLELVDIPTMTDLAFVKPALRYLQR...MKRYGMRKINNAIPHVFQPLQHWQETEKTMRSLGRQQQDIKSKRRERYR', ...}, 
{'Botryllus_schlosseri_TRINITY_DN10080_c0_g1::TRINITY_DN10080_c0_g1_i1::g.308497::m.308497': 'CTTAATTGTGCTTCATTTTTTAGCTCTCAAATGGCTCGATCAACGAAA...CATGGCGGAAGGCACGAAGGCCGTCGCCAAGCTCGCTGCCAGCAAATAA', 'Botryllus_schlosseri_TRINITY_DN10087_c0_g1::TRINITY_DN10087_c0_g1_i1::g.308496::m.308496': 'ATGGCGGTGGTGACACTGTTCTCGGTGGGGCATTCTAGGGAGGTGCGG...CTACCTCGACCCTCGGCTAGACCAACCATGGCCCCGCGTCGCGAAGGCC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g1::TRINITY_DN1013_c0_g1_i1::g.248880::m.248880': 'TGGGCAGTCCTGACTGGCACTATCAAGAATAACAGCAACATCCAATGC...GAAAAACTATGGGGACATTATTAACCCTGCTGATGGAATCTGTTTGACC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g2::TRINITY_DN1013_c0_g2_i1::g.248881::m.248881': 'CTGATTCTTCTCATCGGCTCTCTCTGCTTCACTCTGACCCACGGCCTT...TCTGAAGGCCTACGCGATTGCCAAGACGGCCAAAGACTCTTACAACTGG', 'Botryllus_schlosseri_TRINITY_DN10166_c0_g2::TRINITY_DN10166_c0_g2_i1::g.27418::m.27418': 'GTACATTTGAAGAAAATGTCTGAATTGATCCATCACGAAAAAGCTTTC...GGTCATCTCGAACAAAATGCACAGAACCATCGTGATCAGAAGAGACTAC', 'Botryllus_schlosseri_TRINITY_DN1017_c0_g1::TRINITY_DN1017_c0_g1_i1::g.248877::m.248877': 'TGCCATCTTCCGGCACTCATGGCCAAGATTGACGAGGTGCACCACTCG...CCAAGTTAAGACCTTGCCCACTGAGAAGCATACCCAGCCGGAGTGCTAG', 'Botryllus_schlosseri_TRINITY_DN101_c0_g1::TRINITY_DN101_c0_g1_i1::g.75736::m.75736': 'GAGCGATATCACGCTCGCTCGTTCAGCAAGCGCGAACGGCTGTATCGG...GAACCGGACCTTCGTCGATCTACCTGAGTTCATAGATGAAATGGATCCG', 'Botryllus_schlosseri_TRINITY_DN10203_c0_g1::TRINITY_DN10203_c0_g1_i1::g.234676::m.234676': 'GAAGGGCGGTTCGTGCCGCACCTGGTGGTCCGCATCGACGCCGGCACG...GGTCAACACAGATAAGGTATCGGCTACCACGGCCATCGAGGTCTTGCGC', 'Botryllus_schlosseri_TRINITY_DN10220_c0_g2::TRINITY_DN10220_c0_g2_i1::g.234678::m.234678': 'GATCATCGGTATGGACTACCCGTGGAAATCGAGATGACTACAATGGAC...TCGAGTGGTGCTAAATGCTGCTACTGATACATTTCGTGGTATACTAGAT', 'Botryllus_schlosseri_TRINITY_DN10249_c0_g1::TRINITY_DN10249_c0_g1_i1::g.234674::m.234674': 'GAAGGACTCTCGTTGATCGAACGCCGCAACTGCATGGAGCTGTTCCGA...GGATCGTCTCGACAAGGTTTACGCCAGGCACAGCGACATGGTCCTGTTG', ...}, 
'/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862', 'codeml', False, 1, 100, 'fasttree', 'muscle', '/groups/pupko/ashermoshe/dorothee/wgd_ksd')
        kwargs = {}
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py in analyse_family(family_id='GF_000012', family={'Botryllus_schlosseri_TRINITY_DN47590_c0_g1::TRINITY_DN47590_c0_g1_i1::g.253656::m.253656': 'MGETKSTITNLYGHNNRAKRRTQLRELAIRTRTEKSHLIAGDFNGIDS...ARRQKDQLRSDEKEMQKVRNRVIQNQQYLSDYRKIKKKIWKEKAESLHK', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g1::TRINITY_DN50048_c0_g1_i1::g.248848::m.248848': 'GAVISIDFLKAYDSVDHSFLHNTLEEAGFGVKVRAFFKAIYQGGSAKV...SGMKGKIATPSYADDVTITLAKEEESTKALQIVAEFGKASGLQINRKKT', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g2::TRINITY_DN50048_c0_g2_i1::g.248849::m.248849': 'QKTTSQFARGIIKTIFKKGDKEDIRNYRPITILNVDYKIISKVITNRIQKVLPTITHRHQFINPPNTIGDLNLLLREVTSDMRERSRGA', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g3::TRINITY_DN50048_c0_g3_i1::g.248850::m.248850': 'AGDFNGIDDIELDRDPVNIRHDAADAKYSKRVMEVIGVTDAFRQVHGS...CNHMPCPFSDHGATTALVKLTDHRPRRPNTWKNNTKVYEMEAFETELEV', 'Botryllus_schlosseri_TRINITY_DN50048_c0_g4::TRINITY_DN50048_c0_g4_i1::g.248851::m.248851': 'KTWADKATDLGKLQLEAKADAEEALGKRPHLLCEKIKVRRDAVSITAIKDATGKTTEDPEEIRETVEEFYQKLYSKRETDKCTANSFHRYQDAKLSTR', 'Botryllus_schlosseri_TRINITY_DN50200_c0_g1::TRINITY_DN50200_c0_g1_i1::g.61823::m.61823': 'GEDGLSSELYMVNLDLMKKELTEVYNEIYEAQGTTTSLGRAVLKIIHK...KNYRPISLLNSDYKILSKILTNRLKQALPSITHQHQHVNPPKTIGQINL', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g2::TRINITY_DN52423_c0_g2_i1::g.262149::m.262149': 'GKANISIGGKLGGNIRLGRGIKQGDPISMLLFTMATDPLLQRLNHDLD...DVNITLAHQADVNEALKIIQDFEEASALKLNKNKSKGITYHPKPPPGSK', 'Botryllus_schlosseri_TRINITY_DN52423_c0_g3::TRINITY_DN52423_c0_g3_i1::g.262150::m.262150': 'PPPGSKNVLKWVQSMEVLGHVINRHPPNDHETWNGLANKAKDLMREIK...YVATLKKMPINTRRELETAVTELLFGKSMRPDYRKLIQRREAGGIGLVD', 'Botryllus_schlosseri_TRINITY_DN58171_c0_g1::TRINITY_DN58171_c0_g1_i1::g.196397::m.196397': 'LEYEIVEFLFGKGKRPEYKKLVQQEIAGGREVKDIPTITDIIFIKPAV...RTDHQLALTLGWLRERPINNSRPHTWQPRQHWAEMAKIMKEMEYKRDYI', 'Botryllus_schlosseri_TRINITY_DN61293_c1_g1::TRINITY_DN61293_c1_g1_i1::g.134659::m.134659': 
'AAFVMDLDGKIEYEKIIQARESGGLELVDIPTMTDLAFVKPALRYLQR...MKRYGMRKINNAIPHVFQPLQHWQETEKTMRSLGRQQQDIKSKRRERYR', ...}, nucleotide={'Botryllus_schlosseri_TRINITY_DN10080_c0_g1::TRINITY_DN10080_c0_g1_i1::g.308497::m.308497': 'CTTAATTGTGCTTCATTTTTTAGCTCTCAAATGGCTCGATCAACGAAA...CATGGCGGAAGGCACGAAGGCCGTCGCCAAGCTCGCTGCCAGCAAATAA', 'Botryllus_schlosseri_TRINITY_DN10087_c0_g1::TRINITY_DN10087_c0_g1_i1::g.308496::m.308496': 'ATGGCGGTGGTGACACTGTTCTCGGTGGGGCATTCTAGGGAGGTGCGG...CTACCTCGACCCTCGGCTAGACCAACCATGGCCCCGCGTCGCGAAGGCC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g1::TRINITY_DN1013_c0_g1_i1::g.248880::m.248880': 'TGGGCAGTCCTGACTGGCACTATCAAGAATAACAGCAACATCCAATGC...GAAAAACTATGGGGACATTATTAACCCTGCTGATGGAATCTGTTTGACC', 'Botryllus_schlosseri_TRINITY_DN1013_c0_g2::TRINITY_DN1013_c0_g2_i1::g.248881::m.248881': 'CTGATTCTTCTCATCGGCTCTCTCTGCTTCACTCTGACCCACGGCCTT...TCTGAAGGCCTACGCGATTGCCAAGACGGCCAAAGACTCTTACAACTGG', 'Botryllus_schlosseri_TRINITY_DN10166_c0_g2::TRINITY_DN10166_c0_g2_i1::g.27418::m.27418': 'GTACATTTGAAGAAAATGTCTGAATTGATCCATCACGAAAAAGCTTTC...GGTCATCTCGAACAAAATGCACAGAACCATCGTGATCAGAAGAGACTAC', 'Botryllus_schlosseri_TRINITY_DN1017_c0_g1::TRINITY_DN1017_c0_g1_i1::g.248877::m.248877': 'TGCCATCTTCCGGCACTCATGGCCAAGATTGACGAGGTGCACCACTCG...CCAAGTTAAGACCTTGCCCACTGAGAAGCATACCCAGCCGGAGTGCTAG', 'Botryllus_schlosseri_TRINITY_DN101_c0_g1::TRINITY_DN101_c0_g1_i1::g.75736::m.75736': 'GAGCGATATCACGCTCGCTCGTTCAGCAAGCGCGAACGGCTGTATCGG...GAACCGGACCTTCGTCGATCTACCTGAGTTCATAGATGAAATGGATCCG', 'Botryllus_schlosseri_TRINITY_DN10203_c0_g1::TRINITY_DN10203_c0_g1_i1::g.234676::m.234676': 'GAAGGGCGGTTCGTGCCGCACCTGGTGGTCCGCATCGACGCCGGCACG...GGTCAACACAGATAAGGTATCGGCTACCACGGCCATCGAGGTCTTGCGC', 'Botryllus_schlosseri_TRINITY_DN10220_c0_g2::TRINITY_DN10220_c0_g2_i1::g.234678::m.234678': 'GATCATCGGTATGGACTACCCGTGGAAATCGAGATGACTACAATGGAC...TCGAGTGGTGCTAAATGCTGCTACTGATACATTTCGTGGTATACTAGAT', 'Botryllus_schlosseri_TRINITY_DN10249_c0_g1::TRINITY_DN10249_c0_g1_i1::g.234674::m.234674': 
'GAAGGACTCTCGTTGATCGAACGCCGCAACTGCATGGAGCTGTTCCGA...GGATCGTCTCGACAAGGTTTACGCCAGGCACAGCGACATGGTCCTGTTG', ...}, tmp='/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862', codeml=<wgd.codeml.Codeml object>, preserve=False, times=1, min_length=100, method='fasttree', aligner='muscle', output_dir='/groups/pupko/ashermoshe/dorothee/wgd_ksd')
    298         logging.debug("Distance will be in Ks units!")
    299         clustering, pairwise_distances, tree_path = _weighting(
    300                 results_dict, msa=msa_path_protein, method="alc")
    301     else:
    302         clustering, pairwise_distances, tree_path = _weighting(
--> 303                 results_dict, msa=msa_path_protein, method=method)
        results_dict = {'Ka':                                                 ...

[94 rows x 94 columns], 'Ks':                                                 ...

[94 rows x 94 columns], 'Omega':                                                 ...

[94 rows x 94 columns]}
        msa_path_protein = '/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa'
        method = 'fasttree'
    304     if clustering is not None:
    305         out = _calculate_weighted_ks(
    306                 clustering, results_dict, pairwise_distances, family_id
    307         )

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py in _weighting(pairwise_estimates={'Ka':                                                 ...      

[94 rows x 94 columns], 'Ks':                                                 ...

[94 rows x 94 columns], 'Omega':                                                 ...

[94 rows x 94 columns]}, msa='/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa', method='fasttree')
     93     elif method == 'fasttree':
     94         # FastTree tree construction
     95         logging.debug('Constructing phylogenetic tree with FastTree')
     96         tree_path = run_fasttree(msa)
     97         clustering, pairwise_distances = phylogenetic_tree_to_cluster_format(
---> 98                 tree_path, pairwise_estimates['Ks'])
        tree_path = '/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw'
        pairwise_estimates = {'Ka':                                                 ...

[94 rows x 94 columns], 'Ks':                                                 ...

[94 rows x 94 columns], 'Omega':                                                 ...

[94 rows x 94 columns]}
     99
    100     else:
    101         # Average linkage clustering based on Ks
    102         logging.debug('Performing average linkage clustering on Ks values.')

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw', pairwise_estimates=                                                ...

[94 rows x 94 columns])
    118         (only the index is used)
    119     :return: clustering data structure, pairwise distances dictionary
    120     """
    121     id_map = {
    122         pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123     t = Tree(tree)
        t = undefined
        tree = '/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw'
    124
    125     # midpoint rooting
    126     midpoint = t.get_midpoint_outgroup()
    127     if not midpoint:  # midpoint = None when their are only two leaves

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/ete3/coretype/tree.py in __init__(self=Tree node '' (-0x7ffff800877d65c4), newick='/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
    206
    207         # Initialize tree
    208         if newick is not None:
    209             self._dist = 0.0
    210             read_newick(newick, root_node = self, format=format,
--> 211                         quoted_names=quoted_node_names)
        quoted_node_names = False
    212
    213
    214     def __nonzero__(self):
    215         return True

...........................................................................
/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/ete3/parser/newick.py in read_newick(newick='/groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw', root_node=Tree node '' (-0x7ffff800877d65c4), format=0, quoted_names=False)
    244         nw = nw.strip()
    245         if not nw.startswith('(') and nw.endswith(';'):
    246             #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
    247             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    248         elif not nw.startswith('(') or not nw.endswith(';'):
--> 249             raise NewickError('Unexisting tree file or Malformed newick tree structure.')
    250         else:
    251             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    252
    253     else:

NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
___________________________________________________________________________
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
NewickError                                        Wed Feb 27 18:10:41 2019
PID: 62472             Python 3.6.8: /share/apps/anaconda3-5.1.0/bin/python
...........................................................................
arzwa commented 5 years ago

Hi, I shortened your error message to the relevant part (it's so awfully long because of the multiprocessing). There seems to be an error in the Newick tree for GF_000012. You can find it here: /groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw. Could you show me that file?
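For a quick first look at such a file, a small check along these lines may help. It is only a sketch that roughly mirrors the test visible in the ete3 traceback above (the tree text must end with `;`); the function name `diagnose_newick` and its return strings are made up for illustration and are not part of wgd or ete3.

```python
import os


def diagnose_newick(path):
    """Give a rough reason why a Newick file might fail to load.

    Hypothetical helper: it only checks for the file's existence, for
    empty content, and for the trailing ';' that the parser expects.
    """
    if not os.path.isfile(path):
        return "missing"
    with open(path) as handle:
        text = handle.read().strip()
    if not text:
        return "empty"
    if not text.endswith(";"):
        return "no trailing semicolon"
    return "ok"
```

Calling `diagnose_newick` on the `.nw` path reported in the traceback would distinguish a tree file that was never written from one that was written but is empty or truncated.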

Also, can I close the issue opened yesterday by the user dvory-tau, since it seems this is the same use case and you managed to fix the problems you (or someone else) had yesterday?

arzwa commented 5 years ago

Oh, I see what is probably causing the problem. Your gene (transcript) names contain colons (::). That will obviously mess up parsing a Newick file, since colons are used to specify branch lengths. Please replace the colons in the transcript identifiers in both the sequence file and the gene family file (e.g. sed 's/::/../g' Botryllus_schlosseri.fas > Botryllus_schlosseri_renamed.fas and sed 's/::/../g' schlosseri.mcl > schlosseri_renamed.mcl). Can you let me know if that fixes it?
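If sed is not convenient, the same renaming can be sketched in plain Python. The helper names `find_problem_ids` and `replace_in_file` are hypothetical, and the set of characters treated as problematic is an assumption based on the characters that are structural in the Newick format, not something wgd itself checks.

```python
# Characters with a structural meaning in Newick (assumed list).
NEWICK_RESERVED = set("():;,[]")


def find_problem_ids(fasta_path):
    """Return FASTA identifiers that contain Newick-structural characters."""
    bad = []
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            parts = line[1:].split()
            if parts and NEWICK_RESERVED & set(parts[0]):
                bad.append(parts[0])
    return bad


def replace_in_file(src, dst, old="::", new=".."):
    """Copy src to dst, replacing every occurrence of `old` with `new`."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(line.replace(old, new))
```

Running `replace_in_file` once on the CDS file and once on the MCL file keeps the identifiers consistent between the two, and `find_problem_ids` on the renamed FASTA file should then return an empty list.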

asher-616 commented 5 years ago

Thank you for your quick reply. I tried the suggestion, and I think it did fix the issue, but I now have a new error:

2019-03-06 14:40:21: INFO       Making results data frame
Traceback (most recent call last):
  File "/share/apps/anaconda3-5.1.0/bin/wgd", line 11, in <module>
    sys.exit(cli())
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 545, in ksd
    max_pairwise=max_pairwise
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 686, in ksd_
    max_pairwise=max_pairwise,
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 654, in ks_analysis_paranome
    results_frame = pd.concat([results_frame, df], sort=True)
TypeError: concat() got an unexpected keyword argument 'sort'
arzwa commented 5 years ago

Hi, I believe this might be an issue with the version of the pandas python package. Could you check your pandas version using pip freeze | grep pandas? I have pandas==0.24.1 and have no troubles.
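For context, the `sort` keyword of `pandas.concat` was introduced in pandas 0.23.0, so older installations raise exactly this TypeError. A minimal sketch of a version check follows; the function name `supports_concat_sort` is made up for illustration, and it assumes a plain dotted version string such as the one in `pandas.__version__`.

```python
def supports_concat_sort(version):
    """Return True if a pandas version string is at least 0.23.

    Only the first two dot-separated components are compared, so
    '0.24.1' and '1.0.3' pass while '0.22.0' does not.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (0, 23)
```

With pandas installed, `supports_concat_sort(pandas.__version__)` would tell you whether an upgrade is needed before re-running `wgd ksd`.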

arzwa commented 5 years ago

In the meantime I have updated the setup.py file to include package version numbers, so re-installing wgd should resolve most issues stemming from incompatible dependency versions.

arzwa commented 5 years ago

Hi, have these problems been resolved in the meantime?

asher-616 commented 5 years ago

Hi, the problems have been resolved. Thank you for the support.

-- Asher Moshe Pupko lab, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel

U201412486 commented 4 years ago

Hi, I get the error below when I run the command wgd ksd a.mcl a.cds. Can you give me some advice?

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}
     92 
     93     elif method == 'fasttree':
     94         # FastTree tree construction
     95         logging.debug('Constructing phylogenetic tree with FastTree')

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates=                     evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns])
    118         (only the index is used)
    119     :return: clustering data structure, pairwise distances dictionary
    120     """
    121     id_map = {
    122         pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123     t = Tree(tree)
        t = undefined
        tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
    124 
    125     # midpoint rooting
    126     midpoint = t.get_midpoint_outgroup()
    127     if not midpoint:  # midpoint = None when their are only two leaves

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
    206 
    207         # Initialize tree
    208         if newick is not None:
    209             self._dist = 0.0
    210             read_newick(newick, root_node = self, format=format,
--> 211                         quoted_names=quoted_node_names)
        quoted_node_names = False
    212 
    213 
    214     def __nonzero__(self):
    215         return True

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
    244         nw = nw.strip()
    245         if not nw.startswith('(') and nw.endswith(';'):
    246             #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
    247             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    248         elif not nw.startswith('(') or not nw.endswith(';'):
--> 249             raise NewickError('Unexisting tree file or Malformed newick tree structure.')
    250         else:
    251             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    252 
    253     else:

NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
___________________________________________________________________________
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
NewickError                                        Sat Apr  4 23:42:21 2020
PID: 27129             Python 3.7.3: /software/Python-3.7.3/sxh_configure/bin/python3
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function analyse_family>, ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 
'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd'), {})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function analyse_family>
        args = ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
        kwargs = {}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in analyse_family(family_id='GF_000026', family={'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, nucleotide={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 
'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, tmp='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', codeml=<wgd.codeml.Codeml object>, preserve=False, times=1, min_length=100, method='phyml', aligner='muscle', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
    300         logging.debug("Distance will be in Ks units!")
    301         clustering, pairwise_distances, tree_path = _weighting(
    302                 results_dict, msa=msa_path_protein, method="alc")
    303     else:
    304         clustering, pairwise_distances, tree_path = _weighting(
--> 305                 results_dict, msa=msa_path_protein, method=method)
        results_dict = {'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}
        msa_path_protein = '/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa'
        method = 'phyml'
    306     if clustering is not None:
    307         out = _calculate_weighted_ks(
    308                 clustering, results_dict, pairwise_distances, family_id
    309         )

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in _weighting(pairwise_estimates={'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}, msa='/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa', method='phyml')
     86     if method == 'phyml':
     87         # PhyML tree construction
     88         logging.debug('Constructing phylogenetic tree with PhyML')
     89         tree_path = run_phyml(msa)
     90         clustering, pairwise_distances = phylogenetic_tree_to_cluster_format(
---> 91                 tree_path, pairwise_estimates['Ks'])
        tree_path = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
        pairwise_estimates = {'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}
     92 
     93     elif method == 'fasttree':
     94         # FastTree tree construction
     95         logging.debug('Constructing phylogenetic tree with FastTree')

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates=                     evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns])
    118         (only the index is used)
    119     :return: clustering data structure, pairwise distances dictionary
    120     """
    121     id_map = {
    122         pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123     t = Tree(tree)
        t = undefined
        tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
    124 
    125     # midpoint rooting
    126     midpoint = t.get_midpoint_outgroup()
    127     if not midpoint:  # midpoint = None when their are only two leaves

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
    206 
    207         # Initialize tree
    208         if newick is not None:
    209             self._dist = 0.0
    210             read_newick(newick, root_node = self, format=format,
--> 211                         quoted_names=quoted_node_names)
        quoted_node_names = False
    212 
    213 
    214     def __nonzero__(self):
    215         return True

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
    244         nw = nw.strip()
    245         if not nw.startswith('(') and nw.endswith(';'):
    246             #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
    247             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    248         elif not nw.startswith('(') or not nw.endswith(';'):
--> 249             raise NewickError('Unexisting tree file or Malformed newick tree structure.')
    250         else:
    251             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    252 
    253     else:

NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
___________________________________________________________________________

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/software/wgd/wgd/wgd", line 11, in <module>
    load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
  File "/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 632, in ksd
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 773, in ksd_
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py", line 645, in ks_analysis_paranome
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 740, in retrieve
    raise exception
joblib.my_exceptions.JoblibNewickError: JoblibNewickError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/software/wgd/wgd/wgd in <module>()
      6 from pkg_resources import load_entry_point
      7 
      8 if __name__ == '__main__':
      9     sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
     10     sys.exit(
---> 11         load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
     12     )

...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in __call__(self=<click.core.Group object>, *args=(), **kwargs={})
    759             echo('Aborted!', file=sys.stderr)
    760             sys.exit(1)
    761 
    762     def __call__(self, *args, **kwargs):
    763         """Alias for :meth:`main`."""
--> 764         return self.main(*args, **kwargs)
        self.main = <bound method BaseCommand.main of <click.core.Group object>>
        args = ()
        kwargs = {}
    765 
    766 
    767 class Command(BaseCommand):
    768     """Commands are the basic building block of command line interfaces in

...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in main(self=<click.core.Group object>, args=['ksd', './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', './.longest.mRNA.cds.fasta', '-a', 'muscle', '-n', '20', '-w', 'phyml', '-o', './_longest_ksd'], prog_name='wgd', complete_var=None, standalone_mode=True, **extra={})
    712         _bashcomplete(self, prog_name, complete_var)
    713 
    714         try:
    715             try:
    716                 with self.make_context(prog_name, args, **extra) as ctx:
--> 717                     rv = self.invoke(ctx)
        rv = undefined
        self.invoke = <bound method MultiCommand.invoke of <click.core.Group object>>
        ctx = <click.core.Context object>
    718                     if not standalone_mode:
    719                         return rv
    720                     # it's not safe to `ctx.exit(rv)` here!
    721                     # note that `rv` may actually contain data like "1" which

...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(self=<click.core.Group object>, ctx=<click.core.Context object>)
   1132                 cmd_name, cmd, args = self.resolve_command(ctx, args)
   1133                 ctx.invoked_subcommand = cmd_name
   1134                 Command.invoke(self, ctx)
   1135                 sub_ctx = cmd.make_context(cmd_name, args, parent=ctx)
   1136                 with sub_ctx:
-> 1137                     return _process_result(sub_ctx.command.invoke(sub_ctx))
        _process_result = <function MultiCommand.invoke.<locals>._process_result>
        sub_ctx.command.invoke = <bound method Command.invoke of <click.core.Command object>>
        sub_ctx = <click.core.Context object>
   1138 
   1139         # In chain mode we create the contexts step by step, but after the
   1140         # base command has been invoked.  Because at that point we do not
   1141         # know the subcommands yet, the invoked subcommand attribute is

...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(self=<click.core.Command object>, ctx=<click.core.Context object>)
    951         """Given a context, this invokes the attached callback (if it exists)
    952         in the right way.
    953         """
    954         _maybe_show_deprecated_notice(self)
    955         if self.callback is not None:
--> 956             return ctx.invoke(self.callback, **ctx.params)
        ctx.invoke = <bound method Context.invoke of <click.core.Context object>>
        self.callback = <function ksd>
        ctx.params = {'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...}
    957 
    958 
    959 class MultiCommand(Command):
    960     """A multi command is the basic implementation of a command that

...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(*args=(), **kwargs={'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...})
    550                     kwargs[param.name] = param.get_default(ctx)
    551 
    552         args = args[2:]
    553         with augment_usage_errors(self):
    554             with ctx:
--> 555                 return callback(*args, **kwargs)
        callback = <function ksd>
        args = ()
        kwargs = {'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...}
    556 
    557     def forward(*args, **kwargs):
    558         """Similar to :meth:`invoke` but fills in default keyword
    559         arguments from the current context if the other command expects

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py in ksd(gene_families='./_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', sequences=('./.longest.mRNA.cds.fasta',), output_directory='./_longest_ksd', protein_sequences=None, tmp_dir=None, aligner='muscle', times=1, min_msa_length=100, n_threads=20, wm='phyml', pairwise=False, max_pairwise=10000, ignore_prefixes=False, one_v_one=False, preserve=False)
    627             tmp_dir, aligner, codeml='codeml',
    628             times=times, min_msa_length=min_msa_length,
    629             ignore_prefixes=ignore_prefixes, one_v_one=one_v_one,
    630             preserve=preserve, n_threads=n_threads,
    631             weighting_method=wm, pairwise=pairwise,
--> 632             max_pairwise=max_pairwise
        max_pairwise = 10000
    633     )
    634 
    635 
    636 def ksd_(

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py in ksd_(gene_families='/project/comparative_genomic_/wg..._blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', sequences=('./.longest.mRNA.cds.fasta',), output_directory='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd', protein_sequences=None, tmp_dir='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', aligner='muscle', codeml='codeml', times=1, min_msa_length=100, ignore_prefixes=False, one_v_one=False, pairwise=False, preserve=False, n_threads=20, weighting_method='phyml', max_pairwise=10000)
    768             ignore_prefixes=ignore_prefixes,
    769             n_threads=n_threads,
    770             min_length=min_msa_length,
    771             method=weighting_method,
    772             pairwise=pairwise,
--> 773             max_pairwise=max_pairwise,
        max_pairwise = 10000
    774         )
    775         results.round(5).to_csv(os.path.join(
    776             output_directory, '{}.ks.tsv'.format(base)), sep='\t')
    777 

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in ks_analysis_paranome(nucleotide_sequences={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, protein_sequences={'evm.model.Seq1.1': 'MWGLANFERSDISLSLFIKPVSWFCCFLTKLSAFWCSRRICYCSLGLP...HPSDNSWHRGTVIEVFEGSSVVSVALDDGKKKNLELGKQGIRFVSQKQK', 'evm.model.Seq1.10': 'MESKSGEGKVVCVTGASGFIASWLVKLLLQRGYIVNATVRNLKDTSKV...ENFEDGLPLTPHFQVSSERAKCLGVKFTSLELSVKDTVESLMEKNFLHI', 'evm.model.Seq1.100': 'MVLLVEKISHFLKNPNRLENHHHNSEALLASSLQGFRSDVSKILNKVL...VDEVVKELRRRLKDLEELLQTIGKKTNGLFSEVLAERGKFLDSLQHTRK', 'evm.model.Seq1.1000': 'MDEAKVVEAKEGTISVATAFAGHQEAVRDRDHKFLTQAVEEAYKGVES...IGFDDFIADALRGTGFYQKAQLEIKQADGKGALIAEQVFEKTKEKFPIY', 'evm.model.Seq1.1001': 
'MADKAVTIRTRKFMTNRLLARKQFVIDVLHPGRANVSKAELKEKLARM...KKYEPKYRLIRNGLDTKVEKSRKQMKERKNRAKKIRGVKKTKAGDAKKK', 'evm.model.Seq1.1002.2.5dee1a2d': 'MTAAPFLIESNLKYNPLLYTPNPIQYTRLLHNQKLTPSKLSKPTKLTV...YKLPTINGSGDLKEALQKIASIPSSRTLVSKRNGHQEALSFALLVAFNL', 'evm.model.Seq1.1003': 'MNVDQHGSSSRLYVSLKERIVKVQSAAANSSGAASPIIDEDLRESPIDNDIDEDPWKPPMDNDFPNNVQEMIWRILLPATIAMLMEQVLRVATL', 'evm.model.Seq1.1004': 'MNGLTHTEPEFSEFVEVDPTGRYGRYNEILGKGASKTVYRAFDEYEGI...SQRARKCEAIKGSPNVRDMVSTAKSFFTRTLLPNSLHRTTSLPVDAVDI', 'evm.model.Seq1.1005': 'MIPACFSIPHSEVSKTSSSPPSQVPQNLVTCIYQAHICGSPVYLTLTW...VSLLSSPSCSSVLQWAEESSECGRSSWSSMRSSEISEGFSLLLYAWRKD', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'MRVEISDDEGAEEPLVSNVDDVLKIIKSDYEKSYFVTGLFTSRIYAED...WRPLISVDGKTVYDLDEKLKIVKHVESWNISAFEAVGQILMPGLRSSGE', ...}, paralogs={'GF_000001': ['evm.model.Seq1.8294', 'evm.model.Seq3.747', 'evm.model.Seq11.759', 'evm.model.Seq3.973', 'evm.model.Seq6.3827', 'evm.model.Seq8.1981', 'evm.model.Seq7.3918', 'evm.model.Seq3.968', 'evm.model.Seq3.4625', 'evm.model.Seq7.4762', 'evm.model.Seq3.1299', 'evm.model.Seq6.1553', 'evm.model.Seq12.297', 'evm.model.Seq12.294', 'evm.model.Seq8.3726', 'evm.model.Seq12.2981', 'evm.model.Seq12.1584', 'evm.model.Seq12.2932', 'evm.model.Seq1.1275', 'evm.model.Seq9.3520', ...], 'GF_000002': ['evm.model.Seq10.2013', 'evm.model.Seq3.4200', 'evm.model.Seq9.2451', 'evm.model.Seq6.4919', 'evm.model.Seq6.2314', 'evm.model.Seq10.248', 'evm.model.Seq1.879', 'evm.model.Seq3.2753', 'evm.model.Seq5.3091', 'evm.model.Seq11.1052', 'evm.model.Seq9.321', 'evm.model.Seq2.1529', 'evm.model.Seq6.3951', 'evm.model.Seq3.3617', 'evm.model.Seq8.856', 'evm.model.Seq12.1958', 'evm.model.Seq11.845', 'evm.model.Seq5.2780', 'evm.model.Seq5.2776', 'evm.model.Seq6.1124', ...], 'GF_000003': ['evm.model.Seq5.2371', 'evm.model.Seq7.4835', 'evm.model.Seq4.444', 'evm.model.Seq4.968', 'evm.model.Seq9.2671', 'evm.model.Seq12.14', 'evm.model.Seq11.975', 'evm.model.Seq8.3209', 'evm.model.Seq3.2590', 'evm.model.Seq10.215', 
'evm.model.Seq7.1965', 'evm.model.Seq6.2477', 'evm.model.Seq3.1843', 'evm.model.Seq10.326', 'evm.model.Seq8.2926', 'evm.model.Seq9.3554', 'evm.model.Seq5.1790', 'evm.model.Seq10.214', 'evm.model.Seq1.5216', 'evm.model.Seq8.2800', ...], 'GF_000004': ['evm.model.Seq4.117', 'evm.model.Seq4.119', 'evm.model.Seq10.263', 'evm.model.Seq8.2836', 'evm.model.Seq8.3722', 'evm.model.Seq8.3275', 'evm.model.Seq2.3139', 'evm.model.Seq9.3433', 'evm.model.Seq10.997', 'evm.model.Seq10.222', 'evm.model.Seq5.3282', 'evm.model.Seq7.1126', 'evm.model.Seq3.2258', 'evm.model.Seq3.2392', 'evm.model.Seq3.2952', 'evm.model.Seq10.661', 'evm.model.Seq10.962', 'evm.model.Seq5.2181', 'evm.model.Seq8.4192', 'evm.model.Seq7.939', ...], 'GF_000005': ['evm.model.Seq4.142', 'evm.model.Seq12.2076', 'evm.model.Seq4.777', 'evm.model.Seq4.942', 'evm.model.Seq8.3587', 'evm.model.Seq2.3451', 'evm.model.Seq4.4280', 'evm.model.Seq5.3422', 'evm.model.Seq10.2642', 'evm.model.Seq8.1232', 'evm.model.Seq7.1986', 'evm.model.Seq7.4403', 'evm.model.Seq1.1175.1.5dee19bd', 'evm.model.Seq4.4315', 'evm.model.Seq8.4788', 'evm.model.Seq2.3446', 'evm.model.Seq10.1699', 'evm.model.Seq4.630', 'evm.model.Seq8.3738', 'evm.model.Seq8.3412', ...], 'GF_000006': ['evm.model.Seq4.1211', 'evm.model.Seq1.68', 'evm.model.Seq7.143', 'evm.model.Seq5.3097', 'evm.model.Seq9.2047', 'evm.model.Seq5.5039', 'evm.model.Seq12.351', 'evm.model.Seq2.3909', 'evm.model.Seq2.2985', 'evm.model.Seq1.1130', 'evm.model.Seq1.5300', 'evm.model.Seq7.5023', 'evm.model.Seq1.1113', 'evm.model.Seq1.507', 'evm.model.Seq6.2682', 'evm.model.Seq5.2469', 'evm.model.Seq5.1744', 'evm.model.Seq1.7216', 'evm.model.Seq1.6709', 'evm.model.Seq6.1256', ...], 'GF_000007': ['evm.model.Seq8.4514', 'evm.model.Seq3.876', 'evm.model.Seq1.3868', 'evm.model.Seq8.2284', 'evm.model.Seq6.3859', 'evm.model.Seq11.1873', 'evm.model.Seq10.935', 'evm.model.Seq10.941', 'evm.model.Seq5.1583', 'evm.model.Seq5.1427', 'evm.model.Seq11.1718', 'evm.model.Seq4.2019', 'evm.model.Seq11.1270', 
'evm.model.Seq5.2502', 'evm.model.Seq1.3832', 'evm.model.Seq6.4171', 'evm.model.Seq3.793', 'evm.model.Seq9.3163', 'evm.model.Seq3.1484', 'evm.model.Seq2.2035', ...], 'GF_000008': ['evm.model.Seq2.6037', 'evm.model.Seq2.5984', 'evm.model.Seq8.5170', 'evm.model.Seq2.2960', 'evm.model.Seq5.975', 'evm.model.Seq12.2519', 'evm.model.Seq11.2174', 'evm.model.Seq9.2679', 'evm.model.Seq2.6035', 'evm.model.Seq9.4017', 'evm.model.Seq8.5167', 'evm.model.Seq3.200', 'evm.model.Seq7.3340', 'evm.model.Seq12.2574', 'evm.model.Seq12.3595', 'evm.model.Seq5.930', 'evm.model.Seq12.3116', 'evm.model.Seq12.3207', 'evm.model.Seq12.902', 'evm.model.Seq7.6456', ...], 'GF_000009': ['evm.model.Seq12.647', 'evm.model.Seq9.2418', 'evm.model.Seq12.633', 'evm.model.Seq6.2786', 'evm.model.Seq9.2640', 'evm.model.Seq9.2435', 'evm.model.Seq2.2515', 'evm.model.Seq12.130', 'evm.model.Seq11.416', 'evm.model.Seq9.4139', 'evm.model.Seq9.4099', 'evm.model.Seq11.216', 'evm.model.Seq4.4661', 'evm.model.Seq9.3589', 'evm.model.Seq10.477', 'evm.model.Seq1.474', 'evm.model.Seq6.4340', 'evm.model.Seq6.3176', 'evm.model.Seq3.1058', 'evm.model.Seq12.322', ...], 'GF_000010': ['evm.model.Seq8.142', 'evm.model.Seq6.655', 'evm.model.Seq1.1336', 'evm.model.Seq1.8641', 'evm.model.Seq1.736', 'evm.model.Seq6.4210', 'evm.model.Seq1.4691', 'evm.model.Seq3.4496', 'evm.model.Seq10.2951', 'evm.model.Seq1.5450', 'evm.model.Seq9.4437', 'evm.model.Seq4.730', 'evm.model.Seq8.2624', 'evm.model.Seq1.486', 'evm.model.Seq12.504.1.5dee19aa', 'evm.model.Seq2.2930', 'evm.model.Seq2.3790', 'evm.model.Seq1.5849', 'evm.model.Seq7.3293', 'evm.model.Seq4.4437', ...], ...}, tmp_dir='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd', codeml_path='codeml', preserve=False, times=1, ignore_prefixes=False, n_threads=20, min_length=100, method='phyml', aligner='muscle', pairwise=False, max_pairwise=10000)
    640 
    641     Parallel(n_jobs=n_threads)(delayed(analysis_function)(
    642             family[0], protein[family[0]], nucleotide_sequences, tmp_dir,
    643             codeml_path, preserve, times, min_length, method, aligner,
    644             output_dir
--> 645         ) for family in sorted_families)
        sorted_families = [('GF_000017', 140), ('GF_000018', 137), ('GF_000019', 136), ('GF_000020', 132), ('GF_000021', 129), ('GF_000022', 127), ('GF_000023', 126), ('GF_000024', 121), ('GF_000025', 119), ('GF_000026', 119), ('GF_000027', 118), ('GF_000028', 116), ('GF_000029', 116), ('GF_000030', 115), ('GF_000031', 114), ('GF_000032', 114), ('GF_000033', 111), ('GF_000034', 108), ('GF_000035', 105), ('GF_000036', 102), ...]
    646     logging.info('Analysis done')
    647 
    648     logging.info('Making results data frame')
    649     results_frame = pd.DataFrame(

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=Parallel(n_jobs=20), iterable=<generator object ks_analysis_paranome.<locals>.<genexpr>>)
    784             if pre_dispatch == "all" or n_jobs == 1:
    785                 # The iterable was consumed all at once by the above for loop.
    786                 # No need to wait for async callbacks to trigger to
    787                 # consumption.
    788                 self._iterating = False
--> 789             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=20)>
    790             # Make sure that we get a last message telling us we are done
    791             elapsed_time = time.time() - self._start_time
    792             self._print('Done %3i out of %3i | elapsed: %s finished',
    793                         (len(self._output), len(self._output),

---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
NewickError                                        Sat Apr  4 23:42:21 2020
PID: 27129Python 3.7.3: /software/Python-3.7.3/sxh_configure/bin/python3
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function analyse_family>, ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 
'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd'), {})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function analyse_family>
        args = ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
        kwargs = {}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in analyse_family(family_id='GF_000026', family={'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, nucleotide={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 
'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, tmp='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', codeml=<wgd.codeml.Codeml object>, preserve=False, times=1, min_length=100, method='phyml', aligner='muscle', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
    300         logging.debug("Distance will be in Ks units!")
    301         clustering, pairwise_distances, tree_path = _weighting(
    302                 results_dict, msa=msa_path_protein, method="alc")
    303     else:
    304         clustering, pairwise_distances, tree_path = _weighting(
--> 305                 results_dict, msa=msa_path_protein, method=method)
        results_dict = {'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}
        msa_path_protein = '/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa'
        method = 'phyml'
    306     if clustering is not None:
    307         out = _calculate_weighted_ks(
    308                 clustering, results_dict, pairwise_distances, family_id
    309         )

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in _weighting(pairwise_estimates={'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}, msa='/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa', method='phyml')
     86     if method == 'phyml':
     87         # PhyML tree construction
     88         logging.debug('Constructing phylogenetic tree with PhyML')
     89         tree_path = run_phyml(msa)
     90         clustering, pairwise_distances = phylogenetic_tree_to_cluster_format(
---> 91                 tree_path, pairwise_estimates['Ks'])
        tree_path = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
        pairwise_estimates = {'Ka':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Ks':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns], 'Omega':                      evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns]}
     92 
     93     elif method == 'fasttree':
     94         # FastTree tree construction
     95         logging.debug('Constructing phylogenetic tree with FastTree')

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates=                     evm.model.Seq1.1226  ...  e......              0.0000

[119 rows x 119 columns])
    118         (only the index is used)
    119     :return: clustering data structure, pairwise distances dictionary
    120     """
    121     id_map = {
    122         pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123     t = Tree(tree)
        t = undefined
        tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
    124 
    125     # midpoint rooting
    126     midpoint = t.get_midpoint_outgroup()
    127     if not midpoint:  # midpoint = None when their are only two leaves

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
    206 
    207         # Initialize tree
    208         if newick is not None:
    209             self._dist = 0.0
    210             read_newick(newick, root_node = self, format=format,
--> 211                         quoted_names=quoted_node_names)
        quoted_node_names = False
    212 
    213 
    214     def __nonzero__(self):
    215         return True

...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
    244         nw = nw.strip()
    245         if not nw.startswith('(') and nw.endswith(';'):
    246             #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
    247             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    248         elif not nw.startswith('(') or not nw.endswith(';'):
--> 249             raise NewickError('Unexisting tree file or Malformed newick tree structure.')
    250         else:
    251             return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
    252 
    253     else:

NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
arzwa commented 4 years ago

Hi, can you locate the file /project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw and report what is in there (if you find the file)? Also, do you get the other files for this gene family (all files starting with /project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026)?

U201412486 commented 4 years ago

Hi, I found the file named GF_000026.fasta.msa.nw. The file is empty. I do get the other files for this gene family, as listed below.

ks_tmp.3858b1da53b934]$ ls GF_000026*
GF_000026.codeml GF_000026.fasta GF_000026.fasta.msa GF_000026.fasta.msa.nuc GF_000026.fasta.msa.nw

All the files named *.nw are empty.
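For anyone checking the same thing across many gene families, here is a minimal sketch that scans a temporary directory for tree files and sorts them into empty, unterminated, and plausible. The function name `classify_tree_files` and the `*.nw` glob are assumptions made for illustration, not part of wgd. The "ends with `;`" test is a simplified reading of the check visible in the `read_newick` frame of the traceback above, so treat it as a rough filter rather than a full Newick validation.

```python
from pathlib import Path


def classify_tree_files(tmp_dir, pattern="*.nw"):
    """Group tree files in tmp_dir by whether they could be parsed as Newick.

    Hypothetical helper: the name and the default glob are illustrative only.
    """
    report = {"empty": [], "no_terminator": [], "plausible": []}
    for path in sorted(Path(tmp_dir).glob(pattern)):
        text = path.read_text().strip()
        if not text:
            # An empty file means the tree builder wrote no tree at all.
            report["empty"].append(path.name)
        elif not text.endswith(";"):
            # A Newick string is terminated by a semicolon.
            report["no_terminator"].append(path.name)
        else:
            report["plausible"].append(path.name)
    return report
```

Calling `classify_tree_files("ks_tmp.3858b1da53b934")` from the working directory returns a dictionary of file names. If every file lands under `"empty"`, that suggests the tree-building step itself failed, rather than one problematic gene family.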

arzwa commented 4 years ago

You mean for the other families the .nw files are also empty? Could you paste an example alignment from one of these families here (e.g. GF_000026.fasta.msa)?

U201412486 commented 4 years ago

Sorry for the late reply. When I tried to run the code on another computer, it reported a different error.

2020-04-11 21:29:14: INFO Started analysis in parallel (n_threads = 48)
Traceback (most recent call last):
  File "software/wgd/wgd/wgd", line 11, in <module>
    load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
  File ".local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File ".local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File ".local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ".local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ".local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 632, in ksd
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 773, in ksd_
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py", line 645, in ks_analysis_paranome
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 749, in __call__
    n_jobs = self._initialize_backend()
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 547, in _initialize_backend
    **self._backend_args)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/_parallel_backends.py", line 317, in configure
    self._pool = MemmapingPool(n_jobs, **backend_args)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/pool.py", line 600, in __init__
    super(MemmapingPool, self).__init__(**poolargs)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/pool.py", line 420, in __init__
    super(PicklingPool, self).__init__(**poolargs)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 176, in __init__
    self._repopulate_pool()
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 241, in _repopulate_pool
    w.start()
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

The run succeeds on the example data. Is this due to my large inputs? Thanks,

arzwa commented 4 years ago

Not sure. Have you tried using fewer threads? 48 sounds like a lot. Was your previous issue solved?
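The traceback above fails inside `os.fork()` while the worker pool is being created, so one plausible reading is that the machine could not provide memory for 48 forked workers of a process that already holds the full sequence dictionaries. The sketch below shows one way to pick a more conservative worker count before launching. Both function names and the per-worker memory figure are assumptions for illustration; wgd does not provide them, and the real footprint per worker has to be measured or guessed for a given dataset.

```python
import os


def available_memory_bytes():
    """Best-effort estimate of free physical memory, or None if unsupported."""
    try:
        return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are not available on every platform.
        return None


def safe_thread_count(requested, available_bytes, per_worker_bytes):
    """Cap the requested worker count by how many workers fit in memory.

    per_worker_bytes is a user-supplied guess (hypothetical), not a value
    reported by wgd or joblib.
    """
    if available_bytes is None or per_worker_bytes <= 0:
        # No memory information: fall back to the CPU count as a ceiling.
        return max(1, min(requested, os.cpu_count() or 1))
    fits = available_bytes // per_worker_bytes
    return max(1, min(requested, fits))
```

For example, `safe_thread_count(48, available_memory_bytes(), 2 * 1024**3)` caps the count assuming roughly 2 GiB per worker. The result can then be passed to the run as its thread count in place of 48.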