Closed asher-616 closed 5 years ago
Hi, I shortened your error message to the relevant part (it's so awfully long because of the multiprocessing). There seems to be an error in the newick tree for GF000012. Could you show me that file? You can find it here: /groups/pupko/ashermoshe/dorothee/ks_tmp.371cd0d9072862/GF_000012.fasta.msa.nw
Also, can I close the issue opened yesterday by the user dvory-tau, since it seems this is the same use case and you managed to fix the problems you (or someone else) had yesterday?
Oh, I see what is probably causing the problem. Your gene (transcript) names contain colons (::). That will obviously mess up parsing a newick file, since colons are used to specify branch lengths... Please replace the colons in the transcript identifiers (e.g. sed 's/::/../g' Botryllus_schlosseri.fas > Botryllus_schlosseri_renamed.fas and sed 's/::/../g' schlosseri.mcl.fas > schlosseri_renamed.mcl). Can you let me know if that fixes it?
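For anyone who prefers to do the renaming in Python, here is a minimal sketch of the same idea. The function names and the toy records are made up for illustration and are not part of wgd. Unlike the sed commands above, it only touches FASTA header lines, and it includes a small check for characters that Newick reserves for tree structure. The gene family (MCL) file would still need the same substitution applied to every line.

```python
# Hypothetical helpers, not part of wgd: rename identifiers so that they
# no longer contain characters with a structural meaning in Newick.

NEWICK_RESERVED = "():;,[]"


def sanitize_fasta_ids(lines, old="::", new=".."):
    """Replace `old` with `new` in FASTA header lines only.

    Sequence lines are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            yield line.replace(old, new)
        else:
            yield line


def newick_label_is_safe(label):
    """Return True if the label contains no Newick structural characters."""
    return not any(ch in label for ch in NEWICK_RESERVED)


# Toy input standing in for a real FASTA file (made-up identifiers).
fasta = [">g1::t1\n", "ATGC\n", ">g2::t1\n", "ATGA\n"]
cleaned = list(sanitize_fasta_ids(fasta))

for line in cleaned:
    if line.startswith(">"):
        label = line[1:].strip()
        print(label, "safe" if newick_label_is_safe(label) else "unsafe")
```

To run this on real files, the same generator can be fed an open file handle and its output written to a new file, so the original stays untouched.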
Thank you for your quick reply. I tried the suggestion, and I think it did fix the issue, but I now have a new error:
2019-03-06 14:40:21: INFO Making results data frame
Traceback (most recent call last):
File "/share/apps/anaconda3-5.1.0/bin/wgd", line 11, in <module>
sys.exit(cli())
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 545, in ksd
max_pairwise=max_pairwise
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 686, in ksd_
max_pairwise=max_pairwise,
File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 654, in ks_analysis_paranome
results_frame = pd.concat([results_frame, df], sort=True)
TypeError: concat() got an unexpected keyword argument 'sort'
Hi, I believe this might be an issue with the version of the pandas python package. Could you check your pandas version using pip freeze | grep pandas? I have pandas==0.24.1 and have no troubles.
In the meantime I have updated the setup.py file to include package version numbers, so re-installing wgd should resolve most issues stemming from incompatible dependency versions.
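As a rough illustration of the version check: the sort keyword of pandas.concat was introduced in pandas 0.23.0, so older installations raise the TypeError shown above. The sketch below compares a version string against that threshold. The helper names are hypothetical, and the parsing is deliberately simple (it ignores pre-release suffixes), so treat it as an assumption-laden sketch and not a general version parser.

```python
# Hypothetical helpers for illustration; not part of wgd or pandas.

def parse_version(version):
    """Turn a string such as '0.24.1' into a tuple of ints, e.g. (0, 24, 1).

    Parsing stops at the first component that does not start with a digit;
    trailing non-digit characters in a component (e.g. 'rc1') are dropped.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def concat_accepts_sort(version):
    """True if this pandas version is 0.23 or newer (sort keyword available)."""
    return parse_version(version) >= (0, 23)


try:
    import pandas as pd  # optional: only used to report the local version
    print("installed pandas:", pd.__version__,
          "-> sort supported:", concat_accepts_sort(pd.__version__))
except ImportError:
    print("pandas is not installed in this environment")
```

If the check reports an older version, upgrading pandas (or re-installing wgd with the pinned dependencies) should make the concat call work.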
Hi, have these problems been resolved in the meantime?
Hi, the problems have been resolved. Thank you for the support.
-- Asher Moshe Pupko lab, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
Hi, I get the error below when I run the command wgd ksd a.mcl a.cds. Can you give me some advice?
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}
92
93 elif method == 'fasttree':
94 # FastTree tree construction
95 logging.debug('Constructing phylogenetic tree with FastTree')
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates= evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns])
118 (only the index is used)
119 :return: clustering data structure, pairwise distances dictionary
120 """
121 id_map = {
122 pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123 t = Tree(tree)
t = undefined
tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
124
125 # midpoint rooting
126 midpoint = t.get_midpoint_outgroup()
127 if not midpoint: # midpoint = None when their are only two leaves
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
206
207 # Initialize tree
208 if newick is not None:
209 self._dist = 0.0
210 read_newick(newick, root_node = self, format=format,
--> 211 quoted_names=quoted_node_names)
quoted_node_names = False
212
213
214 def __nonzero__(self):
215 return True
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
244 nw = nw.strip()
245 if not nw.startswith('(') and nw.endswith(';'):
246 #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
247 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
248 elif not nw.startswith('(') or not nw.endswith(';'):
--> 249 raise NewickError('Unexisting tree file or Malformed newick tree structure.')
250 else:
251 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
252
253 else:
NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
___________________________________________________________________________
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 699, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
NewickError Sat Apr 4 23:42:21 2020
PID: 27129Python 3.7.3: /software/Python-3.7.3/sxh_configure/bin/python3
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
self.items = [(<function analyse_family>, ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd'), {})]
132
133 def __len__(self):
134 return self._size
135
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
func = <function analyse_family>
args = ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
kwargs = {}
132
133 def __len__(self):
134 return self._size
135
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in analyse_family(family_id='GF_000026', family={'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, nucleotide={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 
'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, tmp='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', codeml=<wgd.codeml.Codeml object>, preserve=False, times=1, min_length=100, method='phyml', aligner='muscle', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
300 logging.debug("Distance will be in Ks units!")
301 clustering, pairwise_distances, tree_path = _weighting(
302 results_dict, msa=msa_path_protein, method="alc")
303 else:
304 clustering, pairwise_distances, tree_path = _weighting(
--> 305 results_dict, msa=msa_path_protein, method=method)
results_dict = {'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}
msa_path_protein = '/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa'
method = 'phyml'
306 if clustering is not None:
307 out = _calculate_weighted_ks(
308 clustering, results_dict, pairwise_distances, family_id
309 )
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in _weighting(pairwise_estimates={'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}, msa='/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa', method='phyml')
86 if method == 'phyml':
87 # PhyML tree construction
88 logging.debug('Constructing phylogenetic tree with PhyML')
89 tree_path = run_phyml(msa)
90 clustering, pairwise_distances = phylogenetic_tree_to_cluster_format(
---> 91 tree_path, pairwise_estimates['Ks'])
tree_path = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
pairwise_estimates = {'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}
92
93 elif method == 'fasttree':
94 # FastTree tree construction
95 logging.debug('Constructing phylogenetic tree with FastTree')
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates= evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns])
118 (only the index is used)
119 :return: clustering data structure, pairwise distances dictionary
120 """
121 id_map = {
122 pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123 t = Tree(tree)
t = undefined
tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
124
125 # midpoint rooting
126 midpoint = t.get_midpoint_outgroup()
127 if not midpoint: # midpoint = None when their are only two leaves
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
206
207 # Initialize tree
208 if newick is not None:
209 self._dist = 0.0
210 read_newick(newick, root_node = self, format=format,
--> 211 quoted_names=quoted_node_names)
quoted_node_names = False
212
213
214 def __nonzero__(self):
215 return True
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
244 nw = nw.strip()
245 if not nw.startswith('(') and nw.endswith(';'):
246 #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
247 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
248 elif not nw.startswith('(') or not nw.endswith(';'):
--> 249 raise NewickError('Unexisting tree file or Malformed newick tree structure.')
250 else:
251 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
252
253 else:
NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
___________________________________________________________________________
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/software/wgd/wgd/wgd", line 11, in <module>
load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
File "/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 632, in ksd
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 773, in ksd_
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py", line 645, in ks_analysis_paranome
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 789, in __call__
self.retrieve()
File "/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 740, in retrieve
raise exception
joblib.my_exceptions.JoblibNewickError: JoblibNewickError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/software/wgd/wgd/wgd in <module>()
6 from pkg_resources import load_entry_point
7
8 if __name__ == '__main__':
9 sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
10 sys.exit(
---> 11 load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
12 )
...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in __call__(self=<click.core.Group object>, *args=(), **kwargs={})
759 echo('Aborted!', file=sys.stderr)
760 sys.exit(1)
761
762 def __call__(self, *args, **kwargs):
763 """Alias for :meth:`main`."""
--> 764 return self.main(*args, **kwargs)
self.main = <bound method BaseCommand.main of <click.core.Group object>>
args = ()
kwargs = {}
765
766
767 class Command(BaseCommand):
768 """Commands are the basic building block of command line interfaces in
...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in main(self=<click.core.Group object>, args=['ksd', './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', './.longest.mRNA.cds.fasta', '-a', 'muscle', '-n', '20', '-w', 'phyml', '-o', './_longest_ksd'], prog_name='wgd', complete_var=None, standalone_mode=True, **extra={})
712 _bashcomplete(self, prog_name, complete_var)
713
714 try:
715 try:
716 with self.make_context(prog_name, args, **extra) as ctx:
--> 717 rv = self.invoke(ctx)
rv = undefined
self.invoke = <bound method MultiCommand.invoke of <click.core.Group object>>
ctx = <click.core.Context object>
718 if not standalone_mode:
719 return rv
720 # it's not safe to `ctx.exit(rv)` here!
721 # note that `rv` may actually contain data like "1" which
...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(self=<click.core.Group object>, ctx=<click.core.Context object>)
1132 cmd_name, cmd, args = self.resolve_command(ctx, args)
1133 ctx.invoked_subcommand = cmd_name
1134 Command.invoke(self, ctx)
1135 sub_ctx = cmd.make_context(cmd_name, args, parent=ctx)
1136 with sub_ctx:
-> 1137 return _process_result(sub_ctx.command.invoke(sub_ctx))
_process_result = <function MultiCommand.invoke.<locals>._process_result>
sub_ctx.command.invoke = <bound method Command.invoke of <click.core.Command object>>
sub_ctx = <click.core.Context object>
1138
1139 # In chain mode we create the contexts step by step, but after the
1140 # base command has been invoked. Because at that point we do not
1141 # know the subcommands yet, the invoked subcommand attribute is
...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(self=<click.core.Command object>, ctx=<click.core.Context object>)
951 """Given a context, this invokes the attached callback (if it exists)
952 in the right way.
953 """
954 _maybe_show_deprecated_notice(self)
955 if self.callback is not None:
--> 956 return ctx.invoke(self.callback, **ctx.params)
ctx.invoke = <bound method Context.invoke of <click.core.Context object>>
self.callback = <function ksd>
ctx.params = {'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...}
957
958
959 class MultiCommand(Command):
960 """A multi command is the basic implementation of a command that
...........................................................................
/.local/lib/python3.7/site-packages/click/core.py in invoke(*args=(), **kwargs={'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...})
550 kwargs[param.name] = param.get_default(ctx)
551
552 args = args[2:]
553 with augment_usage_errors(self):
554 with ctx:
--> 555 return callback(*args, **kwargs)
callback = <function ksd>
args = ()
kwargs = {'aligner': 'muscle', 'gene_families': './_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', 'ignore_prefixes': False, 'max_pairwise': 10000, 'min_msa_length': 100, 'n_threads': 20, 'one_v_one': False, 'output_directory': './_longest_ksd', 'pairwise': False, 'preserve': False, ...}
556
557 def forward(*args, **kwargs):
558 """Similar to :meth:`invoke` but fills in default keyword
559 arguments from the current context if the other command expects
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py in ksd(gene_families='./_longest_blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', sequences=('./.longest.mRNA.cds.fasta',), output_directory='./_longest_ksd', protein_sequences=None, tmp_dir=None, aligner='muscle', times=1, min_msa_length=100, n_threads=20, wm='phyml', pairwise=False, max_pairwise=10000, ignore_prefixes=False, one_v_one=False, preserve=False)
627 tmp_dir, aligner, codeml='codeml',
628 times=times, min_msa_length=min_msa_length,
629 ignore_prefixes=ignore_prefixes, one_v_one=one_v_one,
630 preserve=preserve, n_threads=n_threads,
631 weighting_method=wm, pairwise=pairwise,
--> 632 max_pairwise=max_pairwise
max_pairwise = 10000
633 )
634
635
636 def ksd_(
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py in ksd_(gene_families='/project/comparative_genomic_/wg..._blast/.longest.mRNA.cds.fasta.blast.tsv.mcl', sequences=('./.longest.mRNA.cds.fasta',), output_directory='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd', protein_sequences=None, tmp_dir='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', aligner='muscle', codeml='codeml', times=1, min_msa_length=100, ignore_prefixes=False, one_v_one=False, pairwise=False, preserve=False, n_threads=20, weighting_method='phyml', max_pairwise=10000)
768 ignore_prefixes=ignore_prefixes,
769 n_threads=n_threads,
770 min_length=min_msa_length,
771 method=weighting_method,
772 pairwise=pairwise,
--> 773 max_pairwise=max_pairwise,
max_pairwise = 10000
774 )
775 results.round(5).to_csv(os.path.join(
776 output_directory, '{}.ks.tsv'.format(base)), sep='\t')
777
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in ks_analysis_paranome(nucleotide_sequences={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, protein_sequences={'evm.model.Seq1.1': 'MWGLANFERSDISLSLFIKPVSWFCCFLTKLSAFWCSRRICYCSLGLP...HPSDNSWHRGTVIEVFEGSSVVSVALDDGKKKNLELGKQGIRFVSQKQK', 'evm.model.Seq1.10': 'MESKSGEGKVVCVTGASGFIASWLVKLLLQRGYIVNATVRNLKDTSKV...ENFEDGLPLTPHFQVSSERAKCLGVKFTSLELSVKDTVESLMEKNFLHI', 'evm.model.Seq1.100': 'MVLLVEKISHFLKNPNRLENHHHNSEALLASSLQGFRSDVSKILNKVL...VDEVVKELRRRLKDLEELLQTIGKKTNGLFSEVLAERGKFLDSLQHTRK', 'evm.model.Seq1.1000': 'MDEAKVVEAKEGTISVATAFAGHQEAVRDRDHKFLTQAVEEAYKGVES...IGFDDFIADALRGTGFYQKAQLEIKQADGKGALIAEQVFEKTKEKFPIY', 'evm.model.Seq1.1001': 
'MADKAVTIRTRKFMTNRLLARKQFVIDVLHPGRANVSKAELKEKLARM...KKYEPKYRLIRNGLDTKVEKSRKQMKERKNRAKKIRGVKKTKAGDAKKK', 'evm.model.Seq1.1002.2.5dee1a2d': 'MTAAPFLIESNLKYNPLLYTPNPIQYTRLLHNQKLTPSKLSKPTKLTV...YKLPTINGSGDLKEALQKIASIPSSRTLVSKRNGHQEALSFALLVAFNL', 'evm.model.Seq1.1003': 'MNVDQHGSSSRLYVSLKERIVKVQSAAANSSGAASPIIDEDLRESPIDNDIDEDPWKPPMDNDFPNNVQEMIWRILLPATIAMLMEQVLRVATL', 'evm.model.Seq1.1004': 'MNGLTHTEPEFSEFVEVDPTGRYGRYNEILGKGASKTVYRAFDEYEGI...SQRARKCEAIKGSPNVRDMVSTAKSFFTRTLLPNSLHRTTSLPVDAVDI', 'evm.model.Seq1.1005': 'MIPACFSIPHSEVSKTSSSPPSQVPQNLVTCIYQAHICGSPVYLTLTW...VSLLSSPSCSSVLQWAEESSECGRSSWSSMRSSEISEGFSLLLYAWRKD', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'MRVEISDDEGAEEPLVSNVDDVLKIIKSDYEKSYFVTGLFTSRIYAED...WRPLISVDGKTVYDLDEKLKIVKHVESWNISAFEAVGQILMPGLRSSGE', ...}, paralogs={'GF_000001': ['evm.model.Seq1.8294', 'evm.model.Seq3.747', 'evm.model.Seq11.759', 'evm.model.Seq3.973', 'evm.model.Seq6.3827', 'evm.model.Seq8.1981', 'evm.model.Seq7.3918', 'evm.model.Seq3.968', 'evm.model.Seq3.4625', 'evm.model.Seq7.4762', 'evm.model.Seq3.1299', 'evm.model.Seq6.1553', 'evm.model.Seq12.297', 'evm.model.Seq12.294', 'evm.model.Seq8.3726', 'evm.model.Seq12.2981', 'evm.model.Seq12.1584', 'evm.model.Seq12.2932', 'evm.model.Seq1.1275', 'evm.model.Seq9.3520', ...], 'GF_000002': ['evm.model.Seq10.2013', 'evm.model.Seq3.4200', 'evm.model.Seq9.2451', 'evm.model.Seq6.4919', 'evm.model.Seq6.2314', 'evm.model.Seq10.248', 'evm.model.Seq1.879', 'evm.model.Seq3.2753', 'evm.model.Seq5.3091', 'evm.model.Seq11.1052', 'evm.model.Seq9.321', 'evm.model.Seq2.1529', 'evm.model.Seq6.3951', 'evm.model.Seq3.3617', 'evm.model.Seq8.856', 'evm.model.Seq12.1958', 'evm.model.Seq11.845', 'evm.model.Seq5.2780', 'evm.model.Seq5.2776', 'evm.model.Seq6.1124', ...], 'GF_000003': ['evm.model.Seq5.2371', 'evm.model.Seq7.4835', 'evm.model.Seq4.444', 'evm.model.Seq4.968', 'evm.model.Seq9.2671', 'evm.model.Seq12.14', 'evm.model.Seq11.975', 'evm.model.Seq8.3209', 'evm.model.Seq3.2590', 'evm.model.Seq10.215', 
'evm.model.Seq7.1965', 'evm.model.Seq6.2477', 'evm.model.Seq3.1843', 'evm.model.Seq10.326', 'evm.model.Seq8.2926', 'evm.model.Seq9.3554', 'evm.model.Seq5.1790', 'evm.model.Seq10.214', 'evm.model.Seq1.5216', 'evm.model.Seq8.2800', ...], 'GF_000004': ['evm.model.Seq4.117', 'evm.model.Seq4.119', 'evm.model.Seq10.263', 'evm.model.Seq8.2836', 'evm.model.Seq8.3722', 'evm.model.Seq8.3275', 'evm.model.Seq2.3139', 'evm.model.Seq9.3433', 'evm.model.Seq10.997', 'evm.model.Seq10.222', 'evm.model.Seq5.3282', 'evm.model.Seq7.1126', 'evm.model.Seq3.2258', 'evm.model.Seq3.2392', 'evm.model.Seq3.2952', 'evm.model.Seq10.661', 'evm.model.Seq10.962', 'evm.model.Seq5.2181', 'evm.model.Seq8.4192', 'evm.model.Seq7.939', ...], 'GF_000005': ['evm.model.Seq4.142', 'evm.model.Seq12.2076', 'evm.model.Seq4.777', 'evm.model.Seq4.942', 'evm.model.Seq8.3587', 'evm.model.Seq2.3451', 'evm.model.Seq4.4280', 'evm.model.Seq5.3422', 'evm.model.Seq10.2642', 'evm.model.Seq8.1232', 'evm.model.Seq7.1986', 'evm.model.Seq7.4403', 'evm.model.Seq1.1175.1.5dee19bd', 'evm.model.Seq4.4315', 'evm.model.Seq8.4788', 'evm.model.Seq2.3446', 'evm.model.Seq10.1699', 'evm.model.Seq4.630', 'evm.model.Seq8.3738', 'evm.model.Seq8.3412', ...], 'GF_000006': ['evm.model.Seq4.1211', 'evm.model.Seq1.68', 'evm.model.Seq7.143', 'evm.model.Seq5.3097', 'evm.model.Seq9.2047', 'evm.model.Seq5.5039', 'evm.model.Seq12.351', 'evm.model.Seq2.3909', 'evm.model.Seq2.2985', 'evm.model.Seq1.1130', 'evm.model.Seq1.5300', 'evm.model.Seq7.5023', 'evm.model.Seq1.1113', 'evm.model.Seq1.507', 'evm.model.Seq6.2682', 'evm.model.Seq5.2469', 'evm.model.Seq5.1744', 'evm.model.Seq1.7216', 'evm.model.Seq1.6709', 'evm.model.Seq6.1256', ...], 'GF_000007': ['evm.model.Seq8.4514', 'evm.model.Seq3.876', 'evm.model.Seq1.3868', 'evm.model.Seq8.2284', 'evm.model.Seq6.3859', 'evm.model.Seq11.1873', 'evm.model.Seq10.935', 'evm.model.Seq10.941', 'evm.model.Seq5.1583', 'evm.model.Seq5.1427', 'evm.model.Seq11.1718', 'evm.model.Seq4.2019', 'evm.model.Seq11.1270', 
'evm.model.Seq5.2502', 'evm.model.Seq1.3832', 'evm.model.Seq6.4171', 'evm.model.Seq3.793', 'evm.model.Seq9.3163', 'evm.model.Seq3.1484', 'evm.model.Seq2.2035', ...], 'GF_000008': ['evm.model.Seq2.6037', 'evm.model.Seq2.5984', 'evm.model.Seq8.5170', 'evm.model.Seq2.2960', 'evm.model.Seq5.975', 'evm.model.Seq12.2519', 'evm.model.Seq11.2174', 'evm.model.Seq9.2679', 'evm.model.Seq2.6035', 'evm.model.Seq9.4017', 'evm.model.Seq8.5167', 'evm.model.Seq3.200', 'evm.model.Seq7.3340', 'evm.model.Seq12.2574', 'evm.model.Seq12.3595', 'evm.model.Seq5.930', 'evm.model.Seq12.3116', 'evm.model.Seq12.3207', 'evm.model.Seq12.902', 'evm.model.Seq7.6456', ...], 'GF_000009': ['evm.model.Seq12.647', 'evm.model.Seq9.2418', 'evm.model.Seq12.633', 'evm.model.Seq6.2786', 'evm.model.Seq9.2640', 'evm.model.Seq9.2435', 'evm.model.Seq2.2515', 'evm.model.Seq12.130', 'evm.model.Seq11.416', 'evm.model.Seq9.4139', 'evm.model.Seq9.4099', 'evm.model.Seq11.216', 'evm.model.Seq4.4661', 'evm.model.Seq9.3589', 'evm.model.Seq10.477', 'evm.model.Seq1.474', 'evm.model.Seq6.4340', 'evm.model.Seq6.3176', 'evm.model.Seq3.1058', 'evm.model.Seq12.322', ...], 'GF_000010': ['evm.model.Seq8.142', 'evm.model.Seq6.655', 'evm.model.Seq1.1336', 'evm.model.Seq1.8641', 'evm.model.Seq1.736', 'evm.model.Seq6.4210', 'evm.model.Seq1.4691', 'evm.model.Seq3.4496', 'evm.model.Seq10.2951', 'evm.model.Seq1.5450', 'evm.model.Seq9.4437', 'evm.model.Seq4.730', 'evm.model.Seq8.2624', 'evm.model.Seq1.486', 'evm.model.Seq12.504.1.5dee19aa', 'evm.model.Seq2.2930', 'evm.model.Seq2.3790', 'evm.model.Seq1.5849', 'evm.model.Seq7.3293', 'evm.model.Seq4.4437', ...], ...}, tmp_dir='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd', codeml_path='codeml', preserve=False, times=1, ignore_prefixes=False, n_threads=20, min_length=100, method='phyml', aligner='muscle', pairwise=False, max_pairwise=10000)
640
641 Parallel(n_jobs=n_threads)(delayed(analysis_function)(
642 family[0], protein[family[0]], nucleotide_sequences, tmp_dir,
643 codeml_path, preserve, times, min_length, method, aligner,
644 output_dir
--> 645 ) for family in sorted_families)
sorted_families = [('GF_000017', 140), ('GF_000018', 137), ('GF_000019', 136), ('GF_000020', 132), ('GF_000021', 129), ('GF_000022', 127), ('GF_000023', 126), ('GF_000024', 121), ('GF_000025', 119), ('GF_000026', 119), ('GF_000027', 118), ('GF_000028', 116), ('GF_000029', 116), ('GF_000030', 115), ('GF_000031', 114), ('GF_000032', 114), ('GF_000033', 111), ('GF_000034', 108), ('GF_000035', 105), ('GF_000036', 102), ...]
646 logging.info('Analysis done')
647
648 logging.info('Making results data frame')
649 results_frame = pd.DataFrame(
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=Parallel(n_jobs=20), iterable=<generator object ks_analysis_paranome.<locals>.<genexpr>>)
784 if pre_dispatch == "all" or n_jobs == 1:
785 # The iterable was consumed all at once by the above for loop.
786 # No need to wait for async callbacks to trigger to
787 # consumption.
788 self._iterating = False
--> 789 self.retrieve()
self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=20)>
790 # Make sure that we get a last message telling us we are done
791 elapsed_time = time.time() - self._start_time
792 self._print('Done %3i out of %3i | elapsed: %s finished',
793 (len(self._output), len(self._output),
---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
NewickError Sat Apr 4 23:42:21 2020
PID: 27129Python 3.7.3: /software/Python-3.7.3/sxh_configure/bin/python3
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
self.items = [(<function analyse_family>, ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd'), {})]
132
133 def __len__(self):
134 return self._size
135
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
func = <function analyse_family>
args = ('GF_000026', {'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, {'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 
'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, '/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', 'codeml', False, 1, 100, 'phyml', 'muscle', '/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
kwargs = {}
132
133 def __len__(self):
134 return self._size
135
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in analyse_family(family_id='GF_000026', family={'evm.model.Seq1.1226': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.1709': 'MRKFRGFVLKHRVTTLFRCMFRQRRRETARYHRLDQLPSWNGPTKSFS...IYINHPLFSELLREAEEEYGFNHPGGITIPCRISEFERVQTRIKQCRVG', 'evm.model.Seq1.1906': 'MGGGERSLLHLPHLHIHQGKKKTSDVPKGYLAIKVGQEEEEQQRFVVP...KEAEEVYGFHHKGTITIPCHVEQFRSIQGKIDKHHHNHHHHIHVPCFRA', 'evm.model.Seq1.3716': 'MSSKIGKSSKIRCIVRISQMLRQWKKRSLISSSKRIAPDVPAGHVAIS...LRFVSRSGSGRSINIEDFQKSCHARYRNSVENFGDSRPLLGGSTEKSVC', 'evm.model.Seq1.3717': 'MAIRKSNKMPQAAILKQILKRCSSLGKKHGYDDEDHLPNDVPKGHFAV...YIVPISFLTHPEFQCLLRQAEEEFGFDHDMGITIPCEEVVFRSLTSMLR', 'evm.model.Seq1.3718': 'MGIRKSNKLPQVILFKQIMKRCSRLAKKQSYGDVPKGHFAVYVGENRT...HPEFQCLLRCAEEEFGFDHDMGITIPCEEFIFQSMTSMLRYEKKKKSEY', 'evm.model.Seq1.3722': 'MAIRKSNKLPQVVLLKKILKRCSSLTKKHGYGDLDHIPNDVPKGHFAV...YIVPISFLNHPEFQCLLRCAEEEFGFDHDMGITIPCEEVIFQSLTSMLR', 'evm.model.Seq1.3723': 'MGIRKSNKLSQAAVLKQILSCSSLGMKQGYDDEFHLPIDVPKGHFAVY...FIVPISFLTHPEFQCLLQRAEEEFGFNHDMGITIPCEEAVFRSLTTMLR', 'evm.model.Seq1.3724': 'MAIRKSNKLPQAPILKQILKRCSSLGKKHVYDDDEDHLPVDVPKGYFT...YIIPISFLTHPDFQCLLRCAEEEFGFDHDMGITIPCEELVFQSLTSMIR', 'evm.model.Seq1.3725': 'MAIRKSNKLTQGAVVKQIIKRCSSLGKKHGYDDGDHLPMDVPKGHFAV...FWTHPEFQCLLHCAEEEFGFDHDMGITIPCEEVVFPITNFHTLVVEVSD', ...}, nucleotide={'evm.model.Seq1.1': 'ATGTGGGGTTTGGCTAATTTTGAAAGATCTGATATCTCTCTCTCTTTA...TTGGAACTTGGAAAGCAAGGAATACGTTTTGTATCTCAGAAGCAAAAAC', 'evm.model.Seq1.10': 'ATGGAAAGCAAGAGTGGAGAAGGAAAGGTTGTATGTGTAACAGGGGCA...GAAGGATACTGTCGAAAGCTTGATGGAGAAGAACTTCCTCCATATCTAA', 'evm.model.Seq1.100': 'ATGGTACTCTTGGTTGAAAAGATTAGCCATTTTCTGAAAAATCCCAAC...AGCTGAAAGAGGTAAATTTCTTGATAGCCTTCAGCACACTAGAAAGTAG', 'evm.model.Seq1.1000': 'ATGGATGAAGCAAAGGTTGTTGAAGCTAAGGAGGGAACTATCTCTGTA...AGCCGAGCAAGTATTTGAGAAGACCAAAGAGAAGTTCCCCATCTATTGA', 'evm.model.Seq1.1001': 
'ATGGCGGACAAGGCAGTCACCATCCGTACTCGCAAGTTCATGACCAAC...GATCCGTGGTGTAAAGAAGACCAAGGCTGGTGATGCAAAGAAGAAATAA', 'evm.model.Seq1.1002.2.5dee1a2d': 'ATGACTGCTGCTCCTTTCTTGATTGAATCAAACCTCAAATATAACCCT...TCATCAGGAAGCACTAAGTTTTGCGCTACTTGTAGCTTTTAATTTGTGA', 'evm.model.Seq1.1003': 'ATGAATGTTGATCAGCATGGTTCCAGCTCTAGGTTATATGTAAGTTTG...AACCATAGCTATGTTGATGGAACAGGTTTTAAGAGTGGCAACACTTTAA', 'evm.model.Seq1.1004': 'ATGAATGGTCTTACACATACAGAACCAGAGTTTTCTGAATTTGTTGAA...TTCACTTCACAGAACAACATCCTTACCAGTTGATGCTGTTGATATATAG', 'evm.model.Seq1.1005': 'ATGATTCCTGCTTGTTTTAGTATTCCTCATTCTGAGGTTTCAAAAACT...GATTAGTGAGGGATTTTCTTTGTTATTGTATGCGTGGAGGAAGGATTGA', 'evm.model.Seq1.1006.2.5dee1a2d.1.5df2f16e': 'ATGAGAGTGGAGATATCTGATGATGAAGGGGCTGAAGAACCTTTGGTC...AGCCGTTGGTCAGATATTAATGCCTGGTTTGAGGAGCTCCGGTGAGTGA', ...}, tmp='/project/comparative_genomic_/wgd/wgd_pipline/ks_tmp.3858b1da53b934', codeml=<wgd.codeml.Codeml object>, preserve=False, times=1, min_length=100, method='phyml', aligner='muscle', output_dir='/project/comparative_genomic_/wgd/wgd_pipline/_longest_ksd')
300 logging.debug("Distance will be in Ks units!")
301 clustering, pairwise_distances, tree_path = _weighting(
302 results_dict, msa=msa_path_protein, method="alc")
303 else:
304 clustering, pairwise_distances, tree_path = _weighting(
--> 305 results_dict, msa=msa_path_protein, method=method)
results_dict = {'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}
msa_path_protein = '/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa'
method = 'phyml'
306 if clustering is not None:
307 out = _calculate_weighted_ks(
308 clustering, results_dict, pairwise_distances, family_id
309 )
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py in _weighting(pairwise_estimates={'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}, msa='/project/comparative_genomic_/wg...pipline/ks_tmp.3858b1da53b934/GF_000026.fasta.msa', method='phyml')
86 if method == 'phyml':
87 # PhyML tree construction
88 logging.debug('Constructing phylogenetic tree with PhyML')
89 tree_path = run_phyml(msa)
90 clustering, pairwise_distances = phylogenetic_tree_to_cluster_format(
---> 91 tree_path, pairwise_estimates['Ks'])
tree_path = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
pairwise_estimates = {'Ka': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Ks': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns], 'Omega': evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns]}
92
93 elif method == 'fasttree':
94 # FastTree tree construction
95 logging.debug('Constructing phylogenetic tree with FastTree')
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/phy.py in phylogenetic_tree_to_cluster_format(tree='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', pairwise_estimates= evm.model.Seq1.1226 ... e...... 0.0000
[119 rows x 119 columns])
118 (only the index is used)
119 :return: clustering data structure, pairwise distances dictionary
120 """
121 id_map = {
122 pairwise_estimates.index[i]: i for i in range(len(pairwise_estimates))}
--> 123 t = Tree(tree)
t = undefined
tree = '/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw'
124
125 # midpoint rooting
126 midpoint = t.get_midpoint_outgroup()
127 if not midpoint: # midpoint = None when their are only two leaves
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/coretype/tree.py in __init__(self=Tree node '' (0x2b936ceba5c), newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', format=0, dist=None, support=None, name=None, quoted_node_names=False)
206
207 # Initialize tree
208 if newick is not None:
209 self._dist = 0.0
210 read_newick(newick, root_node = self, format=format,
--> 211 quoted_names=quoted_node_names)
quoted_node_names = False
212
213
214 def __nonzero__(self):
215 return True
...........................................................................
/software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/ete3-3.1.1-py3.7.egg/ete3/parser/newick.py in read_newick(newick='/project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw', root_node=Tree node '' (0x2b936ceba5c), format=0, quoted_names=False)
244 nw = nw.strip()
245 if not nw.startswith('(') and nw.endswith(';'):
246 #return _read_node_data(nw[:-1], root_node, "single", matcher, format)
247 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
248 elif not nw.startswith('(') or not nw.endswith(';'):
--> 249 raise NewickError('Unexisting tree file or Malformed newick tree structure.')
250 else:
251 return _read_newick_from_string(nw, root_node, matcher, format, quoted_names)
252
253 else:
NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
Hi, can you locate the file /project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026.fasta.msa.nw
and report what is in there (if you find the file). Also, do you get the other files for this gene family (all files starting with /project/comparative_genomic_/wg...line/ks_tmp.3858b1da53b934/GF_000026
)?
Hi,
I found the file named GF_000026.fasta.msa.nw. It is empty. The other files for this gene family are present, as listed below.
ks_tmp.3858b1da53b934]$ ls GF_000026*
GF_000026.codeml GF_000026.fasta GF_000026.fasta.msa GF_000026.fasta.msa.nuc GF_000026.fasta.msa.nw
All of the *.nw files are empty.
You mean the .nw
files are also empty for the other families? Could you paste an example alignment here for one of these families (e.g. GF_000026.fasta.msa
)?
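To see at a glance which families are affected, a small script along these lines could scan the temporary directory and flag tree files that are empty or that lack the terminating semicolon. This is only a sketch: the function names are made up for illustration and are not part of wgd or ete3, and the check only mirrors the basic shape test visible in the traceback above (content must end with `;`). It is not a full newick validation.

```python
from pathlib import Path


def classify_newick_text(text):
    """Roughly classify the content of a tree file (hypothetical helper)."""
    stripped = text.strip()
    if not stripped:
        # An empty file: the tree-building step probably wrote nothing.
        return "empty"
    if not stripped.endswith(";"):
        # A newick string must be terminated by a semicolon.
        return "malformed"
    return "ok"


def scan_tree_files(tmp_dir, pattern="*.nw"):
    """Map each tree file name in tmp_dir to 'empty', 'malformed' or 'ok'."""
    report = {}
    for path in sorted(Path(tmp_dir).glob(pattern)):
        report[path.name] = classify_newick_text(path.read_text())
    return report
```

Calling `scan_tree_files` on the `ks_tmp.*` directory from the log returns a dictionary keyed by file name. If every entry is `"empty"`, the tree-building step itself is the likely culprit rather than any single gene family.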
Sorry for the late reply. When I tried to run the code on another computer, it reported a different error.
2020-04-11 21:29:14: INFO Started analysis in parallel (n_threads = 48)
Traceback (most recent call last):
File "software/wgd/wgd/wgd", line 11, in <module>
load_entry_point('wgd==1.1', 'console_scripts', 'wgd')()
File ".local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File ".local/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File ".local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File ".local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File ".local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 632, in ksd
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd_cli.py", line 773, in ksd_
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/wgd-1.1-py3.7.egg/wgd/ks_distribution.py", line 645, in ks_analysis_paranome
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 749, in __call__
n_jobs = self._initialize_backend()
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/parallel.py", line 547, in _initialize_backend
**self._backend_args)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/_parallel_backends.py", line 317, in configure
self._pool = MemmapingPool(n_jobs, **backend_args)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/pool.py", line 600, in __init__
super(MemmapingPool, self).__init__(**poolargs)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/site-packages/joblib-0.11-py3.7.egg/joblib/pool.py", line 420, in __init__
super(PicklingPool, self).__init__(**poolargs)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 176, in __init__
self._repopulate_pool()
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/pool.py", line 241, in _repopulate_pool
w.start()
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "software/Python-3.7.3/sxh_configure/lib/python3.7/multiprocessing/popen_fork.py", line 70, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
The run succeeds on the example data. Could the error be due to my large inputs?
Thanks,
Not sure. Have you tried using fewer threads? 48 sounds like a lot. Was your previous issue solved?
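As a rough way to pick a smaller thread count, the sketch below caps the requested number of workers by how many would fit in the available memory. The helper name is hypothetical and the per-worker memory figure is an estimate you have to supply yourself (neither comes from wgd or joblib), so treat the result as a starting point rather than a guarantee.

```python
def cap_workers(requested, available_bytes, per_worker_bytes):
    """Return a worker count no larger than what memory is assumed to allow.

    requested        -- the thread count you would like to use (e.g. 48)
    available_bytes  -- free memory on the machine, in bytes
    per_worker_bytes -- your own estimate of memory used by one worker
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    if per_worker_bytes <= 0:
        raise ValueError("per_worker_bytes must be positive")
    fits = available_bytes // per_worker_bytes
    # Always keep at least one worker so the analysis can still run.
    return max(1, min(requested, fits))


GIB = 1024 ** 3
suggested = cap_workers(48, 64 * GIB, 4 * GIB)  # 16 under these assumed figures
```

With 64 GiB free and an assumed 4 GiB per worker, the 48 requested threads would be reduced to 16.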
I installed WGD with all prerequisites. I tried the following command after preparing the mcl file using the
[ashermoshe@login-0-0 ~/dorothee]$ wgd ksd schlosseri.mcl Botryllus_schlosseri.fas
command, but I got an error somewhere downstream. I would appreciate your help in understanding how to approach it.