arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. This package is deprecated in favor of https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

python error when running wgd #13

dvory-tau closed this issue 5 years ago

dvory-tau commented 5 years ago

I installed wgd on Python 3.6.4 with all the prerequisites, on Linux (CentOS 6.6). Running wgd ksd fails with the error below. Any help would be appreciated.

[ashermoshe@lecs2 ~/dorothee]$ wgd --verbosity debug ksd schlosseri.mcl Botryllus_schlosseri.fas 
2019-02-26 11:14:26: DEBUG  CACHEDIR=/groups/pupko/ashermoshe/.cache/matplotlib
2019-02-26 11:14:26: DEBUG  Using fontManager instance from /groups/pupko/ashermoshe/.cache/matplotlib/fontList.json
2019-02-26 11:14:26: DEBUG  backend agg version v2.2
2019-02-26 11:14:27: INFO   
2019-02-26 11:14:27: INFO   codeml found
2019-02-26 11:14:27: INFO   MUSCLE v3.7 by Robert C. Edgar
2019-02-26 11:14:27: INFO   
2019-02-26 11:14:27: WARNING    Output directory exists, will possibly overwrite
2019-02-26 11:14:27: DEBUG  Reading CDS sequences
2019-02-26 11:14:28: INFO   Translating CDS file
2019-02-26 11:14:28: DEBUG  wrapping excepthook
100% (65587 of 65587) |#################################################| Elapsed Time: 0:00:21 Time:  0:00:21
2019-02-26 11:14:49: WARNING    There were 0 warnings during translation
2019-02-26 11:14:49: INFO   Started whole paranome Ks analysis
2019-02-26 11:14:49: WARNING    Filtered out the 0 largest gene families because n*(n-1)/2 > `max_pairwise`
2019-02-26 11:14:49: WARNING    If you want to analyse these large families anyhow, please raise the `max_pairwise` parameter. 
2019-02-26 11:14:49: INFO   Started analysis in parallel (n_threads = 4)
2019-02-26 11:14:50: INFO   Analysis done
2019-02-26 11:14:50: INFO   Making results data frame
2019-02-26 11:14:50: INFO   Removing tmp directory
2019-02-26 11:14:50: INFO   Computing weights, outlier cut-off at Ks > 5
Traceback (most recent call last):
  File "/share/apps/anaconda3-5.1.0/bin/wgd", line 11, in <module>
    sys.exit(cli())
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 545, in ksd
    max_pairwise=max_pairwise
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd_cli.py", line 686, in ksd_
    max_pairwise=max_pairwise,
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 665, in ks_analysis_paranome
    results_frame = compute_weights(results_frame)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/wgd/ks_distribution.py", line 709, in compute_weights
    df["WeightOutliersIncluded"] = 1 / df.groupby(['Family', 'Node'])[
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/pandas/core/generic.py", line 5162, in groupby
    **kwargs)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/pandas/core/groupby.py", line 1848, in groupby
    return klass(obj, by, **kwds)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/pandas/core/groupby.py", line 516, in __init__
    mutated=self.mutated)
  File "/share/apps/anaconda3-5.1.0/lib/python3.6/site-packages/pandas/core/groupby.py", line 2934, in _get_grouper
    raise KeyError(gpr)
KeyError: 'Node'

arzwa commented 5 years ago

It looks like the analysis didn't actually start for some reason. I suspect it has something to do with the input files. Could you show the head output for schlosseri.mcl and Botryllus_schlosseri.fas?
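One quick way to check the input files, beyond looking at their first lines, is to test whether the gene IDs in the family file also appear as sequence IDs in the CDS FASTA file. The sketch below is not part of wgd; the helper names (fasta_ids, family_ids, missing_from_fasta) are hypothetical, and it assumes one gene family per line with whitespace-separated gene IDs, and FASTA IDs taken as the first token after ">". A mismatch between the two ID sets is only one possible cause of the error above.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs (first token after '>') from a FASTA stream."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def family_ids(handle):
    """Collect gene IDs from a family file.

    Assumption: one family per line, gene IDs separated by whitespace.
    """
    ids = set()
    for line in handle:
        ids.update(line.split())
    return ids


def missing_from_fasta(families_handle, fasta_handle):
    """Return gene IDs listed in the families but absent from the FASTA."""
    return family_ids(families_handle) - fasta_ids(fasta_handle)


# Toy in-memory data standing in for the real files.
demo_fasta = io.StringIO(">geneA some description\nATGAAATAG\n>geneB\nATGCCCTAG\n")
demo_families = io.StringIO("geneA\tgeneB\ngeneA\tgeneC\n")

missing = missing_from_fasta(demo_families, demo_fasta)
print(sorted(missing))  # ['geneC']
```

To run the same check on the real data, the two io.StringIO objects can be replaced with open("schlosseri.mcl") and open("Botryllus_schlosseri.fas"). A large set of missing IDs would suggest that the FASTA headers and the family file use different identifiers.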