chapmanb / bcbb

Incubator for useful bioinformatics code, primarily in Python and R
http://bcbio.wordpress.com

GFF parsing fails with most recent version of BioPython #110

Closed khughitt closed 7 years ago

khughitt commented 7 years ago

Overview

After upgrading to Biopython 1.68, GFF.parse() is now failing where it had no issues before.

To Reproduce

In a new virtualenv environment, run:

pip install numpy
pip install biopython
pip install bcbio-gff

wget http://tritrypdb.org/common/downloads/release-27/TcruziCLBrenerEsmeraldo-like/gff/data/TriTrypDB-27_TcruziCLBrenerEsmeraldo-like.gff

Next, launch python and run:

>>> from BCBio import GFF
>>> gff = 'TriTrypDB-27_TcruziCLBrenerEsmeraldo-like.gff'
>>> x=list(GFF.parse(gff))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 737, in parse
    target_lines):
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 327, in parse_in_parts
    cur_dict = self._results_to_features(cur_dict, results)
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 367, in _results_to_features
    results.get('child', []))
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 428, in _add_parent_child_features
    children)
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 471, in _add_children_to_parent
    cur_child, _ = self._add_children_to_parent(cur_child, children)
  File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 477, in _add_children_to_parent
    cur_parent.sub_features.append(cur_child)
AttributeError: 'SeqFeature' object has no attribute 'sub_features'
>>> import Bio
>>> Bio.__version__
'1.68'

The same code worked with Biopython 1.67, so it seems likely to be an issue resulting from changes made in the 1.68 release.

khughitt commented 7 years ago

Update: Just to confirm, I tested the same code after uninstalling Biopython and reinstalling it with pip install biopython==1.67, and it worked fine, so this is indeed a 1.68-specific issue.

elenimijalis commented 7 years ago

I'm having this issue as well. It turns out the deprecated sub_features attribute was removed very recently. See https://github.com/biopython/biopython/commit/5befe3ffdfcb730d0a374f0b6e8284d55e3d6a3d. I'm staying on v1.67 for now, unfortunately.

peterjc commented 7 years ago

Short term proposal: BCBio could add something like this to cope with Biopython 1.68 onwards:

if not hasattr(cur_parent, "sub_features"):
    cur_parent.sub_features = []

And then continue as usual?
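A minimal sketch of how that guard behaves, for illustration only. FakeSeqFeature and append_child are hypothetical names, not part of Biopython or BCBio; the stand-in class simply mimics a SeqFeature that no longer defines sub_features, so the snippet runs without Biopython installed.

```python
class FakeSeqFeature:
    """Hypothetical stand-in for a SeqFeature without a sub_features attribute
    (the situation reported above for Biopython 1.68)."""

    def __init__(self, feature_id):
        self.id = feature_id


def append_child(cur_parent, cur_child):
    """Attach cur_child to cur_parent, creating sub_features on demand."""
    # The guard proposed above: only add the list if it is missing,
    # so older Biopython versions that still define it are unaffected.
    if not hasattr(cur_parent, "sub_features"):
        cur_parent.sub_features = []
    cur_parent.sub_features.append(cur_child)
    return cur_parent


gene = FakeSeqFeature("gene1")
mrna = FakeSeqFeature("mRNA1")
exon = FakeSeqFeature("exon1")

# Build a gene -> mRNA -> exon hierarchy, as nested GFF3 Parent links would.
append_child(mrna, exon)
append_child(gene, mrna)
```

Because the attribute is created only when absent, calling append_child repeatedly on the same parent keeps extending one list rather than resetting it.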

chapmanb commented 7 years ago

Keith and Eleni: sorry for the issue, and thanks so much for reporting it. Peter is right on with the easiest fix: manually adding sub_features for Biopython >= 1.68. Longer term we should probably replace this with the usage Biopython now expects, but this gets GFF parsing working with recent versions as it did previously. I pushed the fix and released a new version (0.6.4) that includes it. I'll also update the bioconda package to this version. Thank you again for reporting.
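For code that consumes the parsed features, a defensive traversal could look something like the sketch below. It assumes only that nested children, when present, live in a sub_features list as described in this thread; iter_features is a hypothetical helper, and SimpleNamespace objects stand in for real SeqFeature instances so the snippet runs without Biopython.

```python
from types import SimpleNamespace


def iter_features(feature):
    """Yield a feature and all of its nested children, depth first."""
    yield feature
    # getattr with a default tolerates features that lack sub_features.
    for child in getattr(feature, "sub_features", []):
        yield from iter_features(child)


# Stand-in objects (assumption: real features expose .id the same way).
exon = SimpleNamespace(id="exon1")  # deliberately has no sub_features
mrna = SimpleNamespace(id="mRNA1", sub_features=[exon])
gene = SimpleNamespace(id="gene1", sub_features=[mrna])

ids = [f.id for f in iter_features(gene)]
```

Reading the attribute through getattr means the same traversal works whether or not a given feature had children attached.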

khughitt commented 7 years ago

Sounds good - Thanks for the quick fix!

r-bierman commented 6 years ago

Sorry if asking about conda is off topic here.

I'm getting the same error (AttributeError: 'SeqFeature' object has no attribute 'sub_features') through conda, using what they call version 0.4, installed with the command conda install -c auto bcbio-gff

Installing bcbio-gff with pip does work for me, but I would prefer to have all the packages managed through conda.

Is there a way to get the fixed version through conda?

peterjc commented 6 years ago

@r-bierman Try https://anaconda.org/bioconda/bcbiogff from BioConda, which provides version 0.6.4 with the fix you need. In general, BioConda is an excellent Conda channel for bioinformatics software, and you can contribute to updating or adding recipes via GitHub.

r-bierman commented 6 years ago

Works perfectly, thanks!