hoffmangroup / segway

Application for semi-automated genomic annotation.
http://segway.hoffmanlab.org/
GNU General Public License v2.0

Better checks for valid genomedata archives #45

Open EricR86 opened 9 years ago

EricR86 commented 9 years ago

Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


More than once, users have tried to open one of the HDF5 files contained within a genomedata archive (folder) instead of specifying the archive (folder) itself.

This results in the following error:

Traceback (most recent call last):
  File "/storage/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/bin/segway", line 9, in <module>
    load_entry_point('segway===1.2.0.dev-r0', 'console_scripts', 'segway')()
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 2971, in main
    return runner()
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 2735, in __call__
    self.run(*args, **kwargs)
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 2697, in run
    self.save_observations_params()
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 1749, in save_observations_params
    self.subset_metadata(genome) # XXX: does this need to be done before save()?
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 1548, in subset_metadata
    subset_metadata_attr(genome, "mins", min)
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/segway-1.2.0.dev_r0-py2.7.egg/segway/run.py", line 1526, in subset_metadata_attr
    attr = getattr(genome, name)
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/genomedata-1.3.5-py2.7.egg/genomedata/__init__.py", line 441, in mins
    return self._accum_extrema("mins", partial(amin, axis=0))
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/genomedata-1.3.5-py2.7.egg/genomedata/__init__.py", line 303, in _accum_extrema
    extrema = [getattr(chromosome, name) for chromosome in self]
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/genomedata-1.3.5-py2.7.egg/genomedata/__init__.py", line 176, in __iter__
    yield self[groupname]
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/genomedata-1.3.5-py2.7.egg/genomedata/__init__.py", line 210, in __getitem__
    res = Chromosome(self.h5file, where="/" + name)
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/genomedata-1.3.5-py2.7.egg/genomedata/__init__.py", line 565, in __init__
    if attrs.dirty:
  File "/gpfs/home/lua137/arch/Linux-x86_64/opt/python-2.7.10/lib/python2.7/site-packages/tables-3.2.1-py2.7-linux-x86_64.egg/tables/attributeset.py", line 288, in __getattr__
    "'%s'" % (name, self._v__nodepath))
AttributeError: Attribute 'dirty' does not exist in node: '/supercontig_0'

This error needs to be reported more clearly, and a simple validity check on genomedata archives needs to run before any processing (creating the observations) begins.
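A minimal sketch of what such an early check might look like, under stated assumptions. The only facts taken from the traceback are that genomedata reads a `dirty` attribute from each chromosome node and that a missing attribute surfaces as an `AttributeError`. The names `InvalidArchiveError`, `REQUIRED_ATTRS`, and `check_chromosome_attrs` are hypothetical and are not part of segway or genomedata. Plain stand-in objects replace real PyTables attribute sets so the example runs on its own.

```python
from types import SimpleNamespace


class InvalidArchiveError(ValueError):
    """Hypothetical error for a path that is not a valid genomedata archive."""


# Assumption: only 'dirty' is known from the traceback; a real check
# would probably need to look at more than this one attribute.
REQUIRED_ATTRS = ("dirty",)


def check_chromosome_attrs(archive_path, node_attrs):
    """Fail early, with a readable message, if any node lacks a required attribute.

    node_attrs maps a node path (e.g. "/chr1") to an object exposing that
    node's attributes, standing in here for a PyTables attribute set.
    """
    for node_path, attrs in node_attrs.items():
        missing = [name for name in REQUIRED_ATTRS if not hasattr(attrs, name)]
        if missing:
            raise InvalidArchiveError(
                "%r does not look like a genomedata archive: node %r is missing "
                "attribute(s) %s. If this file lives inside an archive "
                "directory, specify the directory itself instead."
                % (archive_path, node_path, ", ".join(missing))
            )


# A node carrying the expected attribute passes silently.
check_chromosome_attrs("archive_dir", {"/chr1": SimpleNamespace(dirty=False)})

# A node without it is rejected before any processing starts.
try:
    check_chromosome_attrs("chr1.genomedata", {"/supercontig_0": SimpleNamespace()})
except InvalidArchiveError as err:
    print(err)
```

Running a check like this before the observations are created would replace the deep `AttributeError` with a message that names the offending path and suggests the likely fix.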

EricR86 commented 9 years ago

Original comment by Michael Hoffman (Bitbucket: hoffman, GitHub: michaelmhoffman).


I agree that a clearer error message would be an improvement. Is this something that could be part of genomedata instead?