jairideout closed this issue 9 years ago
That is weird and suggests some state within the filtered table is not in sync
On Fri, Mar 27, 2015 at 10:03 AM, Jai Ram Rideout notifications@github.com wrote:
When sorting a table that has been filtered so that it is empty (0x0), an uninformative error message is raised. This only seems to happen if the table is filtered to be empty; creating an empty Table outright and sorting it works.
I know this use-case sounds silly (why would you want to sort an empty table?), but this error is raised in QIIME's observation_metadata_correlation.py script if the user provides a metadata category that doesn't have any numeric values, resulting in a filtered table that is empty. A collaborator ran into this error and it wasn't obvious what went wrong and why.
Example:
In [1]: import numpy as np
In [2]: from biom import Table
In [3]: t = Table(np.asarray([[1, 2, 3], [4, 5, 6]]), ['a', 'b'], ['c', 'd', 'e'])
In [4]: t.filter(ids_to_keep=[], axis='sample')
Out[4]: 0 x 0 <class 'biom.table.Table'> with 0 nonzero entries (0% dense)
In [5]: t.sort(sort_f = lambda _: [], axis='sample')
---------------------------------------------------------------------------
TableException                            Traceback (most recent call last)
in <module>()
----> 1 t.sort(sort_f = lambda _: [], axis='sample')

/Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/biom/table.pyc in sort(self, sort_f, axis)
   1753         O1 0.0 1.0 3.0
   1754         """
-> 1755         return self.sort_order(sort_f(self.ids(axis=axis)), axis=axis)
   1756
   1757     def filter(self, ids_to_keep, axis='sample', invert=False, inplace=True):

/Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/biom/table.pyc in sort_order(self, order, axis)
   1672                 self.ids(axis='observation')[:], order[:],
   1673                 self.metadata(axis='observation'), md,
-> 1674                 self.tableid, self.type)
   1675         elif axis == 'observation':
   1676             for id in order:

/Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/biom/table.pyc in __init__(self, data, observation_ids, sample_ids, observation_metadata, sample_metadata, table_id, type, create_date, generated_by, observation_group_metadata, sample_group_metadata, **kwargs)
    258         self._observation_group_metadata = observation_group_metadata
    259
--> 260         errcheck(self)
    261
    262         # These will be set by _index_ids()

/Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/biom/err.pyc in errcheck(table, errtypes)
    471     ret = __errprof.test(table, *errtypes)
    472     if isinstance(ret, Exception):
--> 473         raise ret
    474     else:
    475         return ret

TableException: Number of observation IDs differs from matrix size!

Reply to this email directly or view it on GitHub: https://github.com/biocore/biom-format/issues/620
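Until the library handles this case, a caller such as the QIIME script could fail early with a clearer message. The following is only a sketch under stated assumptions, not biom or QIIME code: `sort_nonempty` is a hypothetical helper, and `StubTable` is a stand-in that mimics just the methods used here (`is_empty()`, `ids()`, and `sort(sort_f, axis=...)`) so that the example runs without biom installed.

```python
class StubTable:
    """Hypothetical stand-in for a table; this is NOT biom.Table."""

    def __init__(self, sample_ids):
        self._sample_ids = list(sample_ids)

    def is_empty(self):
        # The stub only tracks sample IDs, so "empty" means no samples.
        return len(self._sample_ids) == 0

    def ids(self, axis='sample'):
        return list(self._sample_ids)

    def sort(self, sort_f, axis='sample'):
        # Mirrors the call pattern in the traceback: sort_f receives the IDs.
        return StubTable(sort_f(self.ids(axis=axis)))


def sort_nonempty(table, sort_f, axis='sample'):
    """Sort `table`, raising a readable error if it is empty (hypothetical helper)."""
    if table.is_empty():
        raise ValueError(
            "Cannot sort: the table is empty after filtering. "
            "Check that the filter criteria match at least one ID."
        )
    return table.sort(sort_f, axis=axis)
```

With a real table, the same guard would run before the call to `sort`, so the user sees a message about the empty table instead of the ID/matrix size mismatch.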
To solve it, we need to either say that the shape of the table is 0 x len(observation ids), or remove the observation ids when a table is fully filtered out. Which would be preferable? Relevant code here: https://github.com/biocore/biom-format/blob/master/biom/_filter.pyx#L87-L88
I kind of like updating the empty table code to be smarter?
What do you mean?
My question was whether a totally filtered out table should have shape (0, 0) (current behaviour) or (0, n) (would immediately solve this bug). I have a slight preference for the second option because it seems more consistent:
Filter all but one -> shape (1, n)
Filter all -> shape (0, n)
No matter the option, table.is_empty() would keep returning True.
Other issues to consider?
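The consistency argument can be illustrated with a toy model. This is a sketch only, not biom's filtering code: all function names are hypothetical, rows stand for the axis being filtered, and n is the number of IDs on the other axis. `ids_match_shape` only loosely mirrors the kind of check that raised the error above.

```python
def filter_collapsing(rows, n_cols, keep):
    """Model of the current behaviour: an empty result is reported as 0 x 0."""
    kept = [rows[i] for i in keep]
    shape = (len(kept), n_cols) if kept else (0, 0)
    return kept, shape


def filter_preserving(rows, n_cols, keep):
    """Model of the proposed behaviour: the unfiltered axis keeps length n."""
    kept = [rows[i] for i in keep]
    return kept, (len(kept), n_cols)


def ids_match_shape(shape, n_row_ids, n_col_ids):
    """Toy consistency check between ID counts and the reported shape."""
    return shape == (n_row_ids, n_col_ids)


rows = [[1, 4], [2, 5], [3, 6]]  # 3 filtered-axis items x n = 2 other-axis IDs
n = 2

# Filtering everything out: the other axis still has 2 IDs attached.
_, collapsed = filter_collapsing(rows, n, keep=[])
_, preserved = filter_preserving(rows, n, keep=[])

print(collapsed, ids_match_shape(collapsed, 0, n))
print(preserved, ids_match_shape(preserved, 0, n))
```

In this model the collapsed (0, 0) shape disagrees with the two IDs that remain on the other axis, while (0, n) stays consistent and follows the same rule as the (1, n) case.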
oh, i see. agree, second option is more consistent
Thanks for the quick fix @Jorge-C and @wasade!