Improvement Description
collapse should use biom.Table.collapse, which operates on a sparse representation of the data.
Current Behavior
The collapse method currently requires transforming the FeatureTable to a pd.DataFrame, which coerces the data into a dense representation. This is prohibitive for large datasets, artificially requiring more than 100GB of memory.
Proposed Behavior
Change collapse to use biom.Table.collapse. Most of the surrounding changes should be minor, as that method accepts an arbitrary grouping function. I recommend passing norm=False to the collapse call.