single-cell-data / SOMA

A flexible and extensible API for annotated 2D matrix data stored in multiple underlying formats.
MIT License
70 stars 9 forks source link

[python] Also support `coords=None` in `ExperimentAxisQuery` #98

Closed johnkerl closed 1 year ago

johnkerl commented 1 year ago

Originally posted by @bkmartinjr in https://github.com/single-cell-data/SOMA/issues/97#issuecomment-1402435287

thetorpedodog commented 1 year ago

Where are we expecting coords=None to get passed? As an explicit parameter to AxisQuery?

johnkerl commented 1 year ago

https://github.com/single-cell-data/TileDB-SOMA/blob/0.5.0a6/apis/python/src/tiledbsoma/experiment.py#L12

https://github.com/single-cell-data/TileDB-SOMA/blob/0.5.0a6/apis/python/src/tiledbsoma/experiment.py#L75-L76

This could now be

        obs_query: Union[None, somacore.AxisQuery] = None,
        var_query: Union[None, somacore.AxisQuery] = None,
bkmartinjr commented 1 year ago

Sorry, I was vague - John translated (ty!!)

I think the axis_query() equivalent to the coords default is to make obs_query and var_query optional, and have that default to AxisQuery(). The original code worked this way, but it disappeared in your refactor. It is a good UX feature and I had a few complaints about the change already :-)

johnkerl commented 1 year ago

cc also @atolopko-czi who I believe had thoughts here

atolopko-czi commented 1 year ago

Agree with Bruce's response. For the calling code, not specifying a {obs,var}_query keyword arg should be the same as including it with a None value (obs_var=None). This change broke code we had in Cell Census; we had a method that wanted to be able to just pass through a None param value, without having to translate it to AxisQuery().

johnkerl commented 1 year ago

I think this is fine now -- whether we spell coords=None or not the key ask is for a good "all" default and we have this:

>>> exp = soma.Experiment.open("/Users/johnkerl/s/v/pbmc-small-x-csr")

>>> qry = exp.axis_query("RNA")

>>> qry.to_anndata(X_name="data")
AnnData object with n_obs × n_vars = 80 × 20
    obs: 'soma_joinid', 'obs_id', 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'RNA_snn_res.0.8', 'letter.idents', 'groups', 'RNA_snn_res.1'
    var: 'soma_joinid', 'var_id', 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'

>>> qry = exp.axis_query("RNA", obs_query=soma.AxisQuery())

>>> qry.to_anndata(X_name="data")
AnnData object with n_obs × n_vars = 80 × 20
    obs: 'soma_joinid', 'obs_id', 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'RNA_snn_res.0.8', 'letter.idents', 'groups', 'RNA_snn_res.1'
    var: 'soma_joinid', 'var_id', 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'