theislab / scCODA

A Bayesian model for compositional single-cell data analysis
BSD 3-Clause "New" or "Revised" License

TypeError: '<' not supported between instances of 'str' and 'int' on toy dataset #76

Closed jolespin closed 1 year ago

jolespin commented 1 year ago

I'm trying to get scCODA working on the iris dataset as a quick test. Ultimately, I plan to use it for gene expression and microbial abundances, since both are compositional.

Running the code below gives the following error:

import anndata as an
from sccoda.util import comp_ana as mod
from sccoda.util import cell_composition_data as dat
from sccoda.util import data_visualization as viz
import sccoda.datasets as scd

# cell_counts = scd.haber()
# FileNotFoundError: [Errno 2] No such file or directory: '/Users/jespinoz/anaconda3/envs/soothsayer_p3.9_env/lib/python3.9/site-packages/sccoda/datasets/haber_counts.csv'

ad_iris = an.read_h5ad("iris.h5ad")
ad_iris
# AnnData object with n_obs × n_vars = 150 × 4
#     obs: 'Species'

model = mod.CompositionalAnalysis(ad_iris, formula="C(Species)", reference_cell_type="setosa")

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [86], in <module>
     11 ad_iris
     12 # AnnData object with n_obs × n_vars = 150 × 4
     13 #     obs: 'Species'
---> 15 model = mod.CompositionalAnalysis(ad_iris, formula="C(Species)", reference_cell_type="setosa")

File ~/anaconda3/envs/soothsayer_p3.9_env/lib/python3.9/site-packages/sccoda/util/comp_ana.py:118, in CompositionalAnalysis.__new__(cls, data, formula, reference_cell_type, automatic_reference_absence_threshold)
    108     return dm.scCODAModel(
    109         covariate_matrix=np.array(covariate_matrix),
    110         data_matrix=data_matrix,
   (...)
    114         formula=formula,
    115     )
    117 # Numeric reference cell type
--> 118 elif isinstance(reference_cell_type, int) & (reference_cell_type < len(cell_types)) & (reference_cell_type >= 0):
    119     return dm.scCODAModel(
    120         covariate_matrix=np.array(covariate_matrix),
    121         data_matrix=data_matrix,
   (...)
    125         formula=formula,
    126     )
    128 # None of the above: Throw error
    129 else:

TypeError: '<' not supported between instances of 'str' and 'int'

Here's the zipped AnnData object:

iris.h5ad.zip

jolespin commented 1 year ago

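The reference cell type needs to be one of the feature columns (the `n_vars` axis of the AnnData object), not a class label such as "setosa" from `obs['Species']`.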
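
For anyone hitting the same traceback, here is a minimal sketch of why a string that is not a known cell type ends in a `TypeError` rather than a readable message. It is a simplified stand-in for the condition shown at line 118 of the traceback, not scCODA's actual code. The feature names in `cell_types` and the helper `resolve_reference` are hypothetical and only for illustration. The key point is that `&` evaluates both operands, so `"setosa" < 4` is still attempted even though the `isinstance` check is already `False`, whereas `and` short-circuits.

```python
def check_with_ampersand(reference, cell_types):
    # Same shape as the condition in the traceback: `&` does not short-circuit,
    # so the comparisons run even when `reference` is a string.
    return isinstance(reference, int) & (reference < len(cell_types)) & (reference >= 0)


def resolve_reference(reference, cell_types):
    """Hypothetical helper: return the index of the reference cell type."""
    # A string reference must name one of the feature columns.
    if isinstance(reference, str) and reference in cell_types:
        return cell_types.index(reference)
    # `and` short-circuits, so the range check only runs for integers.
    if isinstance(reference, int) and 0 <= reference < len(cell_types):
        return reference
    raise ValueError(
        f"reference {reference!r} is neither a feature name in {cell_types} "
        f"nor a valid column index"
    )


# Assumed feature names for the iris AnnData object (the real var names may differ).
cell_types = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

try:
    check_with_ampersand("setosa", cell_types)
    ampersand_error = None
except TypeError as err:
    ampersand_error = str(err)

print(ampersand_error)
print(resolve_reference("petal_width", cell_types))
```

With this kind of check, passing a class label like "setosa" would fail with a `ValueError` naming the valid feature columns, which points more directly at the cause.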