scverse / scvi-tools

Deep probabilistic analysis of single-cell and spatial omics data
http://scvi-tools.org/
BSD 3-Clause "New" or "Revised" License

DE on data subset #823

Closed chenlingantelope closed 4 years ago

chenlingantelope commented 4 years ago

When running DE on a subset of a dataset with many batches, it seems I need to re-index the batches of each subset so that they are consecutive integers starting from 0. Perhaps this could be done automatically within the DE function.

de_celltype = {}
for tissue in np.unique(adata.obs["tissue"]):
    sub_adata = adata[adata.obs["tissue"] == tissue]
    de_celltype[tissue] = vae.differential_expression(
        sub_adata, groupby="Propagated.Annotation", batch_correction=True
    )

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-143-b1df721788f5> in <module>
      2 for tissue in np.unique(adata.obs['tissue']):
      3     sub_adata = adata[adata.obs['tissue']==tissue]
----> 4     de_celltype[tissue] = vae.differential_expression(sub_adata, groupby = 'Propagated.Annotation', batch_correction=True)

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/core/models/rnamixin.py in differential_expression(self, adata, groupby, group1, group2, idx1, idx2, mode, delta, batch_size, all_stats, batch_correction, batchid1, batchid2, **kwargs)
    187             batch_size=batch_size,
    188         )
--> 189         result = _de_core(
    190             adata,
    191             model_fn,

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/core/models/_utils.py in _de_core(adata, model_fn, groupby, group1, group2, idx1, idx2, all_stats, all_stats_fn, col_names, mode, batchid1, batchid2, delta, batch_correction, **kwargs)
     59             cell_idx2 = (adata.obs[groupby] == group2).to_numpy().ravel()
     60 
---> 61         all_info = dc.get_bayes_factors(
     62             cell_idx1,
     63             cell_idx2,

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/core/utils/differential.py in get_bayes_factors(self, idx1, idx2, mode, batchid1, batchid2, use_observed_batches, n_samples, use_permutation, m_permutation, change_fn, m1_domain_fn, delta, cred_interval_lvls)
    168         eps = 1e-8  # used for numerical stability
    169         # Normalized means sampling for both populations
--> 170         scales_batches_1 = self.scale_sampler(
    171             selection=idx1,
    172             batchid=batchid1,

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_no_grad(*args, **kwargs)
     47         def decorate_no_grad(*args, **kwargs):
     48             with self:
---> 49                 return func(*args, **kwargs)
     50         return decorate_no_grad
     51 

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/core/utils/differential.py in scale_sampler(self, selection, n_samples, n_samples_per_cell, batchid, use_observed_batches, give_mean)
    392             idx = np.random.choice(np.arange(self.adata.shape[0])[selection], n_samples)
    393             px_scales.append(
--> 394                 self.model_fn(self.adata, indices=idx, transform_batch=batch_idx)
    395             )
    396             batch_idx = batch_idx if batch_idx is not None else np.nan

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_no_grad(*args, **kwargs)
     47         def decorate_no_grad(*args, **kwargs):
     48             with self:
---> 49                 return func(*args, **kwargs)
     50         return decorate_no_grad
     51 

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/core/models/rnamixin.py in get_normalized_expression(self, adata, indices, transform_batch, gene_list, library_size, n_samples, batch_size, return_mean, return_numpy)
     82         scdl = self._make_scvi_dl(adata=adata, indices=indices, batch_size=batch_size)
     83         if transform_batch is not None:
---> 84             transform_batch = _get_batch_code_from_category(adata, transform_batch)
     85 
     86         if gene_list is None:

/data/yosef2/users/chenling/miniconda3/envs/scvi-tools/lib/python3.8/site-packages/scvi/model/_utils.py in _get_batch_code_from_category(adata, category)
    131     batch_mappings = categorical_mappings["_scvi_batch"]["mapping"]
    132     if category not in batch_mappings:
--> 133         raise ValueError('"{}" not a valid batch category.'.format(category))
    134     return np.where(batch_mappings == category)[0][0]

ValueError: "0" not a valid batch category.
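
The re-indexing workaround described in the report can be sketched with pandas alone. This is only an illustration under stated assumptions, not how scvi-tools handles batches internally: the `batch` column, the `batch_code` column, and the toy values are all made up, and whether re-coded labels are what the model expects depends on the scvi-tools version in use.

```python
import pandas as pd

# Toy stand-in for adata.obs; column names and values are assumptions.
obs = pd.DataFrame(
    {
        "tissue": ["lung", "lung", "liver", "liver", "lung"],
        "batch": pd.Categorical(["b3", "b7", "b1", "b3", "b7"]),
    }
)

# Subset to one tissue, as in the loop from the report.
sub_obs = obs[obs["tissue"] == "lung"].copy()

# Drop batch categories that no longer occur in the subset, then take the
# integer codes, which run consecutively from 0 to n_batches - 1.
sub_obs["batch"] = sub_obs["batch"].cat.remove_unused_categories()
sub_obs["batch_code"] = sub_obs["batch"].cat.codes

print(sub_obs["batch"].cat.categories.tolist())  # ['b3', 'b7']
print(sub_obs["batch_code"].tolist())            # [0, 1, 1]
```

Without `remove_unused_categories()`, the subset would keep the full dataset's categories, so its codes could skip values (here `b3` and `b7` would be coded 1 and 2 instead of 0 and 1).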

galenxing commented 4 years ago

Hi @chenlingantelope!

I think my open PR #817 fixes this.

Can you pip install that branch and let me know?

pip install git+https://github.com/yoseflab/scvi-tools.git@transformbatch_fix

Thanks!

dKlee99 commented 6 months ago

Hi, I ran into the same error when trying to extract batch-corrected, scVI-normalized expression using the transform_batch option. Here is the command I ran:

adata_scvi.layers["scvi_normalized"] = model_scvi.get_normalized_expression(adata_scvi, library_size=10e3, batch_key="library", transform_batch=['my_batch'])

I installed scvi-tools using the following commands:

git clone https://github.com/scverse/scvi-tools
pip install -e .

The installed scvi-tools version is 1.1.3. Any ideas on how to solve this issue? The error message is below.


ValueError                                Traceback (most recent call last)
Cell In[62], line 1
----> 1 adata_scvi.layers["scvi_normalized"] = model_scvi.get_normalized_expression(adata_scvi,library_size = 10e3, batch_key="library", transform_batch=['CHR0211'])

File ~/anaconda3/envs/scvi/lib/python3.9/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File /data/_90.User_Data/dlekrud456/4.Programs/scvi-tools/src/scvi/model/base/_rnamixin.py:223, in RNASeqMixin.get_normalized_expression(self, adata, indices, transform_batch, gene_list, library_size, n_samples, n_samples_overall, weights, batch_size, return_mean, return_numpy, **importance_weighting_kwargs)
    220 n_samples = n_samples_overall // len(indices) + 1
    221 scdl = self._make_data_loader(adata=adata, indices=indices, batch_size=batch_size)
--> 223 transform_batch = _get_batch_code_from_category(
    224     self.get_anndata_manager(adata, required=True), transform_batch
    225 )
    227 gene_mask = slice(None) if gene_list is None else adata.var_names.isin(gene_list)
    229 if n_samples > 1 and return_mean is False:

File /data/_90.User_Data/dlekrud456/4.Programs/scvi-tools/src/scvi/model/_utils.py:311, in _get_batch_code_from_category(adata_manager, category)
    309     batch_code.append(None)
    310 elif cat not in batch_mappings:
--> 311     raise ValueError(f'"{cat}" not a valid batch category.')
    312 else:
    313     batch_loc = np.where(batch_mappings == cat)[0][0]

ValueError: "CHR0211" not a valid batch category.

canergen commented 6 months ago

Hi, the error indicates that the value passed in transform_batch=['CHR0211'] is not an existing batch category. Note that the function has changed since the original error message above was posted. If you think the category should exist, please provide the output of adata.obs[batch_key].value_counts().
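
The check suggested here can be sketched with pandas only. The helper `missing_batch_categories` is hypothetical (it is not part of scvi-tools), and the toy labels are made up. The traceback above shows that the library compares against the batch mappings registered with the model, so a sketch like this only covers the data side of the comparison.

```python
import pandas as pd


def missing_batch_categories(obs, batch_key, requested):
    """Return the requested batch labels that do not occur in obs[batch_key].

    Hypothetical helper for illustration; not part of scvi-tools.
    """
    present = set(obs[batch_key].unique())
    return [cat for cat in requested if cat not in present]


# Toy stand-in for adata.obs; the labels are made up.
obs = pd.DataFrame({"library": ["CHR0210", "CHR0212", "CHR0210"]})

# Counts per batch label, as requested in the comment above.
print(obs["library"].value_counts())

# Labels that were requested but are absent from the column.
missing = missing_batch_categories(obs, "library", ["CHR0211", "CHR0210"])
print(missing)  # ['CHR0211']
```

The comparison is exact, so a label stored as an integer will not match the same value passed as a string.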

dKlee99 commented 6 months ago

Hi, thanks for the reply. I used library as the batch_key, and here is the result of adata.obs[batch_key].value_counts().

[Image: screenshot of the value_counts output]

canergen commented 6 months ago

Can you check the version of scvi-tools installed?
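
One way to answer this is to read the installed distribution metadata with the standard library; the issue template earlier in the thread also points to `scvi.__version__`. The helper name `installed_version` is an assumption made for this sketch, which returns `None` when the distribution is not installed instead of raising.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "scvi-tools" is the distribution name used with pip in this thread.
print(installed_version("scvi-tools"))
```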