snap-stanford / GEARS

GEARS is a geometric deep learning model that predicts outcomes of novel multi-gene perturbations
MIT License

Directly implementing gears on scRNA-seq data #53

Closed. XiaoMi93 closed this issue 2 weeks ago.

XiaoMi93 commented 6 months ago

Hi, thank you for publishing such an excellent paper! I'm new to Perturb-seq and only have single-cell transcriptome data. Can I use GEARS directly on scRNA-seq data to infer the perturbation, or must I provide my own Perturb-seq data? Thanks!

XiaoMi93 commented 6 months ago

Since I only have scRNA-seq data, I set the condition to ctrl for all cells in step 2 of the data_tutorial.ipynb file ((2) Create your own Perturb-Seq data). I got this error:

AttributeError                            Traceback (most recent call last)
Cell In[50], line 7
----> 7 pert_data.new_data_process(dataset_name = 'scRNA', adata = adata) # specific dataset name and adata object

File ~/anaconda3/lib/python3.8/site-packages/gears/pertdata.py:250, in PertData.new_data_process(self, dataset_name, adata, skip_calc_de)
    248 os.mkdir(save_data_folder)
    249 self.dataset_path = save_data_folder
--> 250 self.adata = get_DE_genes(adata, skip_calc_de)
    251 if not skip_calc_de:
    252 self.adata = get_dropout_non_zero_genes(self.adata)

File ~/anaconda3/envs/Tres/lib/python3.8/site-packages/gears/data_utils.py:64, in get_DE_genes(adata, skip_calc_de)
     62 adata.obs = adata.obs.astype('category')
     63 if not skip_calc_de:
---> 64 rank_genes_groups_by_cov(adata,
     65 groupby='condition_name',
     66 covariate='cell_type',
     67 control_group='ctrl_1',
     68 n_genes=len(adata.var),
     69 key_added = 'rank_genes_groups_cov_all')
...
    631 'pvals_adj': 'float64',
    632 }
    634 for col in test_obj.stats.columns.levels[0]:

AttributeError: 'NoneType' object has no attribute 'columns'

I think this is because I only provided one condition. Does this mean that I cannot directly use GEARS on non-Perturb-seq data?

yhr91 commented 2 weeks ago

Sorry for the delayed response!

It's hard to follow exactly which dataset you used for training, but if I understand correctly, you tried to train GEARS on non-Perturb-seq data. This will not work, because GEARS needs perturbational data, ideally from several different genetic perturbations.
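A minimal sketch of a pre-flight check that would catch the all-control case before GEARS processes the data. It is not part of GEARS: the function names are hypothetical, and it assumes the labeling convention described in the data tutorial, where control cells are labeled `ctrl` and perturbed cells carry labels such as `GENE+ctrl`.

```python
from collections import Counter


def summarize_conditions(conditions, control_label="ctrl"):
    """Return (number of control cells, {perturbed condition: cell count}).

    `control_label` is an assumption based on the tutorial's convention.
    """
    counts = Counter(conditions)
    perturbed = {c: n for c, n in counts.items() if c != control_label}
    return counts.get(control_label, 0), perturbed


def check_perturbation_labels(conditions, control_label="ctrl"):
    """Hypothetical helper: list reasons the labels are unusable for training."""
    n_ctrl, perturbed = summarize_conditions(conditions, control_label)
    problems = []
    if n_ctrl == 0:
        problems.append("no control cells found")
    if not perturbed:
        problems.append("no perturbed conditions found")
    return problems


# Plain scRNA-seq data where every cell was labeled as control.
print(check_perturbation_labels(["ctrl"] * 5))

# Data containing control cells plus single and double perturbations.
print(check_perturbation_labels(["ctrl", "FEV+ctrl", "CBL+CNN1", "ctrl"]))
```

With an AnnData object, the labels could be passed as `adata.obs["condition"].tolist()`, assuming the column is named `condition` as in the tutorial. An empty result means only that the labels have the expected shape, not that the dataset is sufficient for training.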

If I misunderstood your question, please feel free to re-open this issue.

ManuelMoradiellos commented 2 weeks ago

I too have the same question as @XiaoMi93!

I was wondering whether one can take one of the pre-trained models (trained on Perturb-seq data) and apply it to scRNA-seq data to infer the possible transcriptional response to a list of known or given perturbations. The goal would not be to train the model per se, but to apply an already trained model to a different data source, even though the cell types and tissue differ, just to explore possible similarities.

Thanks in advance!