aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

How to approach GRN inference with Case and Control samples where some TFs might be deleted (in Case) #160

Closed by fbrundu 4 years ago

fbrundu commented 4 years ago

I have a scRNA-Seq dataset of 33k cells, of which roughly half come from Case samples (samples with a specific disease genotype) and the other half from healthy Controls. In the model we are working with, the Case genotype carries deletions that may remove some TFs, i.e. the genomic regions containing those TFs are lost in Case. Should regulon inference be run only on Controls, with AUC scoring then applied to all cells? Or should both regulon inference and scoring always be run on the full set of cells?

Thanks!

cflerin commented 4 years ago

Hi @fbrundu ,

It should be fine to run all of your cells together. Even if a TF is missing in some cells, it should still be detected in the analysis (if it passes the pruning thresholds, of course). This is how pySCENIC normally works anyway -- it will pick up regulons that are present in each of potentially many cell types within a heterogeneous dataset.
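
To make the point above concrete, here is a minimal toy sketch in plain NumPy. It does not use pySCENIC and is not its AUCell implementation. It assumes simulated data, a hypothetical 10-gene regulon, and a made-up helper `toy_regulon_score` that reports the fraction of a regulon's targets found among each cell's top-ranked genes. All cells are scored together, and cells where the TF is absent (Case) simply get a low score for that regulon.

```python
import numpy as np

# Simulated data only: 200 cells x 100 genes, nothing here comes from a real dataset.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 100
target_idx = np.arange(10)               # hypothetical targets of one TF
is_control = np.arange(n_cells) < 100    # first 100 cells are Control, the rest Case

# Background expression for every gene in every cell.
expr = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)
# Assumption: in Control cells the TF is present, so its targets are induced.
# In Case cells the TF is deleted, so its targets stay at background level.
expr[np.ix_(is_control, target_idx)] += 10.0


def toy_regulon_score(expr, target_idx, top_frac=0.1):
    """Fraction of regulon targets among each cell's top-ranked genes.

    A crude rank-based stand-in for per-cell regulon scoring
    (hypothetical helper, not part of pySCENIC).
    """
    n_genes = expr.shape[1]
    k = max(1, int(round(top_frac * n_genes)))
    # Indices of the k most highly expressed genes in each cell.
    top = np.argsort(-expr, axis=1)[:, :k]
    is_target = np.zeros(n_genes, dtype=bool)
    is_target[target_idx] = True
    # Count how many of the top-k genes are targets, per cell.
    return is_target[top].sum(axis=1) / len(target_idx)


# Score Case and Control cells together, then compare the groups.
scores = toy_regulon_score(expr, target_idx)
print("mean score, Control:", scores[is_control].mean())
print("mean score, Case:   ", scores[~is_control].mean())
```

In this toy setup the regulon remains identifiable because the Control half of the dataset still carries the signal, and the per-cell scores separate the two groups. Whether a real regulon survives depends on the pruning thresholds mentioned above.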

fbrundu commented 4 years ago

Ok thanks @cflerin !