aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

interpretation of heatmap of sns.clustermap(auc_mtx) #56

Closed yuliusema4 closed 5 years ago

yuliusema4 commented 5 years ago

Hi,

Could you explain how to interpret the heatmap? Does the dark part (close to 0) mean the regulon is not active? If so, the vast majority of cells have inactive regulons. Also, how do I use binarize(auc_mtx)? Could you give some example code for binarizing and using the results for a heatmap? Some papers say SCENIC is sensitive to dropout; could you say how sensitive it is?

Thanks for such great tools and the quick response, anyway!

bramvds commented 5 years ago

Hi,

  1. AUCell calculates an "enrichment score" for the predicted target genes of each regulon. Because of this, when interpreting the scores it is important to focus on how they are distributed across cells for a given regulon, and less on comparing activity between regulons. The distribution puts the range of scores for that regulon into context, and its shape is also informative: a bimodal distribution clearly indicates two populations of cells (one in which the regulon is ON and another in which it is OFF). However, the distribution is not always bimodal, which may reflect a different biological reality in your experiment.
  2. To binarize the AUC matrix:

         from pyscenic.binarization import binarize

         # bin_mtx holds 0/1 calls per cell and regulon; thresholds holds one AUC cutoff per regulon
         bin_mtx, thresholds = binarize(auc_mtx)

  3. Regarding sensitivity to dropouts, could you give me the citations for these papers (for my own interest)? In general, I believe AUCell is fairly insensitive to dropouts: it calculates the enrichment of a TF's whole targetome in each cell, so the enrichment score varies only slightly when a target gene is missing from an individual cell's expression profile because of a dropout event.
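
As a rough illustration of the points above, here is a minimal sketch on toy data (not from a real pySCENIC run). The regulon names, the simulated `auc_mtx`, the hand-picked `toy_thresholds` and the helper `plot_heatmaps` are all assumptions made for the example; in a real analysis `bin_mtx` and `thresholds` would come from `binarize(auc_mtx)`. The sketch scales each regulon to z-scores so that a heatmap shows contrast within a regulon, and prepares a 0/1 matrix for a second heatmap.

```python
import numpy as np
import pandas as pd

# Toy stand-in for auc_mtx (rows = cells, columns = regulons).
# Real values would come from AUCell; these are simulated for illustration.
rng = np.random.default_rng(0)
auc_mtx = pd.DataFrame(
    {
        # Bimodal regulon: OFF in 60 cells, ON in 40 cells.
        "TF_A(+)": np.concatenate(
            [rng.normal(0.02, 0.005, 60), rng.normal(0.12, 0.005, 40)]
        ),
        # Unimodal regulon: no clear ON/OFF split.
        "TF_B(+)": rng.normal(0.05, 0.01, 100),
    },
    index=[f"cell_{i}" for i in range(100)],
)

# Scale each regulon (column) separately, so colours reflect where a cell
# sits within that regulon's own distribution.
z_mtx = (auc_mtx - auc_mtx.mean()) / auc_mtx.std()

# Hand-picked placeholder thresholds (hypothetical). With pySCENIC you would
# instead use: bin_mtx, thresholds = binarize(auc_mtx)
toy_thresholds = pd.Series({"TF_A(+)": 0.07, "TF_B(+)": 0.05})
bin_mtx = auc_mtx.gt(toy_thresholds, axis="columns").astype(int)

# Fraction of cells in which each regulon is called ON.
frac_on = bin_mtx.mean()


def plot_heatmaps(z_mtx, bin_mtx):
    """Draw one clustered heatmap of z-scores and one of ON/OFF calls."""
    import seaborn as sns  # imported here so the rest runs without seaborn

    sns.clustermap(z_mtx, cmap="vlag", center=0, figsize=(6, 6))
    sns.clustermap(bin_mtx, cmap="Greys", figsize=(6, 6))
```

Calling `plot_heatmaps(z_mtx, bin_mtx)` draws both figures. In the toy data, `TF_A(+)` is called ON in 40% of cells and the two groups separate cleanly, whereas the unimodal `TF_B(+)` has no natural cutoff, so its ON/OFF calls depend entirely on where the threshold is placed.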

Kindest regards, Bram