Comparison of normalisation methods prior to SCENIC analysis?

aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.

http://scenic.aertslab.org

GNU General Public License v3.0

Comparison of normalisation methods prior to SCENIC analysis? #289

Open lucygarner opened 3 years ago

lucygarner commented 3 years ago

Hi,

I was wondering whether you have compared how well pySCENIC identifies "true" gene modules downstream of different normalisation methods. In particular, I am interested in whether to use data normalised with basic Seurat normalisation (the NormalizeData() function) or with SCTransform normalisation (the SCTransform() function).

Is this something that you or anyone else has tested? The two give slightly different outputs on my data, and I don't know whether there is a good way to decide which is the "best" or "correct" approach.

Many thanks, Lucy

jpcartailler commented 3 years ago

No need to normalize; see https://github.com/aertslab/pySCENIC/issues/128#issuecomment-581324499
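A small sketch of why per-cell normalisation should not matter for the AUCell step specifically: AUCell scores regulons from the ranking of genes within each cell, and a per-cell library-size scaling followed by log1p (the kind of transformation done by sc.pp.normalize_total() plus sc.pp.log1p(), or by Seurat's default NormalizeData()) is monotonic within a cell, so it leaves that ranking unchanged. The toy matrix and the helper `rank_genes_per_cell` are made up for illustration and are not part of pySCENIC.

```python
import numpy as np

def rank_genes_per_cell(expr):
    """For each cell (row), return gene indices ordered from highest to lowest expression."""
    # Stable sort on the negated values so that ties are broken the same way for every input.
    return np.argsort(-expr, axis=1, kind="stable")

# Toy cells x genes count matrix (hypothetical data, not from any real experiment).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=2.0, size=(5, 50)).astype(float)

# Per-cell library-size scaling to counts per 10,000, followed by log1p.
lib_size = counts.sum(axis=1, keepdims=True)
log_norm = np.log1p(counts / lib_size * 1e4)

# The within-cell gene ranking is identical before and after the transformation.
same_ranking = np.array_equal(rank_genes_per_cell(counts), rank_genes_per_cell(log_norm))
print(same_ranking)
```

This only covers the ranking used for regulon scoring. It says nothing about the GRNBoost2 step, and a transformation that rescales each gene differently (as I understand SCTransform's per-gene residuals to do) is not monotonic within a cell, so it may change the ranks.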

lucygarner commented 3 years ago

Thank you. I have an activation dataset, so the RNA composition will differ significantly between the groups, and genes that don't change upon activation will likely be under-sampled. @s-aibar, do you expect this to affect the AUCell procedure? The rankings of particular genes, and hence of their regulons, may go down in activated cells even though the absolute expression of those genes does not change. The pySCENIC results might then suggest a reduction in regulon activity when in fact the activity of other TF regulons has simply increased dramatically. Is this possible, or am I missing something?
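To make the concern concrete, here is a toy sketch. It uses a simplified recovery-curve AUC written for this example (`toy_auc` is hypothetical and is not the AUCell implementation), a made-up 100-gene cell, and an arbitrary top-20 rank cutoff. Regulon A keeps exactly the same absolute expression in both cells, but in the "activated" cell 30 other genes are strongly induced and push it out of the top ranks.

```python
import numpy as np

def toy_auc(expr, gene_set, top_n):
    """Simplified rank-based enrichment: area under the recovery curve of
    gene_set within the top_n ranked genes, scaled by the maximum possible area."""
    order = np.argsort(-expr, kind="stable")[:top_n]   # top-ranked gene indices
    recovery = np.cumsum(np.isin(order, gene_set))     # gene-set hits recovered so far
    max_area = np.minimum(np.arange(1, top_n + 1), len(gene_set)).sum()
    return recovery.sum() / max_area

n_genes = 100
regulon_a = np.arange(0, 10)     # genes whose expression does not change
induced = np.arange(10, 40)      # genes strongly induced upon activation

# Resting cell: regulon A is the most highly expressed gene set.
resting = np.empty(n_genes)
resting[regulon_a] = 50.0
resting[10:] = np.linspace(1.0, 20.0, n_genes - 10)

# Activated cell: identical, except that the induced genes jump to 500.
activated = resting.copy()
activated[induced] = 500.0

auc_resting = toy_auc(resting, regulon_a, top_n=20)
auc_activated = toy_auc(activated, regulon_a, top_n=20)
print(auc_resting, auc_activated)  # 1.0 0.0
```

In this constructed case the score for regulon A drops from 1.0 to 0.0 although its genes are expressed at the same level in both cells, which is the scenario described above. How much this matters with real data and AUCell's actual rank threshold is a separate question that this sketch does not answer.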

lucygarner commented 3 years ago

Why was normalised data used for pySCENIC in this publication from the Aerts lab? https://www.nature.com/articles/s41556-020-0547-3

hyjforesight commented 2 years ago

I have the same question! The authors did not normalize (neither sc.pp.normalize_total() nor sc.pp.log1p()) before the GRNBoost2 step in the PBMC tutorial (https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/PBMC10k_SCENIC-protocol-CLI.ipynb), but they applied sc.pp.log1p() to the cancer dataset (https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/SCENIC%20Protocol%20-%20Case%20study%20-%20Cancer%20data%20sets.ipynb) before GRNBoost2. Which one is right?

MariaRosariaNucera commented 11 months ago

same question