aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0
420 stars 179 forks source link

compare result from pyscenic and scenic #207

Open RuiqinZheng opened 4 years ago

RuiqinZheng commented 4 years ago

Thanks for your software!

I found that the final result of pyscenic is not the same as the result of scenic. In the result of scenci, the activity of each transcription factor is either 1 or 0; in the result of pyscenic, the activity of each transcription factor is between 0 and 1,I check the official website of pyscenic , and I didn’t see how to use the threshold to make the result form consistent with scenic. So do you have some suggest?

waiting for your reply.

Ruiqin Zheng

cflerin commented 4 years ago

I'm not totally clear on what you mean, can you show the outputs of some regulons to show what is different?

RuiqinZheng commented 4 years ago

Thanks for your reply! here is my result: from pyscenic(python):

the activity for TF is between 0 and 1

from scenic(R): the activity for TF is 0 or 1 .

the question is how can I set some cutoff to make the result from pyscenic change to 0 or 1?

what I know is the origin result in scenic is also between 0 and 1, and throught some regulation, the activity smaller than threashold will change to 0 ,and the activity bigger than threashold  will set to 1, do you know the method to get the threashold or do you have some script to change the acvivity to 0 or 1?

RuiqinZheng ------------------ 原始邮件 ------------------ 发件人: "aertslab/pySCENIC" <notifications@github.com>; 发送时间: 2020年9月2日(星期三) 晚上8:54 收件人: "aertslab/pySCENIC"<pySCENIC@noreply.github.com>; 抄送: "rqzheng"<501309703@qq.com>;"Author"<author@noreply.github.com>; 主题: Re: [aertslab/pySCENIC] compare result from pyscenic and scenic (#207)

I'm not totally clear on what you mean, can you show the outputs of some regulons to show what is different?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

RuiqinZheng commented 4 years ago

regulonAUC_from_pyscenic AUC for 382 regulons (rows) and 5000 cells (columns).

Top-left corner of the AUC matrix: cells regulons Patient3_PB_3W_UCBT_AAAGATGAGTAGTGCG-1 ALX3(12g) 0.00000000 ARID5B(64g) 0.02108838 ARNT(9g) 0.04152348 ASAP3(12g) 0.00000000 ASCL1(15g) 0.00000000 ASCL2(67g) 0.04375032 cells regulons Patient3_PB_3W_UCBT_AAAGTAGCACCACGTG-1 ALX3(12g) 0.00000000 ARID5B(64g) 0.01220200 ARNT(9g) 0.04686903 ASAP3(12g) 0.01260023 ASCL1(15g) 0.00000000 ASCL2(67g) 0.01024517

Regulon_from_R

                   AGGGATGAGGGTTTCT-1-D20171109_A

ARNT2_extended (18g) 0 ARNTL (11g) 0 ATF2 (88g) 0 ATF2_extended (237g) 0

                   BC02_LYMPHNODE3_130607234345371

ARNT2_extended (18g) 0 ARNTL (11g) 0 ATF2 (88g) 1 ATF2_extended (237g) 0

cnrooter commented 4 years ago

The results from R pipeline were binarized, so the values of regulon activity matrix were 0 or 1. The regulon activity results, which were not binarized, stored in the 3.4_regulonAUC.Rds object in the scenic intermediate directory(int).

saeedfc commented 4 years ago

Hi,

indeed, you have to binarize your auc matrix for automatic thresholding. You may try

from pyscenic.binarization import binarize
binarized = binarize(auc_mtx, num_workers = 22)
binarized.to_csv("binarized_mtx.csv")
RuiqinZheng commented 4 years ago

Thanks for your reply! According to COMMAND LINE result, I found another way to binarize auc matrix in R: Command Line:

pyscenic pyscenic aucell \
       myeloid10k_filtered_scenic.loom  \
        regulons.csv \
        -o pyscenic_output.csv \
        --num_workers 6\
       --seed 1

udocker run -u $USER -w $PWD -v /public/home:/public/home -v $PWD:$PWD pyscenic pyscenic aucell \
       myeloid10k_filtered_scenic.loom  \
        regulons.csv \
        -o pyscenic_output.loom\
        --num_workers 6\
       --seed 1

R:

regulonsAuc<-read.csv("pyscenic_output_genecount.csv",sep=",", header=T,check.names=F)
regulonsAuc_2<-regulonsAuc[,-1]
rownames(regulonsAuc_2)<-regulonsAuc[,1]
regulonsAuc_t<-t(as.matrix(regulonsAuc_2))

library(SCENIC)
library(SCopeLoomR)
pyScenicLoomFile <- file.path(pyScenicDir, "pyscenic_output.loom")
loom <- open_loom(pyScenicLoomFile, mode="r")
regulonsAucThresholds <- get_regulonThresholds(loom)
regulons_binaryAUC<-ifelse(regulonsAuc_t>regulonsAucThresholds ,1,0)
saveRDS(regulons_binaryAUC, "regulons_binaryAUC.rds")

Does it correct?