nuno-agostinho / cTRAP

Identification of candidate causal perturbations from differential gene expression data
https://nuno-agostinho.github.io/cTRAP

Using Seurat or scRNA-seq data #30

Open levinhein opened 2 years ago

levinhein commented 2 years ago

Do you have a vignette or tutorial code showing how to use cTRAP with Seurat or other scRNA-seq data? Thank you!

nuno-agostinho commented 2 years ago

Hey @levinhein,

There is no tutorial at the moment on scRNA-seq data, but the idea is the same as presented in the current tutorial.

The input to most cTRAP functions should be a named numeric vector of signed statistics (e.g. t-statistics), named by gene, so that genes can be ranked from most up-regulated to most down-regulated. Unfortunately, Seurat's FindMarkers() function does not return a test statistic, only p-values and log fold-changes:

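library(Seurat)
# pbmc is assumed to be an already clustered Seurat object
# Compare cells in cluster 0 against all remaining cells
markers <- FindMarkers(pbmc, ident.1 = 0)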
head(markers, 8)
#               p_val avg_log2FC pct.1 pct.2     p_val_adj
# RPS12 8.960612e-150  0.7404146 1.000 0.991 1.228858e-145
# RPS6  7.182718e-148  0.6824288 1.000 0.995 9.850379e-144
# RPL32 7.187822e-144  0.6293113 0.999 0.995 9.857378e-140
# RPS27 2.957690e-141  0.7228229 0.999 0.992 4.056176e-137
# RPS14 5.006913e-136  0.6299777 1.000 0.994 6.866480e-132
# RPS25 6.042220e-133  0.7763155 0.997 0.975 8.286301e-129
# CYBA  6.199769e-132 -1.6218478 0.667 0.913 8.502363e-128
# CD74  1.127487e-123 -2.7628423 0.681 0.904 1.546235e-119

As a workaround, use -log10(adjusted p-value) multiplied by the sign of the fold-change instead:

diffExprStat <- sign(markers$avg_log2FC) * -log10(markers$p_val_adj)
names(diffExprStat) <- rownames(markers)
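
One caveat with the one-liner above: very significant genes can be reported with an adjusted p-value of exactly 0, in which case -log10() returns Inf. A minimal sketch of a more defensive version follows. `signedLogP` is a hypothetical helper (not part of cTRAP or Seurat), and the toy data frame only mimics the columns of the FindMarkers() output shown above. Older Seurat versions name the fold-change column avg_logFC, hence the `fcCol` argument.

```r
# Hypothetical helper (not part of cTRAP or Seurat): builds a named vector of
# signed -log10(adjusted p-values) from a FindMarkers()-style data frame
signedLogP <- function(markers, fcCol = "avg_log2FC", pCol = "p_val_adj") {
    stopifnot(all(c(fcCol, pCol) %in% colnames(markers)))
    # Replace p-values of exactly 0 (numerical underflow) with the smallest
    # positive normalised double so that -log10() stays finite
    pval <- pmax(markers[[pCol]], .Machine$double.xmin)
    stat <- sign(markers[[fcCol]]) * -log10(pval)
    names(stat) <- rownames(markers)
    # Drop genes whose statistic could not be computed
    stat[!is.na(stat)]
}

# Toy data mimicking the columns returned by FindMarkers()
toy <- data.frame(
    avg_log2FC = c(0.74, -1.62, -2.76, 0.10),
    p_val_adj  = c(1.2e-145, 8.5e-128, 0, 1),
    row.names  = c("RPS12", "CYBA", "CD74", "GENE4"))
diffExprStat <- signedLogP(toy)
print(diffExprStat)
```

With real data, `signedLogP(markers)` would replace the two lines above. Capping at `.Machine$double.xmin` is an arbitrary choice: it keeps such genes at the extreme of the ranking, but genes capped this way become tied with each other.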

Afterwards, you can use this vector as the input for cTRAP functions as described in the tutorial, e.g.:

compareKD <- rankSimilarPerturbations(diffExprStat, cmapPerturbationsKD)
predicted <- predictTargetingDrugs(diffExprStat, assoc)
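
If you would rather use actual t-statistics, as mentioned at the start, one option is to compute them per gene directly from the normalised expression matrix. The following is only a sketch in base R under stated assumptions: `welchT` is a hypothetical helper (not a cTRAP or Seurat function), `expr` is assumed to be a genes-by-cells matrix of log-normalised values, and `inGroup` is a logical vector marking the cells of the cluster of interest. Here both are simulated.

```r
# Hypothetical helper: per-gene Welch t-statistic comparing the cells flagged
# in inGroup against all remaining cells
welchT <- function(expr, inGroup) {
    stopifnot(is.matrix(expr), is.logical(inGroup),
              length(inGroup) == ncol(expr))
    a <- expr[,  inGroup, drop = FALSE]
    b <- expr[, !inGroup, drop = FALSE]
    # Per-gene sample variance across cells
    rowVar <- function(m) rowSums((m - rowMeans(m))^2) / (ncol(m) - 1)
    # Standard error of the difference in means (unequal variances)
    se <- sqrt(rowVar(a) / ncol(a) + rowVar(b) / ncol(b))
    # Genes with zero variance in both groups yield NaN or Inf and should be
    # filtered out before further use
    (rowMeans(a) - rowMeans(b)) / se
}

# Simulated data: 3 genes, 20 cells, geneA shifted upwards in the first 10 cells
set.seed(1)
expr <- matrix(rnorm(3 * 20), nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"), NULL))
inGroup <- rep(c(TRUE, FALSE), each = 10)
expr["geneA", inGroup] <- expr["geneA", inGroup] + 3

tStat <- welchT(expr, inGroup)
print(tStat)
```

The resulting named vector can be used in place of diffExprStat. Keep in mind that a plain t-test ignores scRNA-seq specifics such as dropout, so treat this as a ranking heuristic rather than a replacement for a proper differential expression analysis.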

Hope this was clear, but feel free to ask more questions if not.

Best, Nuno