Hi @gilstel,
Which version of decoupleR are you running? You may be running a very old one. Try installing the latest version of decoupleR:
install.packages('remotes')
remotes::install_github('saezlab/decoupleR')
Alternatively, you can switch to the Python version of decoupler, since it is more scalable than the R one. Let me know how it goes!
Hi @PauBadiaM
We are using decoupleR version 2.6.0 (I looked on the GitHub page to see whether this was the latest but couldn't find it easily).
I like having decoupleR in R so that I can use the data from my Seurat object.
I assume that if I use the Python version, I would first need to export the (SCT) assay data from R to a file and then import it into the Python environment in order to use it there.
Regardless, since I have mouse data, I used the following command:
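For readers considering that route, here is a minimal sketch of the export step. It assumes the assay data is a sparse genes-by-cells matrix (a toy matrix stands in for it here); the output folder and file names are hypothetical, not something decoupleR or decoupler prescribes.

```r
library(Matrix)

# Toy sparse genes-by-cells matrix standing in for the SCT assay data.
toy <- Matrix::sparseMatrix(
  i = c(1, 2, 3, 1), j = c(1, 1, 2, 3), x = c(1.5, 2, 0.5, 3),
  dims = c(3, 3),
  dimnames = list(c("Myc", "Spi1", "Jun"), c("cell1", "cell2", "cell3"))
)

out_dir <- file.path(tempdir(), "sct_export")  # hypothetical output folder
dir.create(out_dir, showWarnings = FALSE)

# Matrix Market keeps the matrix sparse; names go into separate text files.
Matrix::writeMM(toy, file.path(out_dir, "matrix.mtx"))
writeLines(rownames(toy), file.path(out_dir, "genes.txt"))
writeLines(colnames(toy), file.path(out_dir, "cells.txt"))
```

On the Python side, a Matrix Market file can be read with scipy.io.mmread. Note that the matrix written here is genes by cells, so it would likely need transposing before being used as cells by genes.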
> net = get_collectri(organism="mouse", split_complexes=FALSE)
[2024-05-22 14:19:34] [SUCCESS] [OmnipathR] Downloaded 64495 interactions.
> net
# A tibble: 42,595 × 3
source target mor
<chr> <chr> <dbl>
1 MYC TERT 1
2 SPI1 BGLAP 1
3 SMAD3 JUN 1
4 SMAD4 JUN 1
5 STAT5A IL2 1
6 STAT5B IL2 1
7 RELA FAS 1
8 WT1 NR0B1 1
9 NR0B2 CASP1 1
10 SP1 ALDOA 1
Later on, when I used run_ulm (which took several minutes), it found only one source:
> mat.sct.assay = as.matrix(named.clust.obj.minus.clust.32.33@assays$SCT@data)
>dim(mat.sct.assay)
> # Run ulm
> acts.sct.assay.minsize.5 = run_ulm(mat=mat.sct.assay, network = net, .source='source', .target='target', .mor='mor', minsize = 5)
> unique(acts.sct.assay.minsize.5$source)
[1] "HNF4A"
> acts.sct.assay.minsize.5
# A tibble: 30,459 × 5
statistic source condition score p_value
<chr> <chr> <chr> <dbl> <dbl>
1 ulm HNF4A channel.1_AAACCCAAGAGGGTCT-1 1.08 0.280
2 ulm HNF4A channel.1_AAACCCAGTCTTCATT-1 0.890 0.374
3 ulm HNF4A channel.1_AAACCCATCTGGGTCG-1 0.0927 0.926
4 ulm HNF4A channel.1_AAACGAAAGAAACTCA-1 0.575 0.566
5 ulm HNF4A channel.1_AAACGAACACGAGGAT-1 0.0393 0.969
6 ulm HNF4A channel.1_AAACGAAGTTCAGCTA-1 0.624 0.532
7 ulm HNF4A channel.1_AAACGAATCACTGGGC-1 0.927 0.354
8 ulm HNF4A channel.1_AAACGCTAGATGTTAG-1 1.84 0.0653
9 ulm HNF4A channel.1_AAACGCTCAAGGTACG-1 1.08 0.280
10 ulm HNF4A channel.1_AAACGCTTCGTCAGAT-1 -0.342 0.733
However, when I first translated gene symbols in net from UPPERCASE to Sentence Case
library(snakecase)
net$source = to_sentence_case(string = net$source, sep_out = "")
net$target = to_sentence_case(string = net$target, sep_out = "")
followed by run_ulm (which took several hours to run)
> length(unique(curr.condition.acts.sct.assay.minsize.5$source))
[1] 690
> head(unique(curr.condition.acts.sct.assay.minsize.5$source))
[1] "Myc" "Spi1" "Smad3" "Smad4" "Stat5a" "Stat5b"
Maybe the network file should be updated with gene symbols in mouse format?
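A quick way to confirm this kind of mismatch is to measure how many of the network's target genes appear among the matrix row names, before and after changing the case. The sketch below uses base R and toy data; `to_mouse_case` and `target_overlap` are hypothetical helper names, and the case conversion is only a heuristic, not a true human-to-mouse orthology mapping.

```r
# Heuristic only: capitalise the first letter and lower-case the rest.
to_mouse_case <- function(x) {
  paste0(toupper(substr(x, 1, 1)), tolower(substr(x, 2, nchar(x))))
}

# Share of the network's unique target genes found among the matrix row names.
target_overlap <- function(targets, genes) {
  mean(unique(targets) %in% genes)
}

# Toy network and toy mouse-style row names (assumptions for illustration).
toy_net <- data.frame(source = c("MYC", "SPI1", "STAT5A"),
                      target = c("TERT", "BGLAP", "IL2"))
toy_genes <- c("Tert", "Bglap", "Il2", "Actb")

target_overlap(toy_net$target, toy_genes)                 # before conversion
target_overlap(to_mouse_case(toy_net$target), toy_genes)  # after conversion
```

With real data, `toy_genes` would be replaced by `rownames(mat.sct.assay)` and `toy_net` by `net`. An overlap close to zero would explain why so few sources pass the minsize filter.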
Hi @gilstel,
Did you install from GitHub? The current R version of decoupleR is 2.9.7, and the python version is 2.6.0. Double-check that you installed the latest R version; you can do this by running packageVersion("decoupleR").
It looks like the mouse conversion did not work, which is an old bug in decoupleR. You should update the package and try again (remember to restart your R session). Let me know how it goes.
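As a small sketch of that check, the helper below reports whether a package is installed at a given version or newer. `has_min_version` is a hypothetical name, and 2.9.7 is simply the version quoted in this thread.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# FALSE means decoupleR is either missing or older than the quoted version.
has_min_version("decoupleR", "2.9.7")
```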
I used run_ulm on the data (I tried the RNA and SCT assay data separately) from a Seurat object, and it took several hours to run in RStudio (we have a very powerful dedicated RStudio server with 52 cores and 512 GB of memory). I left it to run overnight, and when I checked the next morning the prompt had not come back, so I aborted the command using Escape.
It would be very helpful to have some sort of progress bar indicating how the function is progressing. It would also be very helpful to be able to set the number of cores the function uses, as in run_dorothea, or perhaps you could use the future package (which Seurat uses to run various functions faster) to speed it up.
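Until something like this is built in, one possible workaround is to score the cells in blocks and show a progress bar between blocks. The sketch below is a generic base R wrapper, not part of decoupleR; `run_in_chunks` and `toy_score` are hypothetical names, and it assumes the scoring function treats each cell (column) independently, so that splitting by columns does not change the results.

```r
# Hypothetical helper: apply `fun` to blocks of columns (cells) and stack the
# resulting data frames, updating a text progress bar after each block.
run_in_chunks <- function(mat, fun, chunk_size = 1000) {
  n <- ncol(mat)
  starts <- seq(1, n, by = chunk_size)
  pb <- utils::txtProgressBar(min = 0, max = length(starts), style = 3)
  on.exit(close(pb))
  out <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    cols <- starts[i]:min(starts[i] + chunk_size - 1, n)
    out[[i]] <- fun(mat[, cols, drop = FALSE])
    utils::setTxtProgressBar(pb, i)
  }
  do.call(rbind, out)
}

# Toy stand-in for the scoring function: one row per cell with its column mean.
toy_score <- function(m) {
  data.frame(condition = colnames(m), score = colMeans(m))
}

set.seed(1)
toy_mat <- matrix(rnorm(50), nrow = 5,
                  dimnames = list(paste0("Gene", 1:5), paste0("cell", 1:10)))
res <- run_in_chunks(toy_mat, toy_score, chunk_size = 4)
```

With decoupleR, `fun` could be something like `function(m) run_ulm(mat = m, network = net, .source = 'source', .target = 'target', .mor = 'mor', minsize = 5)`, mirroring the call used earlier in this thread. Chunking also means only one block at a time has to pass through the function, which may help with memory on large datasets.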
Many thanks