Closed Brawni closed 3 years ago
Hi @Brawni
i see you have commented out line with SerialParam()
, does the problem happens with bpparam
set to SerialParam
as well?
with SerialParam() I get a segfault error, which seemed to be not an improvement, as follows:
*** caught segfault ***
address 0x250, cause 'memory not mapped'
Traceback:
1: fgseaMultilevelCpp(x[, ES], stats, unique(x[, size]), sampleSize, seed, eps, sign)
2: FUN(...)
3: doTryCatch(return(expr), name, parentenv, handler)
4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
5: tryCatchList(expr, classes, parentenv, handlers)
6: tryCatch({ FUN(...)}, error = handle_error)
7: withCallingHandlers({ tryCatch({ FUN(...) }, error = handle_error)}, warning = handle_wa
rning)
8: FUN(...)
9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(FUN(...), error = identity)
13: FUN(X[[i]], ...)
14: lapply(X, FUN_, ...)
15: bplapply(multilevelPathwaysList, function(x) { eps <- eps * min(x$denomProb) return(fgseaMultil
evelCpp(x[, ES], stats, unique(x[, size]), sampleSize, seed, eps, sign))}, BPPARAM = BPPARAM)
16: bplapply(multilevelPathwaysList, function(x) { eps <- eps * min(x$denomProb) return(fgseaMultil
evelCpp(x[, ES], stats, unique(x[, size]), sampleSize, seed, eps, sign))}, BPPARAM = BPPARAM)
17: multilevelImpl(multilevelPathwaysList, stats, sampleSize, seed, eps, sign = sign, BPPARAM = BPPAR
AM)
18: fgseaMultilevel(pathways, fgsea_ranks, minSize = 15, maxSize = 1500, BPPARAM = SerialParam(), sco
reType = "std")
19: eval(ei, envir)
20: eval(ei, envir)
21: withVisible(eval(ei, envir))
22: source("/ahg/regevdata/projects/ICA_Lung/Bruno/scripts/scrna_pipeline/DEG_standard.R")
Oh, that's really interesting. Can you attach the pathways you are using?
Also, in the list you've attached there are just 615 genes. Are these all the genes you consider?
Actually, please upgrade fgsea to the latest version (using devtools::install_github('ctlab/fgsea')
). There was a bug we fixed that could cause a similar problem.
converting the pathways file to txt is too big (>1GB) and I cant upload .rds on here. Do you know other means for me to upload it? Yes these genes are coming from a differential expression which filter out those not expressed in both groups in terms of percentage of cells expressing it (scRNA) and those with logFC lower than 0.25. Should I have a more inclusive list to start with? I will update to the last version and report back!
Thanks!!
Hi! After updating to the last version I get a different error:
Error in `rownames<-`(`*tmp*`, value = sig_terms) :
attempt to set 'rownames' on an object with no dimensions
In addition: Warning messages:
1: In preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, :
All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
2: In fgseaMultilevel(pathways, fgsea_ranks, minSize = 15, maxSize = 1500, :
There were 3 pathways for which P-values were not calculated properly due to unbalanced (positive and negative) gene-level statistic values. For such pathways pval, padj, NES, log2err are set to NA. You can try to increase the value of the argument nPermSimple (for example set it nPermSimple = 10000)
Hi @Brawni !
I saw that your pathway file is quite large. But without a reproducible example, it is quite difficult for us to understand what is going wrong. I can suggest to choose a subset of pathways that lead to errors in the fgseaMultilevel
function. To do this, you can try the following example
pathwaysToCheck <- list()
for (gsn in names(pathways)){
tryCatch({
set.seed(1)
fr <- fgseaMultilevel(pathways[gsn],
fgsea_ranks,
minSize = 15,
maxSize = 1500)
}, error = function(e){
tmp <- get("pathwaysToCheck", env=globalenv())
assign("pathwaysToCheck",
c(tmp, pathways[gsn]),
envir = globalenv())
})
}
To determine only those pathways on which fgesaMultilevel
crashes with an error. If after that the file is still too large, then you can select several pathways from pathwaysToCheck
.
HI @vdsukhov! I tried to reproduce the error several times but now it seems to work. I guess the last error I reported was due to some issues on my side and updating to the latest version fixed the initial problem, sorry not be more specific. If I encounter the error again I will run the code you posted and report back!
Thanks a lot!
Yes these genes are coming from a differential expression which filter out those not expressed in both groups in terms of percentage of cells expressing it (scRNA) and those with logFC lower than 0.25. Should I have a more inclusive list to start with?
@Brawni You should use all expressed genes for GSEA, don't filter the results by logFC.
Hi! thanks a lot for this great package!
I have been using fgsea to run in loop on different set of genes that I rank by -log10(pvalue) * sign (logFC). These lists of genes are of different sizes (as genes not expressed sufficiently in the groups or not passing a logFC are removed): 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 804 1360 2130 615 986 2796 900 951 923 2144 1766 1867 1427 1057 1787 1393
The code used is:
However I get the following error in this case when I run fgseaMultilevel for group '3', although this doesn't happen ALL the times:
Please find attached the offending list of genes fgsea_ranks_group3.txt
I think this is a similar problem to this post from biostars: https://www.biostars.org/p/9478106/
Thanks!
SessionInfo: sessionInfo.txt