rcastelo / GSVA

Gene set variation analysis
200 stars 40 forks source link

GSVA + SingleCellExperiment: Still experimental? #88

Closed llrs closed 1 year ago

llrs commented 1 year ago

I've seen that there is now a method for SingleCellExperiment but one get's a warning that it is experimental (but released at least since 3.16, which I am using.).

I am finding an error using it with GSVA 1.46.0:

g <- gsva(sc3_clusters_h, gsc)
## Error in .mapGeneSetsToFeatures(gset.idx.list, rownames(expr)) : 
##   No identifiers in the gene sets could be matched to the identifiers in the expression data.
## In addition: Warning message:
## In .local(expr, gset.idx.list, ...) :
##   Using 'SingleCellExperiment' objects as input is still in an experimental stage.
summary(geneIds(gsc)[[1]] %in% rownames(sc3_clusters_h))
##    Mode   FALSE    TRUE 
## logical      30     225 
length(gsc)
## [1] 1

I hope this is enough to try to troubleshoot it (sorry for not producing a reproducible example, I can make one if needed).

I think there was a project to make GSVA work with single-cell and spatial transcriptomic data. Not sure if it has yet to start or this might be relevant for those that will work in it. Thanks for maintaining this package!

rcastelo commented 1 year ago

hi Lluís!!

Thanks for reporting this. In principle, there is no SingleCellExperiment,GeneSetCollection-gsva method, only a SingleCellExperiment,list-gsva method, and I had expected that you had got an error of the kind that there is no such a method, but now I realized that is.list(gsc) is TRUE, which explains why your call felt back to the SingleCellExperiment,list-gsva method, which is not prepared to deal with a GeneSetCollection object and it is the cause of the error you reported. So, an inmediate solution for you, would be to call gsva() as follows:

g <- gsva(sc3_clusters_h, geneIds(gsc))

but I have just pushed a fix that allows the software to deal with the SingleCellExperiment,GeneSetCollection-gsva signature, to the devel (1.49.1) and release (1.48.1) versions of GSVA, which should become available in the next 24/48 hrs. You mentioned that you're working with GSVA 1.46.0, so that's Bioconductor 3.16.0, so those fixes will not reach your version of GSVA, and you should use then the previous call to geneIds() to transform the GeneSetCollection object into a base R list object. The following example runs smoothly with this fix in my workstation:

library(Seurat)
library(SeuratData)
InstallData("pbmc3k")
data(pbmc3k)
sce <- as.SingleCellExperiment(pbmc3k)
gs1 <- GeneSet(SymbolIdentifier(), geneIds=rownames(sce)[1:10], setName="GS1")
gs2 <- GeneSet(SymbolIdentifier(), geneIds=rownames(sce)[11:20], setName="GS2")
gsc <- GeneSetCollection(gs1, gs2)
g <- gsva(sce, gsc)
Estimating GSVA scores for 2 gene sets.
Estimating ECDFs with Gaussian kernels    
  |======================================================================| 100%

Warning message:   
In .local(expr, gset.idx.list, ...) :
  Using 'SingleCellExperiment' objects as input is still in an experimental stage.
assay(g)[, 1:2]
    AAACATACAACCAC AAACATTGAGCTAC
GS1      0.4827375     0.36579394
GS2     -0.1440312    -0.09209655

Depending on the features of your data, you may encounter errors running the default GSVA method. In such a case, please try calling GSVA with method="ssgsea", which is why the software still issues the warning you mention.

The CZI project you mention is alive and kicking, we're just slower than I initially planned due to some delays in the recruitment of the taskforce for the project. On April 1st @axelklenk started working on the robustness and single-cell part (btw, Axel, please sync your devel version of GSVA with this bugfix), and on July 1st the second recruit will start working on the spatial transcriptomics part. We expect to fully support single-cell data during the coming one or two devel-cycles, and spatial transcriptomics in the following two or three.

One of the first things that Axel has done has been to clean up some CRAN-specific warnings and notes for the latest April release of GSVA, so I recommend you to switch to the current release 3.17 (GSVA 1.48.1). In any case, since you're based in BCN as well, we can discuss in person any problems or suggestions you may have for GSVA, just DM me on Slack or drop me an email.