YosefLab / VISION

Signature Analysis and Visualization for Single-Cell RNA-seq
https://yoseflab.github.io/VISION/
MIT License

Issue with analyze() - Error in colRanks #121

Closed seankodani closed 1 year ago

seankodani commented 1 year ago

Hi,

I'm having an issue using the analyze() function in Vision. Previously (~2 months ago) I had successfully used Vision to characterize the gene expression profile of my scRNAseq datasets but recently I have had trouble reproducing the analysis after I added a couple of samples.

The analyze function runs as expected and goes through running the differential signature tests but afterwards throws an error saying "Error in colRanks(numericMeta, preserveShape = TRUE, ties.method = "average") : Argument 'x' cannot be logical." I tried to look into this error message, but my understanding is that the error is due to an intermediate step of the analysis that I can't refer to for troubleshooting.
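One way to find which internal step raised an error like this is to capture the call stack at the moment the error is signalled. The sketch below uses only base R; `inner_step` and `outer_step` are hypothetical stand-ins for nested package functions and are not part of VISION.

```r
# Hypothetical stand-ins for a nested call chain inside a package.
inner_step <- function(x) {
  if (is.logical(x)) stop("Argument 'x' cannot be logical.")
  x
}
outer_step <- function(x) inner_step(x)

# Capture the call stack at the moment the error is signalled,
# then let tryCatch() turn the error into its message text.
captured_calls <- NULL
message_text <- tryCatch(
  withCallingHandlers(
    outer_step(NA),
    error = function(e) captured_calls <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# Keep the first deparsed line of each frame so the chain reads top-down.
call_labels <- vapply(
  as.list(captured_calls),
  function(cl) deparse(cl)[1],
  character(1)
)
print(message_text)
print(call_labels)
```

In an interactive session, calling `traceback()` immediately after the failed `analyze(vis)` call should give a similar listing without any extra code.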

Code

pacman::p_load(SingleCellExperiment,
               metap,
               tradeSeq,
               Seurat,
               tidyverse, 
               Matrix,
               scales,
               cowplot,
               viridis,
               VISION)

Load Data


load("./s.object.Rdata")

# Read in expression counts (Genes X Cells)
counts <- s.object@assays$RNA@counts
# Scale counts within a sample
n.umi <- colSums(counts)
scaled_counts <- t(t(counts) / n.umi) * median(n.umi)

# Adding UMAP
projection <- s.object@reductions$umap@cell.embeddings

# Read in meta data (Cells x Vars)
meta = s.object@meta.data

Signatures="./h.all.v2022.1.Hs.symbols.gmt"

Run Vision


vis <- Vision(scaled_counts,signatures = Signatures,meta = meta)
vis <- addProjection(vis, "UMAP", projection)
vis <- analyze(vis)

Output


Using 26110/36601 genes detected in 0.10% of cells for signature analysis.
See the `sig_gene_threshold` input to change this behavior.

Beginning Analysis

Computing a latent space for expression data...

Determining projection genes...
    Applying Threshold filter...removing genes detected in less than 586 cells
      Genes Retained: 8646
    Applying Fano filter...removing genes with Fano < 2.0 MAD in each of 30 bins
      Genes Retained: 1673

Clustering cells...
Using latent space to cluster cells...
completed

Projecting data into 2 dimensions...
  Running method 1/1: tSNE30 ...

Evaluating signature scores on cells...

Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 00:01
Evaluating signature-gene importance...

Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 00:06
Creating 5 background signature groups with the following parameters:
  sigSize sigBalance
1      39   1.000000
2      97   1.000000
3     150   1.000000
4     191   1.000000
5     316   0.531139
  signatures per group: 3000
Computing KNN Cell Graph in the Latent Space...

Evaluating local consistency of signatures in latent space...

Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 00:01
Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 05:12
Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 02:17
Warning: mc.cores > 1 is not supported on Windows due to limitation of mc*apply() functions.
  |====================================================================================================================================| 100%, Elapsed 00:00
Clustering signatures...

fitting ...
  |===================================================================================================================================================| 100%
Computing differential signature tests...

  |====================================================================================================================================| 100%, Elapsed 00:02

Error in colRanks(numericMeta, preserveShape = TRUE, ties.method = "average") : 
Argument 'x' cannot be logical.

deto commented 1 year ago

The error must be happening here (wish R had stack traces...)

which suggests that your meta-data dataframe has columns that are boolean/logical. However, the previous few lines should filter those out:

    numericMeta <- vapply(seq_len(ncol(metaData)),
                          function(i) is.numeric(metaData[[i]]),
                          FUN.VALUE = TRUE)
    numericMeta <- metaData[, numericMeta, drop = F]
    numericMeta <- as.matrix(numericMeta)

Still, to debug, I'd look at your columns in the cell data and maybe try running on a small subset of them (I get the feeling something strange about one of the columns is causing this)
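The column check suggested above can be sketched in base R as follows. The toy `meta` data frame stands in for `s.object@meta.data`; its column names and values (including `is_doublet`) are invented for illustration and are not taken from the reporter's data.

```r
# Toy metadata standing in for s.object@meta.data (illustrative columns only).
meta <- data.frame(
  nCount_RNA = c(1200, 950, 1800),
  nFeature_RNA = c(400L, 310L, 620L),
  sample = c("A", "A", "B"),
  cluster = factor(c("0", "1", "0")),
  is_doublet = c(FALSE, TRUE, FALSE),
  row.names = c("cell1", "cell2", "cell3"),
  stringsAsFactors = FALSE
)

# Summarise the class, storage type, and missing-value count of every column.
column_summary <- data.frame(
  class = vapply(meta, function(col) class(col)[1], character(1)),
  type = vapply(meta, typeof, character(1)),
  n_missing = vapply(meta, function(col) sum(is.na(col)), integer(1))
)
print(column_summary)

# Columns that are logical, and columns that pass the is.numeric() filter.
logical_columns <- names(meta)[vapply(meta, is.logical, logical(1))]
numeric_columns <- names(meta)[vapply(meta, is.numeric, logical(1))]
print(logical_columns)
print(numeric_columns)
```

Running the same summary on the real metadata, and then passing only a few columns at a time to `Vision()`, would narrow down whether any single column is responsible.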

seankodani commented 1 year ago

Hey David,

Thanks for the response. Originally my metadata consisted of character, numeric, integer, and factor columns. I tried to run the analysis without the metadata and instead got the error below:

fitting ...
  |==============================================================================================================================| 100%
Computing differential signature tests...

  |                                                                                                                      |   0%, ETA NA
Error in `[<-.data.frame`(`*tmp*`, , 2, value = 0) : 
  replacement has 1 row, data has 0

I ran the analysis with a subset of the metadata consisting of just nCount_RNA and nFeature_RNA, which produced the colRanks error. A subset consisting of just the sample and cluster columns produced the "replacement has 1 row, data has 0" error instead. Converting the metadata so that it held only numeric and factor columns again produced the colRanks error.

My apologies for being so helpless on this - thank you sincerely for the help!

Guan-Pujun commented 1 year ago

Likewise, I also encountered the same issue when running the demo "Introduction to VISION". I would greatly appreciate any assistance you could provide.

Session Info:

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] VISION_3.0.0

deto commented 1 year ago

Were you running it with the files in the repo or with your own data?

I just ran it with the data in the vignettes/data directory from the repo and it worked fine. I'm having a hard time with this error as I cannot reproduce it to debug it.

Guan-Pujun commented 1 year ago

I used the provided files (expression_matrix.txt.gz, glio_meta.txt.gz, and h.all.v5.2.symbols.gmt). All my packages and R are up to date.

TarikExner commented 1 year ago

Hey,

I encountered the same issue with my own data, both with a .gmt file and my own GeneSignatureObject.

It worked for me (fresh installation today), but after the first successful run of VISION, Seurat threw an error in the FindNeighbors function, so I reinstalled Matrix and matrixStats. That solved the Seurat problem but led to the error described above.

When I used the sample dataset from the vignette, I noticed this Warning:

Evaluating signature scores on cells...

as(<matrix>, "dgeMatrix") is deprecated since Matrix 1.5-0; do as(as(as(., "dMatrix"), "generalMatrix"), "unpackedMatrix") instead

Installing Matrix 1.4.1 solves the VISION problem, but Seurat requires >1.5.0.

Installing Matrix 1.5.0 or 1.5.1 (the direct predecessor of 1.5.3 as far as I can tell) solves the VISION problem but leads to this Seurat error:

Error in validityMethod(as(object, superClass)) : 
  object 'CsparseMatrix_validate' not found

which, according to this thread, is solved by installing Matrix 1.5.3: https://github.com/satijalab/seurat/issues/6746

When I run the code (clusterSigScores) line by line with a freshly created VISION object, I can reproduce the error, most probably in this specific setting because obj@SigScores is empty. This is just an idea: maybe the obj passed to the function already has no valid SigScores.

I will try different Seurat versions to get it working and will keep you posted.
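Since the behaviour described here depends on which Matrix release is installed, it can help to record the versions of the packages involved before and after each reinstall. The following is a minimal sketch; `installed_version` is a hypothetical helper written for this example, and the commented-out `remotes::install_version()` call assumes the remotes package is available.

```r
# Report the installed version of a package, or NA if it is not installed.
# system.file() is used so that the package itself is never loaded.
installed_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

packages_of_interest <- c("VISION", "Seurat", "Matrix", "matrixStats")
versions <- vapply(packages_of_interest, installed_version, character(1))
print(versions)

# To pin a specific Matrix release (needs the 'remotes' package and network
# access, so it is left commented out here):
# remotes::install_version("Matrix", version = "1.5-3")
```

After changing a package version, restarting R before re-running the analysis makes sure the newly installed version is the one actually loaded.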

deto commented 1 year ago

@TarikExner thanks for looking into it some more. I thought the colRanks error was happening on the call with numericMeta in which case, it's actually not running on anything coming out of the SigScores object?

Also, are you using a newer Macbook with the M1 silicon? For another person with this problem, this was the case (and I wasn't able to replicate the error, even with their same RDS file and same library versions). So I'm wondering if there is some subtle issue caused by the different architecture. If so, I don't think it's an insurmountable problem at all, we'll just have to figure out exactly what inputs the colRanks function is picky about in that situation.

liron27 commented 1 year ago

Hi, I am also running into the same issue. It used to work well a few weeks ago. Here is my code (I don't have metadata):

vis<-Vision(obj, signatures = c("./gene sets/hallmark.symbols.gmt"),projection_methods   = "UMAP")
vis <- analyze(vis)

Here is the error:

Computing differential signature tests...

  |===============================================================================================================================| 100%, Elapsed 00:00
  |                                                                                                                      |   0%, ETA NA
Error in `[<-.data.frame`(`*tmp*`, , 2, value = 0) : 
  replacement has 1 row, data has 0

TarikExner commented 1 year ago

No, it's not directly the SigScores object. What I meant was that this error happens when you subset the metadata based on the rownames of SigScores:

metaData <- metaData[rownames(sigScores), , drop = FALSE]

I created a VISION object with the supplied data:

object <- Vision(scaled_counts,  # read in as in your tutorial
                 signatures = c("/data/h.all.v5.2.symbols.gmt"),
                 meta = meta)

# what follows is copied from the source code

sigScores <- object@SigScores  # returns a logical NA, as it's empty
metaData <- object@metaData
metaData <- metaData[rownames(sigScores), , drop = FALSE]  # if you skip this line everything works fine

numericMeta <- vapply(seq_len(ncol(metaData)),
                      function(i) is.numeric(metaData[[i]]),
                      FUN.VALUE = TRUE)
numericMeta <- metaData[, numericMeta, drop = F]
numericMeta <- as.matrix(numericMeta)

if (ncol(numericMeta) > 0){
  numericMetaRanks <- colRanks(numericMeta,
                               preserveShape = TRUE,
                               ties.method = "average")
  dimnames(numericMetaRanks) <- dimnames(numericMeta)
} else {
  numericMetaRanks <- numericMeta
}

With the line skipped:

typeof(numericMeta)
[1] "integer"

With the line included:

typeof(numericMeta)
[1] "logical"

This led to the hypothesis that the underlying sigScores object has some kind of problem, as it might be a logical when passed into the function. I don't know if the creation of sigScores depends on Matrix, which might explain the version issues. Sorry for the confusion.
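The type change described above can be reproduced in base R without VISION. In the sketch below, `meta_data`, `scores_named`, and `scores_unnamed` are toy objects invented for illustration, and `numeric_matrix` is a hypothetical helper that mirrors the filtering lines quoted from the source. It relies on two base R behaviours: indexing a data frame with `NULL` row names returns zero rows, and `as.matrix()` on a zero-row data frame returns a matrix filled with logical `NA`.

```r
# Toy metadata with cell barcodes as row names (values are made up).
meta_data <- data.frame(
  n_umi = c(1200L, 950L, 1800L),
  cluster = factor(c("0", "1", "0")),
  row.names = c("cell1", "cell2", "cell3")
)

# Keep the numeric columns and convert to a matrix, as in the quoted source.
numeric_matrix <- function(md) {
  keep <- vapply(seq_len(ncol(md)),
                 function(i) is.numeric(md[[i]]),
                 FUN.VALUE = TRUE)
  as.matrix(md[, keep, drop = FALSE])
}

# Case 1: the score matrix still carries cell names as row names.
scores_named <- matrix(
  0, nrow = 3, ncol = 2,
  dimnames = list(c("cell1", "cell2", "cell3"), c("sigA", "sigB"))
)
with_names <- numeric_matrix(meta_data[rownames(scores_named), , drop = FALSE])

# Case 2: the same scores with dimnames dropped, so rownames() is NULL
# and the metadata subset ends up with zero rows.
scores_unnamed <- unname(scores_named)
without_names <- numeric_matrix(meta_data[rownames(scores_unnamed), , drop = FALSE])

print(dim(with_names))
print(typeof(with_names))
print(dim(without_names))
print(typeof(without_names))
```

In the second case the matrix still has one column, so a check such as `ncol(numericMeta) > 0` passes and a logical matrix reaches `colRanks`, which is consistent with the "Argument 'x' cannot be logical" message.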

I am not working on a MacBook.

deto commented 1 year ago

Thanks for your help @TarikExner - the issue IS in fact in the sigScores object. Earlier in the function, the meta-data is re-ordered to be consistent with the SigScores dataframe. Some change in the Matrix package between v1.5.1 and v1.5.3 results in the sample labels not being preserved during the various matrix operations of signature calculation. Hoping to have a fix out by the end of the week once I can track down exactly where the labels are getting dropped (and maybe file a bug report with Matrix if the change appears to be something they did not intend).

deto commented 1 year ago

Should be fixed in the newest release (v3.0.1). Please re-install and try it out.

liron27 commented 1 year ago

I'm still getting the same error after re-installing.

liron27 commented 1 year ago

Sorry. It was fixed after restarting R. Thank you!

Liron

Guan-Pujun commented 1 year ago

It worked! Thank you very much!

ximenshaoshao commented 1 year ago

Hi, the same issue still exists when running VISION 3.0.1 on R 4.2.3:

Evaluating signature scores on cells...
Error: (converted from warning) 'as(<matrix>, "dgeMatrix")' is deprecated.
Use 'as(as(as(., "dMatrix"), "generalMatrix"), "unpackedMatrix")' instead.

My Matrix package version is 1.5-4, not 1.5-3.

Maybe downgrading Matrix to 1.5-3 would work?
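The "(converted from warning)" prefix in the message above is what R prints when the `warn` option is 2 or higher, which promotes every warning, including deprecation notices, to an error. The sketch below illustrates that mechanism with a hypothetical `deprecated_call` function standing in for the package code; whether lowering `warn` is enough for VISION 3.0.1 with Matrix 1.5-4 is not confirmed anywhere in this thread.

```r
# Hypothetical function that emits a deprecation warning, standing in for a
# package call that still uses a deprecated coercion.
deprecated_call <- function() {
  warning("'as(<matrix>, \"dgeMatrix\")' is deprecated.")
  "result"
}

# Run the call under a given warn level and report what happened.
# The previous option value is restored when the function exits.
run_with_warn_level <- function(level) {
  old <- options(warn = level)
  on.exit(options(old), add = TRUE)
  tryCatch(
    deprecated_call(),
    error = function(e) paste("error:", conditionMessage(e))
  )
}

outcome_warn1 <- run_with_warn_level(1)  # warning is printed, call completes
outcome_warn2 <- run_with_warn_level(2)  # warning is promoted to an error
print(outcome_warn1)
print(outcome_warn2)

# Current setting in this session:
print(getOption("warn"))
```

If `getOption("warn")` returns 2 in the affected session, setting `options(warn = 1)` before calling `analyze()` is one thing to try alongside changing the Matrix version.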