prabhakarlab / Banksy

BANKSY: spatial clustering
https://prabhakarlab.github.io/Banksy

Error in RunBanksy with Visium HD data #38

Closed by meaksu 1 month ago

meaksu commented 3 months ago

Hi, I'm trying to use the RunBanksy function with a merged Seurat object of four Visium HD samples, and I'm running into the following error about negative length vectors. Is this related to the number of cells/bins being too high (1,393,969 bins)? If so, is there any way to get around this issue?

> seu <- RunBanksy(seu, lambda = 0.8, assay = 'Spatial.008um', slot = 'data',
+                 dimx = 'sdimx', dimy = 'sdimy', features = 'variable',
+                 group = 'sample', split.scale = TRUE, k_geom = 50, verbose = TRUE)
Fetching data from slot data from assay Spatial.008um
Subsetting by features
Finding variable features for layer counts
Calculating gene variances
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Staggering locations by sample
Computing neighbors...
Spatial mode is kNN_median
Parameters: k_geom=50
Done
Computing harmonic m = 0
Using 50 neighbors
Error in `[.data.table`(knn_df, , abs(gcm[, to, drop = FALSE] %*% (weight *  : 
  negative length vectors are not allowed
In addition: Warning messages:
1: In get_data(object, assay, slot, features, verbose) :
  No variable features found. Running Seurat::FindVariableFeatures
2: In asMethod(object) :
  sparse->dense coercion: allocating vector of size 20.8 GiB

jleechung commented 2 months ago

Hi @meaksu, sorry for the delayed response! Yes, the issue is caused by the large number of bins. When creating the BANKSY matrix, a data table is constructed with number of rows = 1393969 (num. bins) x 2000 (num. features) = 2,787,938,000. Since this exceeds 2^31 - 1 (2,147,483,647), an error is thrown. We are working on scaling this up further by chunking the matrix computation in the feature dimension.
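
For reference, the arithmetic behind that limit can be checked directly in R. This is only a sketch using the numbers quoted in this thread (1393969 bins, 2000 features); the variable names are illustrative and are not part of the Banksy code.

```r
n_bins     <- 1393969   # bins in the merged object, as reported above
n_features <- 2000      # number of features, as stated above

# Plain numeric literals are doubles in R, so this product does not overflow.
n_rows <- n_bins * n_features

# R's integer limit is 2^31 - 1.
exceeds_limit <- n_rows > .Machine$integer.max   # TRUE

# Largest bin count that stays under the limit at 2000 features.
max_bins <- floor(.Machine$integer.max / n_features)   # 1073741

# Size of a dense double matrix with this many entries, in GiB (8 bytes each).
dense_gib <- round(n_rows * 8 / 2^30, 1)   # 20.8
```

The last value lines up with the "allocating vector of size 20.8 GiB" warning in the log above.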

For now, one possible workaround is to construct the BANKSY matrix separately for each sample, merge them, and run PCA on the merged matrix.
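
The shape of that workaround can be sketched in base R on toy data. Everything here is an assumption for illustration: the simulated matrices stand in for real per-sample data, and `build_banksy_matrix` is a hypothetical placeholder rather than a Banksy function. In practice that step would be a RunBanksy call on one sample at a time, and how to merge the resulting Seurat objects is not covered here.

```r
set.seed(1)

# Toy stand-in for per-sample input: features x bins matrices (simulated data).
samples <- list(
  s1 = matrix(rnorm(20 * 30), nrow = 20),
  s2 = matrix(rnorm(20 * 40), nrow = 20)
)

# Hypothetical placeholder for the per-sample BANKSY matrix construction.
# It only scales each feature within the sample so that the sketch runs;
# it does NOT compute neighborhood features.
build_banksy_matrix <- function(mat) {
  t(scale(t(mat)))
}

# 1. Build the matrix separately for each sample, so each table stays small.
per_sample <- lapply(samples, build_banksy_matrix)

# 2. Merge along the bin dimension (all samples must share the same features).
merged <- do.call(cbind, per_sample)

# 3. Run PCA once on the merged matrix (prcomp expects bins as rows).
pca <- prcomp(t(merged), center = TRUE, scale. = FALSE, rank. = 5)
```

With 2000 features, any single sample with fewer than about 1.07 million bins would stay under the 2^31 - 1 row limit at the construction step.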

vipulsinghal02 commented 2 months ago

Also, see this comment for ideas on scaling up! https://github.com/prabhakarlab/Banksy_py/issues/12#issuecomment-2268114768