Closed Changzhanhe closed 1 year ago
Hi, thank you for your interest in SpatialPCA! It looks the problem is in the fx_gaussian()
function, which is embedded in the kernel_build_sparse()
function.
First thing to check is, are you using the sparse option to build the kernel matrix? If your sample size is not that large, say, >5000, I would recommend using sparse=FALSE
option, because the estimation would be more accurate.
If your data indeed has a large sample size and you want to use the sparse=TRUE
option, I would recommend checking the location matrix: object@location
, and also the bandwidth: object@bandwidth
, to test, for example, if the function
kernelmat = kernel_build_sparse(kerneltype="gaussian",location=scale(object@location), bandwidth=object@bandwidth, tol = 1e-5, ncores=1)
is working well. Because now you are using standard normalization method in Seurat instead of using SCTransform, the estimated bandwidth from the normalized data will be different, which will affect the kernel matrix.
If these still don't work, please feel free to send me an email (shanglu [at] umich.edu) and share this part of analysis codes or example location matrix that can replicate this error with me, I am happy to help. Thank you.
Thanks for your reply! Actually I was using a very large dataset (cells number nearly 300,000) to run SpatialPCA. And this is my code:
object = SpatialPCA_buildKernel(object, kerneltype="gaussian", bandwidthtype="Silverman", sparseKernel = TRUE, sparseKernel_ncore=3, sparseKernel_tol=0)
Here to save memory I set tol = 0
and I also test the function kernel_build_sparse()
, it didn't work well. The same error as I mentioned before.
I was wondering when I subset scaled location matrix to test mclapply(1:dim(location)[1], fx_gaussian, mc.cores = ncores)
, it can calculate the result list with $ind_i $ind_j $val_i
without any error reported. But when I use the full location matrix, it cannot calculate correctly. I guess probably because of mclapply()
function and I am not sure.
Oh I see, setting tol=0
may cause errors, because we want to keep elements in the kernel matrix with values greater than the threshold. If you set tol=0
it means any value greater than 0 will be retained, which is not what you want.
Also, there is one matrix inversion step in the SpatialPCA_SpatialPCs()
function, which involves an n by n matrix inverse, if you have a sample size ~300,000, I don't know if you have enough disk space for this inverse. I am working on a fast version of SpatialPCA to work around this, will keep you posted once we add this feature to the package. Thank you!
Oh I see, setting
tol=0
may cause errors, because we want to keep elements in the kernel matrix with values greater than the threshold. If you settol=0
it means any value greater than 0 will be retained, which is not what you want.Also, there is one matrix inversion step in the
SpatialPCA_SpatialPCs()
function, which involves an n by n matrix inverse, if you have a sample size ~300,000, I don't know if you have enough disk space for this inverse. I am working on a fast version of SpatialPCA to work around this, will keep you posted once we add this feature to the package. Thank you!
Great! Thank you!
Hi! I was using SpatialPCA to analyze spatial data, and I tried to create a spatial object using standard normalization method in Seurat instead of using SCTransform, I also chose HVG as 'gene.selection'. But when I ran SpatialPCA_buildKernel, it gave the error below:
Warning message in mclapply(1:dim(location)[1], fx_gaussian, mc.cores = ncores): “scheduled cores 1, 2, 3 did not deliver results, all values of the jobs will be affected” Error in
vectbl_as_col_location2()
: ! Can't extract columns past the end. Location 1 doesn't exist. There are only 0 columns.I would appreciate if you know the reason. Thanks!