StatBiomed / SpatialDM

Spatial direct messaging detected by bivariate Moran
https://spatialdm.readthedocs.io
Apache License 2.0

Bugs discovery for single-cell level spatialDM #25

Open HelloWorldLTY opened 10 months ago

HelloWorldLTY commented 10 months ago

Hi, I ran into a bug when inferring cell-cell interactions (CCI) in image-based spatial data:

sdm.weight_matrix(adata, l=1.2, cutoff=0.2, single_cell=True) # weight_matrix by rbf kernel

[image: screenshot of the error]

Could you please take a look? Thanks.

HelloWorldLTY commented 10 months ago

I have a follow-up question about this step:

[image: screenshot of the errors and warnings]

It seems that there are still errors and warnings at this step. Moreover, on a spatial dataset with 4,000 cells and 150,000 genes, cellphonedb is clearly faster than spatialDM (10 minutes vs. 16 hours), which is not consistent with the efficiency plot shown in the README. Is there any way to accelerate spatialDM? Thanks.

ABU-TO commented 8 months ago

sdm.weight_matrix(adata, l=120, cutoff=0.2, single_cell=False) # weight_matrix by rbf kernel

Setting the parameter l=120 here might solve this problem.

Also, I found that the parallel parameter ‘nproc=8’ can significantly reduce the processing time.

HelloWorldLTY commented 8 months ago

Hi, I hit the same problem even with l=120 and l=1200. I think there is not enough information for us to choose good initial values of l and cutoff. Moreover, I need to set single_cell=True because I am handling single-cell data (like MERFISH).

I also tried setting nproc=8, but the run time is the same as with nproc=1.
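On choosing l and cutoff: one possible heuristic is to tie l to the typical nearest-neighbour distance in the coordinates. The sketch below is not SpatialDM code. It assumes the kernel has the Gaussian form exp(-d^2 / (2 l^2)) and that weights below cutoff are dropped; both are assumptions to check against the SpatialDM source. The helper names (rbf_weight, cutoff_radius, suggest_l), the n_rings parameter, and the synthetic coordinates are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def rbf_weight(d, l):
    """Assumed Gaussian RBF kernel: exp(-d^2 / (2 l^2))."""
    d = np.asarray(d, dtype=float)
    return np.exp(-d ** 2 / (2.0 * l ** 2))


def cutoff_radius(l, cutoff):
    """Distance at which the assumed kernel drops to `cutoff`."""
    return l * np.sqrt(-2.0 * np.log(cutoff))


def suggest_l(coords, cutoff=0.2, n_rings=2.0):
    """Pick l so that weights reach `cutoff` at n_rings times the
    median nearest-neighbour distance (hypothetical heuristic)."""
    tree = cKDTree(coords)
    # k=2: the first hit is the point itself (distance 0),
    # the second is its nearest neighbour.
    dists, _ = tree.query(coords, k=2)
    nn = float(np.median(dists[:, 1]))
    l = n_rings * nn / np.sqrt(-2.0 * np.log(cutoff))
    return l, nn


# Synthetic stand-in for cell centroids (real data would come from
# the spatial coordinates stored with the dataset).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1000, size=(4000, 2))

l, nn = suggest_l(coords, cutoff=0.2, n_rings=2.0)
print(f"median nearest-neighbour distance: {nn:.2f}")
print(f"suggested l: {l:.2f}")
print(f"neighbourhood radius at cutoff: {cutoff_radius(l, 0.2):.2f}")
```

Under the same assumed kernel, l=1.2 with cutoff=0.2 keeps only neighbours within roughly 2.15 coordinate units, so whether that is sensible depends entirely on the units of the coordinates (pixels, microns, or array positions).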

Feanor007 commented 7 months ago

@HelloWorldLTY The "flat not found" error is caused by applying NumPy methods to a csr_matrix. The following code should solve it:

if singlecell:
    # Convert csr_matrix to lil_matrix for efficient row operations
    rbf_d_lil = rbf_d.tolil()

    # Set diagonal elements to zero
    rbf_d_lil.setdiag(0)

    # Convert back to csr_matrix if needed
    rbf_d = rbf_d_lil.tocsr()
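
For anyone who wants to check the fix outside SpatialDM, here is a minimal standalone sketch of the same idea on a toy matrix. The 3x3 values are made up for illustration, and the guess that the failing call was something like np.fill_diagonal (which writes through the .flat attribute) is an assumption, since the original traceback is not shown in text.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for the RBF weight matrix (illustrative values only).
dense = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])
rbf_d = csr_matrix(dense)

# NumPy arrays expose `.flat`; SciPy sparse matrices do not, so NumPy
# helpers that write through `.flat` fail on them.
has_flat = hasattr(rbf_d, "flat")

# Zero the diagonal through the sparse API instead.
rbf_d_lil = rbf_d.tolil()
rbf_d_lil.setdiag(0)
rbf_d = rbf_d_lil.tocsr()

# Drop any explicitly stored zeros so nnz counts real weights only.
rbf_d.eliminate_zeros()

print(has_flat)
print(rbf_d.toarray())
```

The off-diagonal weights are left untouched; only self-weights are removed.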

Rafael-Silva-Oliveira commented 4 months ago

> Hi, I hit the same problem even with l=120 and l=1200. I think there is not enough information for us to choose good initial values of l and cutoff. Moreover, I need to set single_cell=True because I am handling single-cell data (like MERFISH).
>
> I also tried setting nproc=8, but the run time is the same as with nproc=1.

I've been trying with Visium HD data, which can now have anywhere from 150k to more than 650k spots (barcodes), so I think that at this stage SpatialDM unfortunately isn't quite compatible with spatial data at single-cell resolution.