scverse / squidpy

Spatial Single Cell Analysis in Python
https://squidpy.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

sq.gr.ligrec crashing kernel/script with VisiumHD (>100k spots/spatial barcodes) - Squidpy incompatible with single-cell level spatial technologies? #819

Closed Rafael-Silva-Oliveira closed 1 month ago

Rafael-Silva-Oliveira commented 5 months ago

Description

VisiumHD will now become one of the standards for single-cell level resolution on spatial data, but this seems to be too much for squidpy to handle as of now.

...

Minimal reproducible example (MRE)

No MRE available, but I'm using this publicly available dataset: https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-human-lung-cancer-if

I cluster with scanpy (stored as leiden_clusters) at the 16 micron resolution, which has 152k spots/barcodes, and then pass that as the "cluster_key" to sq.gr.ligrec:

res = sq.gr.ligrec(
    adata=adata,
    n_perms=100,
    cluster_key="leiden_clusters",
    copy=True,
    use_raw=False,
    transmitter_params={"categories": "ligand"},
    receiver_params={"categories": "receptor"},
)

If I try this with a smaller FFPE Visium dataset (< 10k spots), I can run it and see the results just fine.

Changing the number of permutations (n_perms) on the VisiumHD dataset makes no difference.

Traceback

No traceback, kernel/script simply crashes with no information other than:

The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click [here](https://aka.ms/vscodeJupyterKernelCrash) for more info. View Jupyter [log](command:jupyter.viewOutput) for further details.
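Since the kernel dies without a traceback, one way to get more information is to run the analysis as a separate process and inspect its exit code. The sketch below is a generic diagnostic, not something from the squidpy documentation: the helper names `run_isolated` and `describe_exit` are hypothetical, and the snippet passed to the child is a placeholder for a script that loads the data and calls `sq.gr.ligrec`. On Linux and macOS, a negative return code means the child was killed by a signal, and a SIGKILL is what the Linux out-of-memory killer typically sends.

```python
import signal
import subprocess
import sys


def run_isolated(code: str) -> int:
    """Run a Python snippet in a child process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable message."""
    if returncode == 0:
        return "finished normally"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with error code {returncode}"


# Hypothetical stand-in for the real workload; replace the string with a
# script that loads the data and calls sq.gr.ligrec.
print(describe_exit(run_isolated("print('analysis would run here')")))
```

If the real workload reports "killed by SIGKILL", that would be consistent with the process running out of memory, although it does not prove it.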

Version

Python 3.10.2, squidpy 1.4.1, scanpy 1.9.6

This is similar to https://github.com/scverse/squidpy/issues/812, where it crashes with 50k spots. Will squidpy support higher-resolution data, since that is the direction spatial technologies are heading? @giovp

I tested with subsets of the main VisiumHD and MERFISH data, and the kernel starts crashing between 30k and 35k spots, which makes squidpy incompatible with the new single-cell level spatial technologies.
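Building on the subsetting test above, one possible stopgap is to cap the number of spots per cluster so the total stays below the size at which the crash appears. This is only a sketch of a generic workaround, not a squidpy feature: `subsample_per_cluster` is a hypothetical helper, the toy labels stand in for `adata.obs["leiden_clusters"]`, and subsampling will change the permutation statistics, so results would need to be interpreted with care.

```python
import random
from collections import defaultdict


def subsample_per_cluster(labels, max_per_cluster, seed=0):
    """Return sorted row indices keeping at most `max_per_cluster` per label."""
    rng = random.Random(seed)

    # Group row positions by their cluster label.
    by_cluster = defaultdict(list)
    for i, label in enumerate(labels):
        by_cluster[label].append(i)

    keep = []
    for indices in by_cluster.values():
        # Only downsample clusters that exceed the cap; keep small ones whole.
        if len(indices) > max_per_cluster:
            indices = rng.sample(indices, max_per_cluster)
        keep.extend(indices)
    return sorted(keep)


# Toy labels standing in for adata.obs["leiden_clusters"].
toy_labels = ["0"] * 50 + ["1"] * 5 + ["2"] * 20
kept = subsample_per_cluster(toy_labels, max_per_cluster=10)
print(len(kept))  # 10 + 5 + 10 spots retained
```

With a real dataset, the returned indices could be used to subset the object (for example `adata[kept].copy()`) before passing it to `sq.gr.ligrec`.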

giovp commented 5 months ago

Hi @Rafael-Silva-Oliveira, could you increase the memory available on your system and try to run it again?

Rafael-Silva-Oliveira commented 5 months ago

> Hi @Rafael-Silva-Oliveira, could you increase the memory available on your system and try to run it again?

Hi, I'm currently not running on a server, but I will be in a few weeks. My local specs are 32 GB RAM, an i7 CPU, and an NVIDIA RTX A1000 (6 GB).

I'm positive that this would work fine on a server with more memory, but it would be quite nice if squidpy were optimized to support single-cell level spatial data locally (at least 100-200k spots)! :)
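To put the 32 GB figure in context, a rough estimate of what one dense copy of the expression matrix costs can help. The sketch below is plain arithmetic under stated assumptions: the gene count of 2,000 is hypothetical, values are assumed to be 8-byte floats, and whether `sq.gr.ligrec` actually densifies the data or holds several copies at once is not confirmed anywhere in this thread.

```python
def dense_matrix_gib(n_spots: int, n_genes: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense n_spots x n_genes matrix."""
    return n_spots * n_genes * bytes_per_value / 2**30


n_genes = 2_000  # hypothetical number of genes involved in the interactions
for n_spots in (10_000, 35_000, 152_000):
    size = dense_matrix_gib(n_spots, n_genes)
    print(f"{n_spots:>7} spots: {size:.2f} GiB per dense copy")
```

A single copy at 152k spots is modest under these assumptions, so if memory is the cause, the pressure would more likely come from multiple simultaneous copies or larger gene sets than from one matrix alone.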

giovp commented 5 months ago

I think we might be hitting performance issues in various squidpy tools with this huge number of cells. Making the algorithms more memory-efficient is not trivial, though.

Rafael-Silva-Oliveira commented 2 months ago

> I think we might be hitting performance issues in various squidpy tools with this huge number of cells. Making the algorithms more memory-efficient is not trivial, though.

Thank you for the reply! It would be great if these changes were implemented, but in the meantime I'm open to suggestions for handling HD data :) I'll be testing Liana and will see whether it supports such high resolution as well!