raphael-group / paste

Probabilistic Alignment of Spatial Transcriptomics Experiments
https://paste-bio.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Dealing with large datasets on GPU #36

Open peterkilfeather opened 1 year ago

peterkilfeather commented 1 year ago

I am dealing with a dataset of > 400k cells spanning 18 brain sections, each taken from a separate brain. I am trying to use the center align method with an RTX 3090 (24 GB VRAM), but the card does not have enough memory to hold the entire dataset for alignment.

I have tested downsampling to between 200k and 300k cells, and this appears to work. Can you recommend any approaches to aligning the full dataset on the GPU? I could use an AWS instance with more VRAM, but I would like to test different parameters (e.g. alpha), and this would quickly rack up computational costs.

mrland99 commented 1 year ago

Hi, sorry, unfortunately I do not have a great solution off the top of my head. If you are interested in simply aligning the 18 sections and not inferring a center slice, you could pairwise align consecutive slices two at a time (e.g. 1-2, 2-3, 3-4, etc.), and then combine the resulting pairwise alignment matrices to align all the slices, if that makes sense.
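For illustration, the consecutive-pair idea above could be sketched roughly as follows. This is only a minimal sketch, not the maintainers' implementation: `align_consecutive` and `align_pair` are hypothetical names introduced here, and `align_pair` stands in for whatever pairwise alignment call you use (in PASTE this would presumably be `paste.pairwise_align` with your chosen `alpha`). The point is that only two slices need to be in memory for any one alignment.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Slice = TypeVar("Slice")  # e.g. one AnnData object per section (assumption)
Plan = TypeVar("Plan")    # e.g. one pairwise alignment matrix (assumption)


def align_consecutive(
    slices: Sequence[Slice],
    align_pair: Callable[[Slice, Slice], Plan],
) -> List[Tuple[int, int, Plan]]:
    """Align slices (0,1), (1,2), ... and return one result per pair.

    `align_pair` is a hypothetical stand-in for the real pairwise
    alignment function; it is called on exactly two slices at a time.
    """
    results = []
    for i in range(len(slices) - 1):
        # Only slices i and i+1 are handed to the aligner in this step.
        plan = align_pair(slices[i], slices[i + 1])
        results.append((i, i + 1, plan))
    return results


# Toy usage with dummy "slices" (lists of cell indices) and a dummy
# aligner that just reports the shape an alignment matrix would have.
demo_slices = [list(range(n)) for n in (3, 4, 5)]
plans = align_consecutive(demo_slices, lambda a, b: (len(a), len(b)))
```

With 18 sections this produces 17 pairwise results, each sized by the two slices involved, not by the full dataset. How the pairwise matrices are then combined into a single stacked alignment is left to the library's own tooling.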