ImmuneDynamics / Spectre

A computational toolkit in R for the integration, exploration, and analysis of high-dimensional single-cell cytometry and imaging data.
https://immunedynamics.github.io/spectre/
MIT License

Error processing large IMC dataset in advanced analysis - 1. Add Masks #198

Open RoryCostell opened 1 month ago

RoryCostell commented 1 month ago

Hi,

I'm having trouble processing a large IMC dataset. The ROIs are 2000x2000 with a 42-marker panel. There are 61 of these ROIs and they have been segmented using Ilastik successfully (although it took about a week to process).

The issue arises when Spectre generates the polygons and outlines for the spatial data: R's RAM usage shoots up, and I assume it reaches the limit of my computer (M1 Max, 64 GB RAM).

This is after the command spatial.dat <- do.create.outlines(dat = spatial.dat, mask.name = i)

Initially, it gave the error 'Error: vector memory exhausted (limit reached?)', so I raised the vector memory limit, and had to increase it beyond 150Gb (I have attached a screenshot of the usage). It is currently set to 250Gb using R_MAX_VSIZE=250Gb in .Renviron

Now R just crashes after the polygon step.

[Screenshot 2024-10-01 at 6 16 07 PM]

I'm wondering whether it is still possible to analyse these IMC images with Spectre, whether I need to redo the Ilastik step differently or whether I simply need a larger computer to analyse these images?

I can send you the IMC files and masks to try if you need to reproduce the error on your end, but as there is no error code (just a crash), I'm unsure where to go.

ghar1821 commented 1 month ago

Hi @RoryCostell, how big is the list in your spatial.dat? If you only have 1 element in it, then I'm more inclined to say you need a bigger computer. But if you have several, maybe see if you can split it into multiple lists and run do.create.outlines several times (once per list)? After it finishes, you can recombine the lists and proceed.
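For checking how big the list is, here is a minimal sketch in base R. It uses a small stand-in object (`spatial.dat.demo`, a hypothetical name) rather than real Spectre data, and it only assumes that spatial.dat is an ordinary R list with one element per ROI:

```r
# Stand-in for spatial.dat: a named list with one element per ROI.
# (Hypothetical demo data; real elements would be Spectre spatial objects.)
spatial.dat.demo <- list(
  ROI_1 = matrix(0, nrow = 100, ncol = 100),
  ROI_2 = matrix(0, nrow = 100, ncol = 100)
)

# Number of elements (ROIs) in the list
n_rois <- length(spatial.dat.demo)
print(n_rois)

# Names of the elements
print(names(spatial.dat.demo))

# Approximate memory held by the whole list, and by each element in bytes
print(format(object.size(spatial.dat.demo), units = "MB"))
roi_sizes <- sapply(spatial.dat.demo, function(x) as.numeric(object.size(x)))
print(roi_sizes)
```

Running the same calls on the real object (for example `length(spatial.dat)`) should show whether there is anything to split.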

RoryCostell commented 1 month ago

Hi @ghar1821 - my list has 61 elements, but the function stops at the first ROI. Sorry for the silly question - is this the right command to see?

[Screenshot 2024-10-23 at 2 13 02 PM]

ghar1821 commented 3 weeks ago

Yes, that's the right function to run.

Is it possible to export the current spatial.dat to either an RDS or a QS (https://cran.r-project.org/web/packages/qs/vignettes/vignette.html) file, create a new object containing just 1 element of spatial.dat, remove spatial.dat from memory, and then run do.create.outlines? That way, the function will have more memory to work with.

So something like the following:

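library(qs)

# Save the full object to disk so it can be reloaded later
qsave(spatial.dat, "myfolder/spatial_dat_full.qs")

# Keep only the first element; single brackets preserve its name in the list
spatial.dat.subset <- spatial.dat[1]

# Free the memory held by the full object
rm(spatial.dat)

# i is the mask name, as in your original call
spatial.dat.subset <- do.create.outlines(dat = spatial.dat.subset, mask.name = i)

qsave(spatial.dat.subset, "myfolder/spatial_dat_1_with_outline.qs")

# Reload the full object with qread("myfolder/spatial_dat_full.qs"), take element 2, and repeat. Can put the above into a loop if it works.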
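If the single-element run works, the loop could look roughly like the sketch below. This is not Spectre code: `split_to_files`, `process_files`, and `recombine` are hypothetical helper names, and base R's `saveRDS`/`readRDS` are used instead of qs so the example runs without extra packages. The idea is to write each ROI to its own file once, drop the full object, and then process one file at a time so that only one ROI is in memory during the heavy step.

```r
# Write each list element to its own RDS file (hypothetical helper).
split_to_files <- function(x, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  files <- file.path(out_dir, sprintf("roi_%03d.rds", seq_along(x)))
  for (k in seq_along(x)) {
    saveRDS(x[k], files[k])  # x[k] is a one-element list that keeps its name
  }
  files
}

# Apply `fun` to each one-element list in turn, overwriting its file.
process_files <- function(files, fun) {
  for (f in files) {
    one <- readRDS(f)
    one <- fun(one)
    saveRDS(one, f)
    rm(one)
    gc()  # ask R to release memory before loading the next element
  }
  invisible(files)
}

# Read the processed files back and combine them into a single list.
recombine <- function(files) {
  do.call(c, lapply(files, readRDS))
}

# Small self-contained demo with stand-in data and a stand-in function.
demo.dat <- list(ROI_A = list(v = 1), ROI_B = list(v = 2))
demo.files <- split_to_files(demo.dat, file.path(tempdir(), "roi_split_demo"))
rm(demo.dat)
process_files(demo.files, function(x) {
  lapply(x, function(e) { e$v <- e$v * 10; e })
})
demo.res <- recombine(demo.files)
print(names(demo.res))

# With Spectre this might be used as follows (untested, shown as comments):
# files <- split_to_files(spatial.dat, "myfolder/rois")
# rm(spatial.dat)
# process_files(files, function(x) do.create.outlines(dat = x, mask.name = i))
# spatial.dat <- recombine(files)
```

Recombining all 61 processed ROIs still needs enough memory to hold them together, so it may be worth checking the size of one processed file first.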