Closed paulrbuckley-kcl closed 10 months ago
For a given cell count n, the function needs to compute n^2 pairwise distances subject to thresholding, each of which is stored as float64 (i.e., 8 bytes). For around n = 500 000 cells, which seems to be the case here, this would indeed result in a memory footprint of about 1.8 TiB (roughly 2 TB). Unfortunately, there is no easy way around this with the current implementation. You could of course operate on tiles (steinbock utils mosaics), but then you would miss neighbors across tile borders. The best solution, however, would be to not store distances that exceed dmax in memory, which would require a rewrite of this functionality. Maybe the current maintainer of steinbock, @Milad4849, can comment on whether this is something that is on the roadmap.
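To illustrate the idea of never storing distances above dmax, here is a minimal sketch using a k-d tree from SciPy. This is not how steinbock is implemented and not a proposed patch. The function name `neighbors_within_dmax` is hypothetical, and it assumes the cell centroids are already available as an (n, 2) NumPy array of coordinates.

```python
import numpy as np
from scipy.spatial import cKDTree


def neighbors_within_dmax(centroids, dmax):
    """Return (i, j, distance) arrays for all centroid pairs at most dmax apart.

    Hypothetical helper, not part of steinbock. Memory scales with the
    number of pairs found, not with n^2.
    """
    centroids = np.asarray(centroids, dtype=float)
    tree = cKDTree(centroids)
    # Each unordered pair (i < j) within dmax is reported once.
    pairs = tree.query_pairs(r=dmax, output_type="ndarray")
    i, j = pairs[:, 0], pairs[:, 1]
    # Distances are computed only for the pairs that passed the threshold.
    dists = np.linalg.norm(centroids[i] - centroids[j], axis=1)
    return i, j, dists
```

With a small dmax, each cell has only a handful of neighbors, so the output grows roughly linearly with the number of cells rather than quadratically.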
Oh, almost forgot: you might want to try --type borders or --type expansion, together with --mmap enabled, as these functions do not compute all pairwise distances. The command may take a long time, though, so there is no guarantee that it will finish within a reasonable time.
Thanks, very helpful. What I find odd is that for 450k cells there's no issue with 60 GB of memory; I haven't quite got my head around that. I'm assuming, then, that reducing dmax to e.g. 10 won't help? I will absolutely try --type borders with --mmap. Thanks
Hi all,
For a handful of large CODEX WSI (but no larger than others that run fine), I receive the following error when running neighbor measurements:
It's asking for 1.65 TiB of memory, which is obviously enormous. I'm wondering if you had any suggestions for how to work around this? The command is below; I was thinking of changing dmax? Thanks for your help!
$steinbock measure neighbors --type centroids --dmax 15 --masks /data/masks/ -o /data/neighbors_thresholddist/
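As a rough sanity check on the 1.65 TiB figure, the arithmetic can be sketched as below. It assumes the allocation is a single dense n x n float64 matrix, as described in the reply. The function names are made up for illustration.

```python
import math

BYTES_PER_FLOAT64 = 8
TIB = 2**40  # bytes in one tebibyte


def dense_matrix_tib(n_cells):
    """Size in TiB of a dense n x n float64 distance matrix."""
    return n_cells**2 * BYTES_PER_FLOAT64 / TIB


def cells_for_tib(tib):
    """Cell count whose dense n x n float64 matrix occupies `tib` TiB."""
    return math.sqrt(tib * TIB / BYTES_PER_FLOAT64)


print(round(dense_matrix_tib(500_000), 2))  # about 1.82
print(round(cells_for_tib(1.65)))           # roughly 476 thousand cells
```

Under that assumption, a 1.65 TiB request corresponds to roughly 476 000 cells, which is consistent with the estimate of around 500 000 cells.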