angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Adding cropping utility notebook for Pixie #1038

Open cliu72 opened 1 year ago

cliu72 commented 1 year ago

Is your feature request related to a problem? Please describe.
For very large whole slide images, even loading one image (all the channels) can cause memory errors. Here, the memory issues happen during preprocessing, which is a step we haven't optimized for memory (we have mainly been working on optimizing training by subsetting pixels).

Describe the solution you'd like
A quick solution is to first crop large images into smaller "fovs", and run those smaller images through Pixie normally. Since Pixie doesn't rely on spatial information, cropping the images first shouldn't make a difference. Once the pixel phenotype maps are generated, we can stitch these back together to the original size. We can have a utility notebook that users can run before and after Pixie to crop/decrop their images.

Describe alternatives you've considered
We have discussed additional solutions, such as reading in patches of images or reading in one channel at a time. These options will take longer to implement - we will continue discussing these alternatives.
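
For reference, the "reading in patches" alternative could look roughly like the sketch below. It assumes a single channel has already been saved as a `.npy` file, which is an assumption made only so that NumPy's memory mapping can be used; whole slide images are usually stored as TIFFs, and `read_patch` is a hypothetical helper, not an existing ark-analysis function.

```python
import numpy as np


def read_patch(npy_path, row_start, col_start, size=1024):
    """Return one window of a 2D array stored in a .npy file.

    ASSUMPTION: the channel was saved with np.save; real data may be TIFF.
    """
    # mmap_mode="r" maps the file instead of loading it, so slicing only
    # touches the part of the file that backs the requested window.
    arr = np.load(npy_path, mmap_mode="r")
    window = arr[row_start:row_start + size, col_start:col_start + size]
    # Copy the window into a regular in-memory array before returning it.
    return np.array(window)
```

Windows that run past the image border simply come back smaller than `size`, so callers would need to handle ragged edge patches.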

Additional context
One consideration is the size of the crops. Perhaps we can just start with 1024 x 1024 crops as a default (and this can be a parameter that users can change). We will also need to think about the logic of re-stitching - we need some way of keeping track of where each crop came from to stitch them back together (maybe specifying "RxCx" in the filename? Or just number them like "crop1" through "cropx" and then keep track of the number of rows/columns somehow? Open to other suggestions).
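
A minimal sketch of the crop and re-stitch logic, using the "RxCx" naming idea so that each crop records where it came from. The function names `crop_image` and `stitch_crops`, the zero-based `R{row}C{col}` keys, and the in-memory dictionary (rather than files on disk) are all assumptions for illustration, not an existing ark-analysis API.

```python
import re

import numpy as np


def crop_image(img, crop_size=1024):
    """Split an array (rows, cols, ...) into tiles keyed by 'R{row}C{col}'."""
    n_rows = -(-img.shape[0] // crop_size)  # ceiling division
    n_cols = -(-img.shape[1] // crop_size)
    crops = {}
    for r in range(n_rows):
        for c in range(n_cols):
            # Tiles on the bottom/right border may be smaller than crop_size.
            crops[f"R{r}C{c}"] = img[
                r * crop_size:(r + 1) * crop_size,
                c * crop_size:(c + 1) * crop_size,
            ]
    return crops


def stitch_crops(crops, crop_size=1024):
    """Reassemble tiles produced by crop_image into one array."""
    pattern = re.compile(r"R(\d+)C(\d+)$")
    positions = {
        name: tuple(int(g) for g in pattern.search(name).groups())
        for name in crops
    }
    # Full size = heights down the first column, widths along the first row.
    height = sum(crops[n].shape[0] for n, (r, c) in positions.items() if c == 0)
    width = sum(crops[n].shape[1] for n, (r, c) in positions.items() if r == 0)
    first = next(iter(crops.values()))
    out = np.zeros((height, width) + first.shape[2:], dtype=first.dtype)
    for name, (r, c) in positions.items():
        tile = crops[name]
        y, x = r * crop_size, c * crop_size
        out[y:y + tile.shape[0], x:x + tile.shape[1]] = tile
    return out
```

Because the row and column indices live in the name itself, no separate record of the grid shape is needed; with sequential names like "crop1" through "cropx", the number of columns would have to be stored alongside the crops.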

alex-l-kong commented 1 year ago

@cliu72 agree with defaulting to 1024x1024 crops, though I'm curious to know the dimensions of the images that are causing these issues.

For de-cropping, I prefer RxCx over cropx; the former will be easier to follow.

cliu72 commented 1 year ago

@alex-l-kong See the discussion here: https://github.com/angelolab/ark-analysis/discussions/1034. Some of the large whole slide images can be around 40,000 x 40,000 pixels (for just one "fov").
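
For a rough sense of scale, here is a back-of-the-envelope calculation for an image of that size with the proposed 1024 x 1024 default. The pixel type (float32, 4 bytes) and the channel count (40) are assumptions chosen for illustration; neither number comes from this thread.

```python
import math

height = width = 40_000  # image size mentioned above
crop_size = 1024         # proposed default crop size
bytes_per_pixel = 4      # ASSUMPTION: float32 pixels
n_channels = 40          # ASSUMPTION: illustrative panel size

# Memory needed to hold the full image
bytes_per_channel = height * width * bytes_per_pixel
bytes_all_channels = bytes_per_channel * n_channels

# Number of crops, counting partial crops on the border
crops_per_side = math.ceil(height / crop_size)
n_crops = crops_per_side ** 2

# Memory needed to hold one full-size crop with all channels
bytes_per_crop = crop_size ** 2 * bytes_per_pixel * n_channels

print(f"one channel:  {bytes_per_channel / 1e9:.1f} GB")
print(f"all channels: {bytes_all_channels / 1e9:.1f} GB")
print(f"crops:        {n_crops} ({crops_per_side} per side)")
print(f"one crop:     {bytes_per_crop / 1e6:.1f} MB")
```

Under these assumptions, a single crop with all channels is a few hundred times smaller than one full-size channel, which is the reason cropping first should sidestep the preprocessing memory errors.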