Processing Large 3D image

romainGuiet commented 3 years ago

Hi @carsen-stringer , In an existing issue thread I found that you recommended : “if you're running out of memory due to the size of the tiff we don't currently have a workaround for that -- we recommend splitting the stack and stitching the masks afterwards.”

I’m wondering how to achieve an optimal stitching of the masks (aka Label images, in which all the pixels of an object have the same value). Indeed stitching of overlapping intensity values looks natural but my first guess is that it is much more complicated for Labels, in 3D! Detecting objects in 3D in the overlapping area, detecting partially/fully overlapping objects, deciding which one to keep … removing objects touching the edge ? Am I overthinking it?

With @lacan we thought it might be “easier” to fuse the flow images ? maybe in temporary files to overcome memory issues and run the other section of CellPose after on the fused image, which might not be so memory hungry? Best, Romain

kevinjohncutler commented 3 years ago

The way you propose, via blind rectilinear cropping of the input, would require some sophisticated overlap-based label stitching/merging, quite similar to how cell linear tracking is done. However, if you are able to threshold your cells well enough to mask out cell clusters, you could actually partition your data via bounding boxes around those clusters. The cells that are cut off in one slice would be fully represented in another, so you could safely throw out any 'edge' labels and avoid the 'stitching' problem altogether (in the sense of stitching together cut-off cells - the resulting 'full' labels could simply be tiled together to form the full label matrix). There is probably even an 'optimal' partition of the data (fewest number of bounding boxes within some size range covering the cell clusters).

joaomamede commented 3 years ago

The overlap is very easy to achieve via dask. you can basically make cellpose be run via dask (that splits the files, that's how I'm deconvolving my 2k by 2k images). However, the labels numbers will mismatch. You could add results = image2_masks + np.max(image1masks) and so on, but the edges would still be badly labelled. If you come up with an idea I'm very interested to apply it as I really need this for live cell imaging panels.

kevinjohncutler commented 3 years ago

Yes, the edge labels are the issue, but the method I proposed earlier would solve this problem. If you are able to partition the images such that every cell appears in a non-cropped sub-image, then you can stitch back the panels and only include labels that are not on the edge. This does assume that cells are not duplicated - i.e. they appear only once in full view in a single sub-image, and can appear on the edge of an adjacent sub-image any number of times. This will be a function of cell density and the overlap of sub-images. How dense are your fields of cells? It would be useful to post an example.

chrisroat commented 3 years ago

dask-image's ndimage.label has an example of stitching masks that might be helpful. After it finishes labelling the blocks, it creates creates a connected components graph.

If you are using dask, you'd need to use map_overlap with large enough depth to overcome cellpose edge effects at the chunk's nominal edge.

carsen-stringer commented 3 years ago

check out this issue and the advice @qiagu and @Michael-Shannon

closing duplicate here https://github.com/MouseLand/cellpose/issues/171

carsen-stringer commented 3 years ago

Hey everyone, I've spent a bit of time thinking about this, probably less time than all of you though. I think the best option (which is pretty similar to @romainGuiet 's proposal I think) would be

the user gives as input a dask array (or cellpose makes a dask array from a folder of tiffs).
Cellpose runs on slices through this array like x[:, i, :] and the rest of the array is not accessed (already flow computation is tiled on the GPU so no need to worry about GPU memory if a slice is big as long as it will fit in RAM).
Each slice -> flow, cellprob gets sent to a zarr array for temporary storage. This means that you are always using overlapping tiles in an optimal way when computing the flows and cell probabilities.
Dynamics are run on overlapping chunks of the zarr array and stitched together. Maybe @chrisroat and @joaomamede can contribute some code for this second step. Any other way to run the dynamics seems really hard and would be difficult to run on the GPU.

You can see the refactoring of the code section here splitting up flow computation and mask computation for this purpose: https://github.com/MouseLand/cellpose/blob/ff46c5f4ebd3e3c580be62765e38dc8108fcd041/cellpose/models.py#L526. In fact you can input a dask array to cellpose right now and it won't error, but it also won't be memory efficient because I have not implemented these intermediate steps to be stored on disk rather than in RAM.

What do people think about this approach?

Michael-shannon commented 3 years ago

Hi @carsen-stringer and everyone. I like this solution! I should be able to contribute by user testing - I have a bunch of big microscopy data that at the moment, I just crop and leave as is prior to cellpose. M

qiagu commented 3 years ago

@carsen-stringer The proposed solution sounds great to me. I saw similar idea, tile iterator, work in other place. Look forward to seeing the idea getting implemented soon.

romainGuiet commented 3 years ago

the user gives as input a dask array (or cellpose makes a dask array from a folder of tiffs).

I definitely upvote the "cellpose makes a dask array from a folder of tiffs" (which would keep compatibility with our Cellpose wrapper for Fiji)

Thank you again for making this tool and being responsive to adding features

Cheers

carsen-stringer commented 3 years ago

Awesome, thanks everyone for the feedback, will start implementing in June. In the meantime feel free to contribute pull requests for any of the steps if you're interested in contributing!

GFleishman commented 3 years ago

Hi All! Thanks @carsen-stringer for pointing me to this thread.

I have built a dask distributed wrapper which runs Cellpose.eval on blocks of a large array in parallel; find the pertinent code here.

The details of setting up a dask cluster are implemented in my ClusterWrap library. Some modifications to this would probably be required to run it on your own cluster - though likely not much especially if you have experience with dask-jobqueue. It should work out of the box for workstations.

If you have a look at my WIP dask implementation you'll see I haven't yet worked out merging the overlap regions.

While I agree with @romainGuiet that fusing flow fields and probabilities is easier than fusing masks (just weighted averaging), the dynamics portion would then run on the whole stitched array(?) which in the very large image case may take quite some time. I also agree with @kevinjohncutler that if blocks could be defined around discrete cell clusters then the mask-stitching problem is much easier (or non-existent), but there are probably many biological situations where this is not possible - mine for example (whole brain zebrafish expansion microscopy).

So I am proceeding with the approach described by @chrisroat and I will post screenshots of my first rounds of testing to inspect the stitch boundaries. So far, I have noticed that within the overlap region the left-tile-segmentation and right-tile-segmentation are pretty consistent but not exactly the same. It may be that simply merging connected components across the boundary has a small percentage of false merges and failed merges. If so - then some kind of merging of the entire overlap region (something like Joint Label Fusion) might be worth considering.

joaomamede commented 3 years ago

Sorry I took a bit to reply:

example 1: Have a look at Libraries.py and Panelize_LiveCell.py

This is how I stitched my files (with 0% overlap).

Example 2: When I want to cut and use overlaps generated by dask:

chunk_size=(frames.sizes['z'],frames.sizes['y']//ydivide,frames.sizes['x']//xdivide)

def depth_divide(xdivide=1,ydivide=1):
    if xdivide >1 and ydivide >1:
        depthdivide = (0,64,64)
    elif xdivide > 1 and ydivide == 1:
            depthdivide = (0,0,64)
    else:
            depthdivide = (0,0,0)

    return depthdivide

arr = da.from_array(frames[time]           
                                    , chunks=chunk_size)

res[:,i,:,:] = da.map_overlap(deconv,arr, depth  = depthdivide ,boundary='reflect',
                                    **kwargs,
                                             ).compute(num_workers=1)

MouseLand / cellpose

Processing Large 3D image #244