m-albert / MVRegFus

Python module for registering and fusing multi-view microscopy data
BSD 3-Clause "New" or "Revised" License

Fusion of single-view data? #5

Open dpshepherd opened 3 years ago

dpshepherd commented 3 years ago

Hi all,

Super exciting to find this project! We are acquiring large light sheet data sets using a high NA oblique plane microscope (publication). We normally use BigStitcher to stitch and fuse our data. However, we acquire our data in Python and it would be great to stay in that ecosystem. We are also searching for faster ways to fuse our laterally very large images with shallow z depth.

Is it possible to fuse multiple tiles (>30 tiles) from a single-view at the moment? If so, mind giving us a hint what functions to look at to get started?

Thanks! Doug

m-albert commented 3 years ago

Hi Doug, I've seen your paper and find high NA oblique plane light sheet microscopy super exciting! Cool that you're acquiring your data in Python :)

Could this package be useful for registering and fusing your data? First, I should mention that 1) workflows are currently tailored to processing multi-angle Zeiss Z1 acquisitions from input to fused output, 2) the focus so far has been on fitting the processing of large stacks into memory rather than on speed optimization, and 3) the code base is under development and not yet very clean.

Having mentioned that, MVRegFus can register and fuse single (rotational) view tiles in its current state. To illustrate how this can be done, I uploaded a jupyter notebook with an example stitching workflow. The notebook lives in the tif_input branch, which allows loading views from tif files (see https://github.com/m-albert/MVRegFus/issues/6). What image sizes are you working with, and how fast would you ideally want the processing to be? Feel free to try out the notebook and let me know if I can be of any help in processing your data.
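
To give a rough idea of the input side (this is only a sketch, not code from the notebook; the file pattern, tile shape and dtype below are placeholders for your own layout), tif tiles can be wrapped lazily with dask before feeding them into a stitching workflow:

```python
from glob import glob

import dask
import dask.array as da
import tifffile

# hypothetical file naming; adjust to your own deskewed output
tile_paths = sorted(glob("deskewed_tiles/tile_*.tif"))

def lazy_tile(path, shape=(2048, 1600, 256), dtype="uint16"):
    """Wrap one tif tile as a lazily loaded dask array (shape/dtype are placeholders)."""
    return da.from_delayed(dask.delayed(tifffile.imread)(path), shape=shape, dtype=dtype)

views = [lazy_tile(p) for p in tile_paths]
# nothing is read from disk until something is actually computed or displayed
```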

Currently, I'm working on making this code base simpler and improving chunked processing. Another aim is to use napari to provide more immediate visual feedback on the registration/fusion process (e.g. making use of https://github.com/napari/napari/pull/1616 for pre-fusion visualization). This could be interesting for your use case, as in principle only the overlapping image regions (e.g. as estimated from your metadata) would need to be loaded for registration, and a fusion could then be stitched on the fly during visualization.
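
As a purely illustrative sketch of such a pre-fusion preview (made-up tiles and offsets, not the planned implementation), napari can already place lazily loaded tiles at their nominal stage offsets:

```python
import dask.array as da
import napari

# made-up tiles and nominal (z, y, x) offsets in voxels, e.g. from stage metadata
tiles = [da.random.random((64, 512, 512), chunks=(32, 256, 256)) for _ in range(3)]
offsets = [(0, 0, 0), (0, 450, 0), (0, 900, 0)]   # ~12% overlap along y

viewer = napari.Viewer()
for i, (tile, offset) in enumerate(zip(tiles, offsets)):
    # napari only pulls the chunks it needs to render the current view
    viewer.add_image(tile, translate=offset, blending="additive", name=f"tile_{i}")
napari.run()
```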

Cheers! Marvin

dpshepherd commented 3 years ago

Thanks for the fast response and working example!

Right now, the stage-scanned variant of the high NA OPM in the group operates as a high-speed 3D slide scanner. Each "tile" from the instrument consists of thousands of individual images of the oblique plane, with the sample translated 200 nm between images. This gives raw individual tile dimensions of [10000 -> 100000, 1600, 256]. We then repeat as necessary for any changes in Z (sample height) or C (channel) at that particular Y (non-scanning axis) position. The Y axis then moves and the process repeats.

Typically we are generating datasets on the order of a few TB to 50 TB for the tissue samples we are imaging. Once the data is acquired, we deskew it using a Numba routine found in recon_pycro_data.py. We save the output as a BigDataViewer-compatible H5 file using npy2bdv, where each tile in the H5 is one deskewed CZYX stack. Instead of creating the BDV H5, we can also write the output as TIFFs or save a standard H5 that is easier to load with Dask.
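
For reference, the "standard H5" option would look roughly like this (just a sketch with made-up dataset names and chunk sizes, not our actual recon_pycro_data.py output):

```python
import dask.array as da
import h5py
import numpy as np

deskewed = np.zeros((2, 128, 1600, 256), dtype=np.uint16)   # placeholder CZYX tile

with h5py.File("deskewed_tiles.h5", "w") as f:
    f.create_dataset(
        "tile_000",                    # made-up naming scheme
        data=deskewed,
        chunks=(1, 64, 256, 256),      # chunk so sub-blocks can be read cheaply
        compression="gzip",
    )

# later: lazy access without pulling the whole tile into memory
f = h5py.File("deskewed_tiles.h5", "r")
tile = da.from_array(f["tile_000"], chunks=f["tile_000"].chunks)
```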

After writing the deskewed data to the BDV H5, we run a Fiji macro that loads the BDV H5 into BigStitcher, roughly places the tiles in their correct positions, and runs the BigStitcher stitching pipeline. BigStitcher is fantastic, but it really struggles to fuse this data at full resolution because there are so many individual tiles. For example, to fuse a ~550,000 voxel dataset at full resolution, BigStitcher requires roughly 4 days per channel to fuse into a new HDF5 file on a server with 48 cores and 1 TB of RAM.

To make matters worse, one of the main experiments in the group is iterative multiplexed imaging. We place the tissue slice in a fluidics chamber on the OPM and repeatedly image the same slice with different fluorescent labels for N rounds. We need to fuse each imaging round and then register the rounds to each other using fiducial markers before heading into the analysis.
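
Conceptually, the round-to-round step is a point-based registration on the fiducials, roughly along these lines (a 2D sketch with made-up bead coordinates, not our actual code):

```python
import numpy as np
from skimage import transform

# made-up matched fiducial coordinates, (x, y), detected in two imaging rounds
beads_round0 = np.array([[120.0, 80.0], [900.5, 640.2], [400.3, 1210.8], [50.1, 980.4]])
beads_round1 = np.array([[118.2, 83.1], [897.9, 644.0], [397.6, 1214.5], [47.8, 983.9]])

# affine mapping round-1 coordinates onto round-0 coordinates
tform = transform.estimate_transform("affine", beads_round1, beads_round0)
residuals = np.linalg.norm(tform(beads_round1) - beads_round0, axis=1)
print("mean residual [px]:", residuals.mean())

# resample a round-1 plane into round-0 space; warp expects the inverse map
round1_plane = np.random.rand(2048, 2048)   # placeholder fused round-1 plane
registered = transform.warp(round1_plane, tform.inverse)
```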

We will give your notebook a try with a test dataset and see how it goes. Thanks!

m-albert commented 3 years ago

Thanks for the detailed description of your data and processing pipeline!

> This gives raw individual tile dimensions of [10000 -> 100000,1600,256].

I'm afraid using these tiles directly as input would be a problem for the current MVRegFus code, which assumes that at least a single view fits into memory. Also, it performs pairwise registrations on the full overlap between two tiles, which according to your description could be two stacks of ~20% of 100000×1600×256 voxels each, amounting to around 40GB. While these might fit into the memory of a server, non-blockwise registration would likely be inefficient relative to the accuracy it needs to achieve.
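
As a quick back-of-the-envelope check of that number, assuming a ~20% overlap of the worst-case tile shape you quoted:

```python
nz, ny, nx = 100_000, 1600, 256     # worst-case raw tile shape from above
overlap_fraction = 0.2              # assumed overlap between neighbouring tiles
voxels = nz * ny * nx * overlap_fraction

for nbytes, name in [(2, "uint16"), (4, "float32")]:
    print(f"{name}: {voxels * nbytes / 1e9:.1f} GB per overlapping region")
# -> ~16 GB as uint16, ~33 GB as float32, i.e. tens of GB per pairwise registration
```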

A solution would be to subdivide the tiles along the first dimension so that they'd have an approximately square shape in x and y. If you try this together with the notebook above, it would make sense to register the tiles on binned-down images (this can be set using mv_registration_bin_factors=[1, 1, 1]).
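
Schematically, the subdivision and binning could look like this (a sketch with scaled-down placeholder shapes and arbitrary bin factors, not MVRegFus internals):

```python
import numpy as np
from skimage.transform import downscale_local_mean

# scaled-down placeholder for a long stage-scan tile (real ones are much larger)
tile = np.zeros((4000, 400, 64), dtype=np.uint16)

# split the scan axis into pieces roughly as long as the second axis -> ~square sub-tiles
n_subtiles = int(np.ceil(tile.shape[0] / tile.shape[1]))
subtiles = np.array_split(tile, n_subtiles, axis=0)

# bin each sub-tile (e.g. 4x4x1) so pairwise registrations run on small volumes
binned = [downscale_local_mean(s, (4, 4, 1)) for s in subtiles]
print(subtiles[0].shape, binned[0].shape)   # (400, 400, 64) and (100, 100, 64)
```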

Adapting the registration to your configuration, in combination with chunked HDF5 files (e.g. BDV) as input, shouldn't be too hard. Do you have example files that can be downloaded somewhere, even if only a few tiles? I guess file size is clearly a problem here, but maybe you've found ways to share parts of datasets in the past.
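
For the chunked input side, a BDV-style HDF5 could be wrapped lazily along these lines (assuming the usual "t00000/s00/0/cells" layout; the exact paths depend on how npy2bdv wrote the file):

```python
import dask.array as da
import h5py

f = h5py.File("dataset.h5", "r")    # hypothetical BDV H5, e.g. written by npy2bdv

def lazy_bdv_view(h5file, setup, timepoint=0, level=0):
    """Return one tile/setup at a given resolution level as a lazy dask array."""
    dset = h5file[f"t{timepoint:05d}/s{setup:02d}/{level}/cells"]
    return da.from_array(dset, chunks=dset.chunks)

views = [lazy_bdv_view(f, s) for s in range(3)]   # e.g. the first three tiles
```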

dpshepherd commented 3 years ago

We can definitely write out tiles that are approximately square. We have been using a feature of BigStitcher to virtually split the long-aspect-ratio tiles into smaller tiles after the initial registration, but have had internal conversations about writing tiles that are closer to square anyway.

Yes, we can share a smaller set of data. I'll ask someone in the group to get it together and make it available.

m-albert commented 3 years ago

Great! Looking forward to the example dataset.

dpshepherd commented 3 years ago

Just wanted to follow up on this. We got swamped with data collection and working on analysis code. Apologies for the delay on our side. I'm working through some of the remaining issues with group members over the next couple of weeks and will put a dataset up for you to download from our NAS.

Sorry again!

m-albert commented 3 years ago

@dpshepherd No problem at all, thanks for following up on this and looking forward to the dataset :)