QI2lab / merfish3d-analysis

3D MERFISH data processing
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Issue with global registration #4

Closed mabbasi6 closed 3 months ago

mabbasi6 commented 4 months ago

Hi,

I ran the latest post-processing script on the 20240228 data and got this error in the global registration step:

Traceback (most recent call last):
  File "/home/mabbasi6/ondemand/data/sys/myjobs/projects/default/27/postprocess_momo.py", line 989, in <module>
    for val in func:
  File "/home/mabbasi6/ondemand/data/sys/myjobs/projects/default/27/postprocess_momo.py", line 624, in postprocess
    params = registration.register(
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/registration.py", line 1553, in register
    g = mv_graph.build_view_adjacency_graph_from_msims(
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py", line 142, in build_view_adjacency_graph_from_msims
    overlap_results = compute(overlap_results, scheduler="processes")[0]
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py", line 167, in get_overlap_between_pair_of_stack_props
    get_intersection_poly_from_pair_of_stack_props(
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py", line 408, in get_intersection_poly_from_pair_of_stack_props
    return cphs[0].intersection(cphs[1])
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/geometry/body.py", line 14, in intersection
    return intersection(self, other)
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/calc/intersection.py", line 138, in intersection
    return inter_convexpolyhedron_convexpolyhedron(a,b)
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/calc/intersection.py", line 819, in inter_convexpolyhedron_convexpolyhedron
    return ConvexPolyhedron(tuple(cpg_set))
  File "/home/mabbasi6/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/geometry/polyhedron.py", line 208, in __init__
    raise ValueError('Check for the number of vertices, faces and edges fails, the polyhedron may not be closed')
ValueError: Check for the number of vertices, faces and edges fails, the polyhedron may not be closed

I appreciate it if you could take a look at this. @dpshepherd The previous version completed this step and generated the 'stitched' folder without any problems.

dpshepherd commented 4 months ago

That means the registration could not find a solution. Are you sure it ran with the data you are using before?

mabbasi6 commented 4 months ago

Yeah, it's the same dataset (20240228), and previous versions of the code made the 'stitched' output folder without issues.

dpshepherd commented 4 months ago

Also - you can compare differences in the function calls to the registration package and see if there is a difference. I don't know what parameters or changes you were using before. If you find a difference, please document it and we can discuss making it settable via a parameter dictionary.

I suggest making your own environment with the minimum packages plus multiview-stitcher to troubleshoot the parameters that way.

mabbasi6 commented 4 months ago

Sure.

dpshepherd commented 4 months ago

I don't have access to your data - so I don't know what dataset that is, what the tile overlap is, etc...

It may be that your data is setup different than ours, so there may need to be different optimized registration parameters. Once you figure it out, we can make those settings configurable.

dpshepherd commented 4 months ago

These settings work for 20% overlap, z steps of .31, and tiles that are relatively level in the focus plane across all tiles.

The settings are highly robust for our data across cell culture, mouse brain, and human olfactory bulb. You will want to check if the metadata differs for the data you are working with - that may be why it is failing.

mabbasi6 commented 4 months ago

I have an update: I checked our datasets, and we have a 20% overlap and z steps of 0.31 as well. However, I doubt whether I was getting stitched data even before. The processed folder '/processed_v2/stitched/' has 'tile000n.zarr' files in it, which contain 'scale0' to 'scale4' folders. But I believe they were supposed to be in a fused_round.zarr output format, so I'm a little unclear whether the previous version (like this commit) worked or not.

mabbasi6 commented 4 months ago

Since it crashes in the registration.register step, I tried changing the binning parameter and also removing the pre_registration_pruning_method parameter and those didn't work. What else could I change to see if it might work?

dpshepherd commented 4 months ago

I don't if the previous commit worked on your data, because I have not seen the data you are trying to process. Our code is no longer making the tile000n.zarr folders, because it now loads everything via Dask and fusing that way. This limits the disk I/O and speeds up the registration/fusion process a lot.

I suggest following this example starting from the Visualize pre-registered views section. Visualizing the tiles to make sure they are arranged correctly is the first step.

The other parameters to test are the level of downsampling in the fusion. It is possible you don't have much information in the area of tile overlap and need to use a different downsampling.

mabbasi6 commented 4 months ago

I tried checking the pre-registered views, and I used this code:

    if run_global_registration:
        if global_registration_parameters['data_to_fuse'] == 'all':
            fuse_readouts = True
        else:
            fuse_readouts = False    

        from multiview_stitcher import spatial_image_utils as si_utils
        from multiview_stitcher import msi_utils, fusion, registration, vis_utils
        import dask.diagnostics
        import dask.array as da

        fused_dir_path = output_dir_path / Path('fused')
        fused_dir_path.mkdir(parents=True, exist_ok=True)

        polyDT_dir_path = output_dir_path / Path('polyDT')
        readout_dir_path = output_dir_path / Path('readouts')

        tile_ids = sorted([entry.name for entry in polyDT_dir_path.iterdir() if entry.is_dir()],
                            key=lambda x: int(x.split('tile')[1].split('.zarr')[0]))

        msims = []
        for tile_idx, tile_id in enumerate(tqdm(tile_ids,desc='tile')):

            polyDT_current_path = polyDT_dir_path / Path(tile_id) / Path("round000.zarr")
            polyDT_current_tile = zarr.open(polyDT_current_path,mode='r')

            voxel_zyx_um = np.asarray(polyDT_current_tile.attrs['voxel_zyx_um'],
                                                dtype=np.float32)

            scale = {'z': voxel_zyx_um[0],
                    'y': voxel_zyx_um[1],
                    'x': voxel_zyx_um[2]}

            tile_position_zyx_um = np.asarray(polyDT_current_tile.attrs['stage_zyx_um'],
                                                dtype=np.float32)

            tile_grid_positions = {
                'z': tile_position_zyx_um[0],
                'y': tile_position_zyx_um[1],
                'x': tile_position_zyx_um[2],
            }

            with dask.config.set(**{'array.slicing.split_large_chunks': False}):
                im_data = []
                im_data.append(da.from_array(polyDT_current_tile['registered_decon_data']))

                readout_current_tile_path = readout_dir_path / Path(tile_id)
                bit_ids = sorted([entry.name for entry in readout_current_tile_path.iterdir() if entry.is_dir()],
                                        key=lambda x: int(x.split('bit')[1].split('.zarr')[0]))

                for bit_id in bit_ids:
                    bit_current_tile = readout_current_tile_path / Path(bit_id)
                    bit_current_tile = zarr.open(bit_current_tile,mode='r')
                    temp = da.where(da.from_array(bit_current_tile["registered_ufish_data"], meta=np.array([], dtype=np.float32)) > 0.01, 
                                    da.from_array(bit_current_tile["registered_decon_data"], meta=np.array([], dtype=np.float32)), 
                                    0.)
                    im_data.append(temp)
                    del temp
                    gc.collect()

                im_data = da.stack(im_data)

            if fuse_readouts:
                with dask.config.set(**{'array.slicing.split_large_chunks': False}):
                    sim = si_utils.get_sim_from_array(im_data,
                                                    dims=('c', 'z', 'y', 'x'),
                                                    scale=scale,
                                                    translation=tile_grid_positions,
                                                    transform_key='stage_metadata')            
            else:
                im_data = im_data[0,:]
                sim = si_utils.get_sim_from_array(da.expand_dims(im_data,axis=0),
                                                dims=('c','z', 'y', 'x'),
                                                scale=scale,
                                                translation=tile_grid_positions,
                                                transform_key='stage_metadata')

            msim = msi_utils.get_msim_from_sim(sim,scale_factors=[])
            msims.append(msim)
            del im_data
            gc.collect()

        debug_tile_positions = True
        if debug_tile_positions:

            fig, ax = vis_utils.plot_positions(
                msims,
                use_positional_colors=True, # set to False for faster execution in case of more than 20 tiles/views
                transform_key='stage_metadata'
                )

            plt.show()

And here is the error I got:

    623 debug_tile_positions = True
    624 if debug_tile_positions:
--> 626     fig, ax = vis_utils.plot_positions(
    627         msims,
    628         use_positional_colors=True, # set to False for faster execution in case of more than 20 tiles/views
    629         transform_key='stage_metadata'
    630         )
    632     plt.show()
    634 with dask.config.set(**{'array.slicing.split_large_chunks': False}):

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/vis_utils.py:70, in plot_positions(msims, transform_key, edges, edge_color_vals, edge_cmap, edge_clims, edge_label, use_positional_colors, n_colors, nscoord, display_view_indices)
     68 if use_positional_colors:
     69     pos_colors = ["red", "green", "blue", "yellow"]
---> 70     greedy_colors = mv_graph.get_greedy_colors(
     71         sims,
     72         n_colors=n_colors,
     73         transform_key=transform_key,
     74     )
     75     pos_colors = [
     76         pos_colors[greedy_colors[iview] % len(pos_colors)]
     77         for iview in range(len(msims))
     78     ]
     80 else:

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py:514, in get_greedy_colors(sims, n_colors, transform_key)
    509 def get_greedy_colors(sims, n_colors=2, transform_key=None):
    510     """
    511     Get colors (indices) from view adjacency graph analysis
    512     """
--> 514     view_adj_graph = build_view_adjacency_graph_from_msims(
    515         [msi_utils.get_msim_from_sim(sim, scale_factors=[]) for sim in sims],
    516         expand=True,
    517         transform_key=transform_key,
    518     )
    520     # thresholds = threshold_multiotsu(overlaps)
    521 
    522     # strategy: remove edges with overlap values of increasing thresholds until
   (...)
    525     # modify overlap values
    526     # strategy: add a small amount to edge overlap depending on how many edges the nodes it connects have (betweenness?)
    528     view_adj_graph_pruned, greedy_colors = prune_graph_to_alternating_colors(
    529         view_adj_graph, n_colors=n_colors
    530     )

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py:142, in build_view_adjacency_graph_from_msims(msims, transform_key, expand, pairs)
    136     overlap_results.append(overlap_result)
    138 # multithreading doesn't improve performance here, probably because the GIL is
    139 # not released by pure python Geometry3D code. Using multiprocessing instead.
    140 # Probably need to confirm here that local dask scheduler doesn't conflict
    141 # with dask distributed scheduler
--> 142 overlap_results = compute(overlap_results, scheduler="processes")[0]
    144 for pair, overlap_result in zip(pairs, overlap_results):
    145     overlap_area = overlap_result[0]

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/dask/base.py:661, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    658     postcomputes.append(x.__dask_postcompute__())
    660 with shorten_traceback():
--> 661     results = schedule(dsk, keys, **kwargs)
    663 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py:167, in get_overlap_between_pair_of_stack_props()
    158 """
    159 - if there is no overlap, return overlap area of -1
    160 - if there's a one pixel wide overlap, overlap_area is 0
    161 - assumes spacing is the same for sim1 and sim2
    162 """
    164 ndim = len(stack_props_1["origin"])
    166 intersection_poly_structure = (
--> 167     get_intersection_poly_from_pair_of_stack_props(
    168         stack_props_1, stack_props_2
    169     )
    170 )
    172 spacing = np.array(
    173     [stack_props_1["spacing"][dim] for dim in ["z", "y", "x"][-ndim:]]
    174 )
    175 small_length = np.min(spacing) / 10.0

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/multiview_stitcher/mv_graph.py:408, in get_intersection_poly_from_pair_of_stack_props()
    406     return cphs[0]
    407 else:
--> 408     return cphs[0].intersection(cphs[1])

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/geometry/body.py:14, in intersection()
     12 """return the intersection between self and other"""
     13 from ..calc.intersection import intersection
---> 14 return intersection(self, other)

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/calc/intersection.py:138, in intersection()
    136 # convex polyhedron
    137 elif isinstance(a,ConvexPolyhedron) and isinstance(b,ConvexPolyhedron):
--> 138     return inter_convexpolyhedron_convexpolyhedron(a,b)
    139 elif isinstance(a, ConvexPolyhedron) and isinstance(b, HalfLine):
    140     return inter_convexpolyhedron_halfline(a,b)

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/calc/intersection.py:819, in inter_convexpolyhedron_convexpolyhedron()
    817 if len(cpg_set) > 1:
    818     get_main_logger().debug('cpg_set:{},length:{}'.format(cpg_set,len(cpg_set)))
--> 819     return ConvexPolyhedron(tuple(cpg_set))
    820 elif len(cpg_set) == 1:
    821     return list(cpg_set)[0]

File ~/.conda/envs/wf-merfish/lib/python3.10/site-packages/Geometry3D/geometry/polyhedron.py:208, in __init__()
    206 if not self._euler_check():
    207     get_main_logger().critical('V:{} E:{} F:{}'.format(len(self.point_set),len(self.segment_set),len(self.convex_polygons)))
--> 208     raise ValueError('Check for the number of vertices, faces and edges fails, the polyhedron may not be closed')

ValueError: Check for the number of vertices, faces and edges fails, the polyhedron may not be closed
dpshepherd commented 4 months ago

I would use a clean environment and make a new script to test just the stitching without our code. The plotting output needs to be before the registration, not at the end (see the notebook I linked).

Are you sure the overlap is 20%? Emma took some datasets at 10% and those probably won't register.

mabbasi6 commented 4 months ago

I used it is right before the registration.register step so it should be fine, no? I'm going to make a clean environment and try it again. Emma said she only did one experiment with 10% overlap but the rest are 20%.

dpshepherd commented 4 months ago

Ah, I'm not sure then. That error should not pop up with just the plotting. So when I saw it, that indicated the register call was executed.

Have you output the stage positions to make sure that they match the .csv files for each tile?

mabbasi6 commented 4 months ago

I guess there seems to be a problem with 4 tiles. I ran this code:

for tile_idx, tile_id in enumerate(tile_ids):
    polyDT_current_path = polyDT_output_dir_path / Path(tile_id) / Path("round000.zarr")
    polyDT_current_tile = zarr.open(polyDT_current_path,mode='r')
    tile_position_zyx_um = np.asarray(polyDT_current_tile.attrs['stage_zyx_um'],
                                        dtype=np.float32)
    print(tile_id)
    print(tile_position_zyx_um)

and here is the output:

tile0000
[ 1.00000e-01 -6.74400e+02  4.13846e+03]
tile0001
[ 4.00000e-01 -5.46720e+02  4.14806e+03]
tile0002
[ 5.00000e-01 -4.19320e+02  4.15775e+03]
tile0003
[ 4.00000e-01 -2.92160e+02  4.16736e+03]
tile0004
[4225.9  -165.55 4176.86]
tile0005
[4226.1   -37.97 4186.48]
tile0006
[4226.    -50.57 4377.76]
tile0007
[4226.   -177.63 4368.06]
tile0008
[4226.   -305.03 4358.38]
tile0009
[4226.   -431.82 4349.03]
tile0010
[4226.   -559.06 4339.26]
tile0011
[4226.   -687.   4329.48]
tile0012
[4226.   -698.97 4520.86]
tile0013
[4226.   -571.47 4530.46]
tile0014
[4226.   -444.25 4540.06]
tile0015
[4226.   -317.2  4550.01]
tile0016
[4226.   -190.05 4559.36]
tile0017
[4226.    -62.37 4569.4 ]
tile0018
[4226.    -75.07 4760.43]
tile0019
[4226.   -202.39 4751.08]
tile0020
[4226.   -329.61 4741.13]
tile0021
[4226.   -457.2  4731.35]
tile0022
[4226.   -584.   4721.83]
tile0023
[4226.   -711.48 4712.06]
tile0024
[4225.9  -723.91 4903.43]
tile0025
[4226.   -596.6  4912.95]
tile0026
[4225.9  -469.1  4922.63]
tile0027
[4226.   -341.86 4932.51]
tile0028
[4225.9  -214.55 4942.28]
tile0029
[4225.9   -88.02 4951.71]
tile0030
[4226.   -100.17 5142.91]
tile0031
[4225.9  -226.96 5133.75]
tile0032
[4225.9  -354.2  5123.61]
tile0033
[4225.9  -481.96 5114.28]
tile0034
[4226.   -608.92 5104.66]
tile0035
[4226.   -736.6  5094.98]
tile0036
[4225.9  -749.37 5285.73]
tile0037
[4225.9  -621.43 5295.78]
tile0038
[4225.9  -494.47 5305.56]
tile0039
[4225.9  -367.23 5314.81]
tile0040
[4225.9  -239.66 5324.51]
tile0041
[4225.9  -112.08 5334.28]

it seems like the stage_z is off for the first 4 tiles, but they match the csv files (attached as zip file). stage positions.zip

dpshepherd commented 4 months ago

Ok, so something went wrong with the experiment - not the post processing code.

What does the raw data look like for those tiles compared to ones with an actual Z value?

mabbasi6 commented 4 months ago

We are not seeing anything on these 4 tiles, and Emma thinks it was a problem with resetting the Z position from the grid from the Micromagellan. She thinks she missed those 4. I re-ran the code to visualize the pre-registered views and set use_positional_colors=False and I'm not getting the error anymore for it and looks like this:

Screenshot 2024-05-08 at 2 23 06 PM

Maybe we could discard tiles 0 to 5 and go with the rest?

dpshepherd commented 4 months ago

Yes. I've never tried that, so there may be errors we haven't explored

dpshepherd commented 3 months ago

I'm closing this for now, since we determined it is a data issue.