`apply_inverse_transfer_function_cli` is not very modular, blocking multiprocessing

talonchandler commented 1 year ago

@edyoshikun's attempt to implement multiprocessing in #392 was more challenging than it needed to be because apply_inverse_transfer_function_cli is not very modular.

I would like to take this ~350 line function and break it into smaller functions with:

one function per reconstruction type: fluorescence, phase, birefringence, birefringence-and-phase
a consistent return shape of CZYX
basic tests for each function
a get_reconstruction_output_shape function (which @edyoshikun has already prototyped in #392)

After a discussion with @edyoshikun, fixing this issue together with a version of #388 will tee up a very simple implementation of multiprocessed reconstructions over timepoints.

@edyoshikun please comment if you see any holes in this plan.

edyoshikun commented 1 year ago

After our discussion last week, I think that this PR is certainly required to have a clean implementation of #392.

The biggest issue right now is the fact that all reconstructions return different czyx shape (i.e birefringence has 3 dim (4,z,y,x), phase and bire 2D return a 2 dim (y,x), and phase and bire 3D return 3 dim (z,y,x)). I think that the waveorder functions are returning the right shapes, but this makes integration with recorder a bit complicated if we want to have a generalized multiprocessing function. Hence, the Frankenstein array indexing and complexities I had to do in #392.

One solution would be to have individual reconstruction functions that pad the arrays. What I am envisioning is something like the one mentioned above.

def apply_inverse_birefringence(czyx, apply_inverse_bire_args):
      reconstruction_birefringence_zyx = (
              inplane_oriented_thick_pol3d.apply_inverse_transfer_function(
                  czyx, **apply_inverse_bire_args
              )
      ) # this is a tuple of tensors
      #convert tuple of tensor to array
      reconstruction_array = np.array([tensor.numpy() for tensor in reconstruction_birefringence_zyx])

      # logic to pad the array if ndim!=4
      #

      return reconstruction_array    #(4,z,y,x)

In the case of the functions that return a 3D arrays and 2D arrays, I think we can implement a pad_array_dimensions() that always returns a 4D array (czyx ).

To write them into the output_dataset we typically slice the output_dataset[t_idx,c_idx] = reconstruction_array. Since we are thinking of an approach in #388 where the t_idx will be a variable in the configuration, this works well too.

These functions are the ones passed through the process_single_position(func, kwargs) and the reconstruct_zyx_and_save(czyx,func,kwargs) in #392, so this would simplify the code a lot. My only worry with this is that the waveorder methods we are wrapping will be a shallow implementation that can lead to some of the headaches we had with previous recOrder...

edyoshikun commented 1 year ago

This function also does the bire and phase in one step and returns the (5,z,y,x) which is convenient, as we can set output_dataset[t_idx,:] = bire_and_phase3D.

def apply_inverse_biref_phase_3D(czyx, birefringence_args, phase3D_args):
    reconstruction_birefringence_zyx = (
        inplane_oriented_thick_pol3d.apply_inverse_transfer_function(
            czyx, **birefringence_args
        )
    )

    brightfield_3d = reconstruction_birefringence_zyx[2]

    reconstruction_phase_zyx = phase_thick_3d.apply_inverse_transfer_function(
        brightfield_3d, **phase3D_args
    )

    reconstruction_bire_phase_3D_zyx = reconstruction_birefringence_zyx + (
        reconstruction_phase_zyx,
    )
    reconstruction_array = np.array([tensor.numpy() for tensor in reconstruction_bire_phase_3D_zyx])

    return reconstruction_array

talonchandler commented 1 year ago

Thanks @edyoshikun. I agree with your comments, and I'll respond to:

My only worry with this is that the waveorder methods we are wrapping will be a shallow implementation that can lead to some of the headaches we had with previous recOrder...

I do not think that recOrder's wrapping of waveorder calls was the source of the main challenges we faced on the previous iteration.

recOrder performs a subset of the reconstruction types that waveorder performs, so some amount of wrapping is necessary. For example, waveorder allows 3D reconstructions with arbitrary axial samples, while recOrder only allows 3D reconstructions with equally spaced axial samples. The wrapping code that turns recOrder-CLI reconstruction calls into waveorder reconstruction calls should continue to live in recOrder.

If we want to add new reconstruction functionality (e.g. the rotations, flips, and phase-contrast inversion), we should do it on the waveorder side first, then connect it on the recOrder side. We're on our way to a place where the recOrder connections will be increasingly automated.

talonchandler commented 1 year ago

@edyoshikun and I discussed this and we want this and multiprocessing in 0.4.0 to support the mantis project.

I will complete this refactor, and @edyoshikun is going to finish a variant of #392 to fully support multiprocessing in recOrder.

mehta-lab / recOrder

`apply_inverse_transfer_function_cli` is not very modular, blocking multiprocessing #396