Make masked-loading helper functions more general

tcompa commented 1 year ago

The first required change will be to make the input shapes more flexible, since they currently (as part of #306) only work with the expected shapes of cellpose inputs (i.e. 4D arrays, at the moment).

tcompa commented 1 year ago

From https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/306#discussion_r1133649915:

Let's try to avoid dimensionality assumptions within our processing function, especially when we want to make them more general. I think this one should hold for all cellpose processing of labels, but e.g. if we applied it to intensity images, those can already be 4D when they are multi-channel

jluethi commented 1 year ago

Following up on the PR discussions: If the masked loading helper function turns out to be too complex for what we're trying to achieve, I'm open to return back to having tasks call the pre and postprocessing separately. It would be nice to abstract this information from the tasks and I have been pushing for this. But if the complexity turns out not to be worth it, we can discuss reverting this and going back to the more explicit pre/post approach

tcompa commented 1 year ago

On this last higher-level comment: For our current single use case (the cellpose task), there's no real reason for the wrapper; but we should wait and review this point as soon as we explore at least a second use case. If the wrapper (or a refactored version) applies to at least two cases in some logical way, it's definitely useful to have it.

tcompa commented 1 year ago

Something else to keep in mind:

The current wrapper looks like

def masked_loading_wrapper(
    *,
    function: Callable,
    image_array: np.ndarray,
    kwargs: Optional[dict] = None,
    use_masks: bool,
    preprocessing_kwargs: Optional[dict] = None,
):
    """
    Wrap a function with some pre/post-processing functions

    :param function: The callable function to be wrapped.
    :param image_array: The image array to be preprocessed and then used as
                        positional argument for ``function``.
    :param kwargs: Keyword arguments for ``function``.
    :param use_masks: If ``False``, the wrapper only calls ``function(*args,
                      **kwargs)``.
    :param preprocessing_kwargs: Keyword arguments for the preprocessing
                                 function (see call signature of
                                 ``_preprocess_input()``).
    """
    # Optional preprocessing
    if use_masks:
        preprocessing_kwargs = preprocessing_kwargs or {}
        (
            image_array,
            background_3D,
            current_label_region,
        ) = _preprocess_input(image_array, **preprocessing_kwargs)
    # Run function
    kwargs = kwargs or {}
    new_label_img = function(image_array, **kwargs)
    # Optional postprocessing
    if use_masks:
        new_label_img = _postprocess_output(
            modified_array=new_label_img,
            original_array=current_label_region,
            background=background_3D,
        )
    return new_label_img

To make it clear, this requires that function takes a single positional argument (most likely an array) and then only keyword arguments. This makes it easy for the wrapper to also work fine when use_masks=False (where we just call function(image_array, **kwargs), without any pre/post-processing), but it's not the most general way to design a wrapper.. what if the wrapped functions takes 3 positional arguments?

Let's review this choice as soon as we have a second use case (after the cellpose task). If it turns out that it is too strict, we can update it. A simple option would be to defer the handling of the use_masks=False case to the task itself. Then the task code (in the current cellpose use case) would look something like

        if use_masks:
            new_label_img = masked_loading_wrapper(
            image_array=img_np,
            function=segment_ROI,
            kwargs=kwargs_segment_ROI,
            use_masks=use_masks,
            preprocessing_kwargs=preprocessing_kwargs,
        )
        else:
            new_label_img = segment_ROI(img_np, **kwargs_segment_ROI)

and it would be simpler to make it work when function takes an arbitrary number of positional/keyword arguments.

jluethi commented 1 year ago

Good ideas! Let's look at it when we tackle adding support for doing this with napari workflow wrapper. Or when we add another task with somewhat simpler handling, because napari workflows is quite complex.

fractal-analytics-platform / fractal-tasks-core

Make masked-loading helper functions more general #340