fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License
14 stars 6 forks source link

Addition of chi2_shift as a Method to calculate Shifts to the "Calculate Registration (image-based)" Task #741

Closed adrtsc closed 5 months ago

adrtsc commented 6 months ago

The chi2_shift function from the image_registration python package (https://pypi.org/project/image-registration/) was used for a long time for image registration on TissueMAPS and performed very well in a task I had in the APx fractal task collection. After discussions with Joel, we decided that we would like to include that option in the "Calculate Registration (image-based)" task in fractal-tasks-core besides the currently available phase_cross_correlation function.

Happy to receive feedback on things that are still missing or should be cleaned up.

Checklist before merging

tcompa commented 5 months ago

Hi there, I included a few changes:

  1. I merged from upstream main
  2. I updated the poetry.lock file
  3. I ran pre-commit on the code (https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/741/commits/8548573a5d38d0b86ab46e9bbf3b9ba371246fe5)
  4. I modified the return type and docstring of chi2_shift_out (which was making the docs build fail).

@adrtsc can you please review https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/741/commits/fba4dbafba528053987803564832c7cf38f46a44? I think the returned object is a list of numpy arrays, but the docstring says "List of tuple of shift in y and x direction.".

jluethi commented 5 months ago

Nice PR @adrtsc , I like the approach you took! I'll leave some comments here while I review this.

First one: Is there are reason against having a default registration method? I'd set phase_cross_correlation as default, as it works in 2D & 3D cases. But open to revert this if there are strong feelings against it :)

jluethi commented 5 months ago

The chi2_shift method fails with a ValueError: too many values to unpack (expected 2) if it's run on a real 3D array. I've now added tests for that and a check that makes it fail earlier and with an explicit ValueError describing the limitation.

jluethi commented 5 months ago

Another thing I've noticed: Pylance type checking isn't happy about the way Literal is used based on a dict in the function definition. See long discussion on the Cellpose issues on this. For Cellpose, we didn't really have a different choice (at least back then, maybe now with Enum support we could also solve it differently?). I'll try switching the REG_METHODS dict to an Enum and using the Enum as task input to make that work.

jluethi commented 5 months ago

I'm mostly happy with this PR now, the functionality is useful to have for sure! The last thing I'll want to review but won't manage tonight anymore is the weird typing issue in chi2_shift_out.

Given your comment @adrtsc :

 """
running into issues when using direct float output for fractal.
When rounding to integer and using integer dtype, it typically works
but for some reasons fails when run over a whole 384 well plate (but
the well where it fails works fine when run alone). The original verison
works fine however. Trying to round to integer, but still use float64
dtype like original version.
"""
shifts = np.array([-int(np.round(y)), -int(np.round(x))], dtype="float64")

It looks like there is some complexity here we don't understand well. I'll want to test the typing we get from the phase_crosscorrelation function and see if we can get that behavior tested. Kind of weird that values need to be rounded to int actually.

@adrtsc Do you still have the error traceback you got when not casting those values to a different type?

adrtsc commented 5 months ago

Thanks for looking into this PR @tcompa and @jluethi ! All very good points so far.

I'm on holidays currently, but will go over it and respond in detail once I'm back!

jluethi commented 5 months ago

@adrtsc Sure, no stress & enjoy your holidays! Sorry also for the delay on my side to look into this PR.

I'll try to find some time still to look into the types mentioned above. Maybe I'll manage that before you return from your holidays ;)

jluethi commented 5 months ago

I've tried to test the typing questions more. I don't understand how the original types of chi2_shift could be the issue here.

See example test here:

from fractal_tasks_core.tasks._registration_utils import chi2_shift_out
def test_chi2_shift_out():
    """
    Test the chi2_shifts function that wraps chi2_shift calculation
    """
    array1 = np.zeros((100, 100), dtype=int)
    array2 = np.zeros((100, 100), dtype=int)
    array1[10:30, 10:30] = 1
    array2[20:40, 20:40] = 1
    from image_registration import chi2_shift
    shifts_orig = chi2_shift(array1, array2)
    shifts_orig_list = [np.array([-x for x in shifts_orig[:2]])]

    shifts = chi2_shift_out(array1, array2)
    from devtools import debug
    debug(shifts_orig_list)
    debug(shifts)
    debug(shifts_orig_list[0][0])
    debug(shifts[0][0])

Leads to output:

tests/tasks/test_unit_registration_helper_functions.py:340 test_chi2_shift_out
    shifts_orig_list: [
        array([ -9.99804688, -10.00195312]),
    ] (list) len=1
tests/tasks/test_unit_registration_helper_functions.py:341 test_chi2_shift_out
    shifts: [
        array([-10., -10.]),
    ] (list) len=1
tests/tasks/test_unit_registration_helper_functions.py:342 test_chi2_shift_out
    shifts_orig_list[0][0]: -9.998046875 (float64)
tests/tasks/test_unit_registration_helper_functions.py:343 test_chi2_shift_out
    shifts[0][0]: -10.0 (float64)

Thus: The output of chi2_shift always seems to be floats, both from the direct function output (that needs to be inverted, but otherwise seems already fine) & from your wrapper @adrtsc

The casting it to int and then back to float obviously rounds to the next integer. I don't see a harm in rounding to the next integer in general. But I don't understand how it would prevent any kind of error tbh...


tl;dr: I don't mind the rounding, but not sure I'd expect it to prevent any errors. Thus, I'd be fine with merging this now after you review @adrtsc . Also fine if we want to dig into anything on the typing here.

jluethi commented 5 months ago

I tested this and it also closes the longstanding #511 through one of my minor changes above

adrtsc commented 5 months ago

Finally back from holidays and found some time to look over it!

Hi there, I included a few changes:

1. I merged from upstream `main`

2. I updated the `poetry.lock` file

3. I ran `pre-commit` on the code ([8548573](https://github.com/fractal-analytics-platform/fractal-tasks-core/commit/8548573a5d38d0b86ab46e9bbf3b9ba371246fe5))

4. I modified the return type and docstring of `chi2_shift_out` (which was making the docs build fail).

@adrtsc can you please review fba4dba? I think the returned object is a list of numpy arrays, but the docstring says "List of tuple of shift in y and x direction.".

Looks good, thanks for the updates and catching the wrong return type and error in the docstring. The docstring now also mentions the correct type.

I've tried to test the typing questions more. I don't understand how the original types of chi2_shift could be the issue here.

See example test here:

from fractal_tasks_core.tasks._registration_utils import chi2_shift_out
def test_chi2_shift_out():
    """
    Test the chi2_shifts function that wraps chi2_shift calculation
    """
    array1 = np.zeros((100, 100), dtype=int)
    array2 = np.zeros((100, 100), dtype=int)
    array1[10:30, 10:30] = 1
    array2[20:40, 20:40] = 1
    from image_registration import chi2_shift
    shifts_orig = chi2_shift(array1, array2)
    shifts_orig_list = [np.array([-x for x in shifts_orig[:2]])]

    shifts = chi2_shift_out(array1, array2)
    from devtools import debug
    debug(shifts_orig_list)
    debug(shifts)
    debug(shifts_orig_list[0][0])
    debug(shifts[0][0])

Leads to output:

tests/tasks/test_unit_registration_helper_functions.py:340 test_chi2_shift_out
    shifts_orig_list: [
        array([ -9.99804688, -10.00195312]),
    ] (list) len=1
tests/tasks/test_unit_registration_helper_functions.py:341 test_chi2_shift_out
    shifts: [
        array([-10., -10.]),
    ] (list) len=1
tests/tasks/test_unit_registration_helper_functions.py:342 test_chi2_shift_out
    shifts_orig_list[0][0]: -9.998046875 (float64)
tests/tasks/test_unit_registration_helper_functions.py:343 test_chi2_shift_out
    shifts[0][0]: -10.0 (float64)

Thus: The output of chi2_shift always seems to be floats, both from the direct function output (that needs to be inverted, but otherwise seems already fine) & from your wrapper @adrtsc

The casting it to int and then back to float obviously rounds to the next integer. I don't see a harm in rounding to the next integer in general. But I don't understand how it would prevent any kind of error tbh...

tl;dr: I don't mind the rounding, but not sure I'd expect it to prevent any errors. Thus, I'd be fine with merging this now after you review @adrtsc . Also fine if we want to dig into anything on the typing here.

I looked into this again a bit. So it appears that, although returning floats, the phase cross correlation function also rounds the values and always returns whole values. This may be the reason there was never an issue with that function. For now, I think it's safest to leave the rounding in there, as this worked fine so far. Unfortunately, I don't have the error message anymore from back when it failed randomly without rounding the values.

I'm fine with all the updates by you guys so from my perspective this is good to go once you've had a final look!

jluethi commented 5 months ago

Great, thanks a lot @adrtsc . I'm merging this to main then and it will become part of the 1.1.0 release of Fractal tasks core :)

Thanks for contributing this chi2_shift method back into core!