fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License
14 stars 6 forks source link

Update tasks to Fractal 2.0 #671

Closed jluethi closed 5 months ago

jluethi commented 6 months ago

Closes #669 Closes #682 Closes #324 Closes #415 Closes #299 Closes #523 Closes #535 Closes #686 Closes #674 Closes #691

Discussion

To what degree are we giving deprecation warnings or just changing some of the fractal-tasks-core functions? In general, public functions should stay stable. But we probably have some functions that shouldn't be public functions. For example, utils.get_table_path_dict: I'm now updating it to follow the new behavior. But also: zarr_url vs. zarrurl (e.g. in pyramid building) and other functions. I would expect this to become the 1.0 release of fractal-tasks-core. We have some other libraries depending on it, but not too many yet. Going forward, we need to take care not to break any public functions. This may be a good time to make some of those functions private now though.

To review

General recommendation: Create a dict, optionally validate it with a model

Handling on the task side: Avoid putting in None when (exclude_Nones, exclude_unset) => How do we handle the complexities in InitArgs? In general, try to keep InitArgs simple

In general, the less hidden behaviors the better! We need to document any hidden behaviors or ways of doing the same thing in 2 ways.

Cleanup

Checklist before merging

github-actions[bot] commented 6 months ago

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  fractal_tasks_core
  channels.py
  utils.py
  fractal_tasks_core/ngff
  specs.py
  zarr_utils.py 98-104, 108-113
  fractal_tasks_core/tables
  v1.py 290, 317-324
  fractal_tasks_core/tasks
  _registration_utils.py 62, 209
  _utils.py
  apply_registration_to_image.py 163
  calculate_registration_image_based.py
  cellpose_segmentation.py 369
  cellvoyager_to_ome_zarr_compute.py 208
  cellvoyager_to_ome_zarr_init.py
  cellvoyager_to_ome_zarr_init_multiplex.py 139
  copy_ome_zarr_hcs_plate.py 231, 295-297
  find_registration_consensus.py 106, 118, 166-168
  illumination_correction.py 146, 273
  image_based_registration_hcs_init.py 92-94
  import_ome_zarr.py 212
  init_group_by_well_for_multiplexing.py 61, 86-88
  io_models.py
  maximum_intensity_projection.py 149-162
  napari_workflows_wrapper.py
Project Total  

This report was generated by python-coverage-comment-action

jluethi commented 6 months ago

An interesting observation: Using the new zarr_url scheme, we can drop Path as an import from many of these tasks => promising for the direction of having cloud-compatible tasks! :)

jluethi commented 6 months ago

Should we rename the type "3D" to "is_3D" or something similar? To avoid having to switch to different dictionary approach

jluethi commented 6 months ago

I rewrote the Copy OME-Zarr task from scratch and changed the work distribution between copy & MIP parts => now more happens in the compute (MIP part). The copy only sets up the plate & well with the correct metadata. But in the new version, it's now explicitly specific for HCS plates, validates HCS metadata and writes the plate just with the wells that were passed with the zarr_urls (=> should open up for making this copy task an easy route for subset testing).

Having to handle the metadata of the OME-Zarrs lead to me introducing a few new Plate spec pydantic classes & copying over some of my helper functions from the navigator project (see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/678). Will be good to review which parts of those we want in Fractal-tasks-core.

One thing I'd like to review is whether this is the intended & simplest image_list_update we can provide:

    image_list_update_dict = dict(
        image_list_updates=[
            dict(
                zarr_url=zarr_url,
                origin=init_args.origin_url,
                types=dict(
                    is_3D=False,
                ),
            )
        ]
    )

Having to provide a dict of a list of a dict seems kind of cumbersome. I understand why it needs to be a list of dicts, as some tasks will provide updates to multiple images. Having this list in a dict with the only key image_list_updates is there so we know whether a task provided an image_list_updates or a parallelization_list I assume?

jluethi commented 6 months ago

@tcompa I need to build the new manifest tomorrow. Besides that, I think the tasks-core package would be mostly ready for examples 01 & 02.

Importing & multiplexing registration are still TBD, as well as a long list of tests & points above of course :)

jluethi commented 6 months ago

@tcompa I now added the draft task list. Maybe I'll find some time this afternoon to work on import & registration some more. Otherwise, this should be ready for testing manifest building & then also for running example 01 as far as I can say. (I'm sure we'll discover some bugs in the process, but the tasks follow the new API and the tests are updated for them at least)

jluethi commented 5 months ago

Closes #674

jluethi commented 5 months ago

I made the Find Registration Consensus a compound task now to provide an example of parallelizing over a well, which may be needed in some of Adrian's tasks quite often

jluethi commented 5 months ago

@tcompa I have an initial version of all the tasks ready now. The MIP task uses the old approach to get attributes again. The tests are passing, even one that runs cellpose after MIP, but I'll be curious to see if example 01 runs. I'll try to test that tomorrow morning :)

There are a few remaining parts on activating testing functionality, could you have a look at those?

Other than that, the updates for manifest creation to make it slightly more flexible can either be part of this PR or a future PR. And I've create new issues for all the future goals in #669 .

tcompa commented 5 months ago

There are a few remaining parts on activating testing functionality, could you have a look at those?

Update & reenable manifest validation in Github CI Update & reenable test_valid_args_schemas Update test_task_interface in test_valid_task_interface once new Manifest is built

This are now covered, as of #689.

tcompa commented 5 months ago

I'll be curious to see if example 01 runs. I'll try to test that tomorrow morning

The cellpose task now runs: image

Note that this test is currently a bit cumbersome, until we merge into main and make a pre-release.

One way is to use fractal-containers.

Another way is to run poetry build from the root directory of the fractal-tasks-core repo (within the V2_tasks branch), so that a wheel file is generated in the dist folder (and can be used for local-wheel task collection in fractal-web/server).

tcompa commented 5 months ago

Also napari-workflows runs, in fractal-demos (I did not check the output): image

jluethi commented 5 months ago

Awesome!

Another way is to run poetry build from the root directory of the fractal-tasks-core repo (within the V2_tasks branch), so that a wheel file is generated in the dist folder (and can be used for local-wheel task collection in fractal-web/server).

Yes, also just setting up tests locally with server 2.0.0a4 and building the task package locally with python -m build :)

jluethi commented 5 months ago

I can confirm that I can also run example V1! =D

From my side, ready to merge after your review then. And I'll start exploring other examples and will open new issues where they occur :)

tcompa commented 5 months ago

Note: I pushed 34faa92e35f717a4a0060a2192ed31b201676125, with these changes:

Improve plate-related ngff models:

* Add `AcquisitionInPlate.description`;
* Make `AcquisitionInPlate.id` required;
* Add `ColumnInPlate`;
* Add `RowInPlate`.
tcompa commented 5 months ago

While reviewing the copy_ome_zarr_hcs_plate task, some parts (like the metadata generation) were a bit hard to understand. As a minor fix, I started using the load helper functions (including a new load_NgffPlateMeta one) and added a few comments - see this commit .

In the future this task would benefit from yet another review.

jluethi commented 5 months ago

Thanks a lot for the review & the fixes @tcompa !