fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License

Review NGFF models & Fractal OME-Zarr models #681

Open tcompa opened 2 months ago

tcompa commented 2 months ago

Within #671, it appears that the MIP-images zattrs do not include wavelength_id in the omero metadata.

Here is how this leads to a cellpose-task error:

TASK ERROR: Task name: Cellpose Segmentation, position in Workflow: 2
TRACEBACK:
2024-04-09 10:02:11,120; INFO; START cellpose_segmentation task
2024-04-09 10:02:11,120; INFO; zarr_url='/somewhere/Fractal/fractal-demos/examples/01_cardio_tiny_dataset/output_cardiac-tiny-c/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/0'
2024-04-09 10:02:11,133; INFO; NGFF image has num_levels=5
2024-04-09 10:02:11,133; INFO; NGFF image has coarsening_xy=2
2024-04-09 10:02:11,133; INFO; NGFF image has full-res pixel sizes [1.0, 0.1625, 0.1625]
2024-04-09 10:02:11,133; INFO; NGFF image has level-2 pixel sizes [1.0, 0.65, 0.65]
2024-04-09 10:02:11,133; CRITICAL; {'color': '00FFFF', 'label': 'DAPI', 'window': {'end': 800.0, 'max': 65535.0, 'min': 0.0, 'start': 110.0}}
Traceback (most recent call last):
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/tasks/cellpose_segmentation.py", line 719, in <module>
    run_fractal_task(
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/tasks/_utils.py", line 79, in run_fractal_task
    metadata_update = task_function(**pars)
  File "pydantic/decorator.py", line 40, in pydantic.decorator.validate_arguments.validate.wrapper_function
  File "pydantic/decorator.py", line 134, in pydantic.decorator.ValidatedFunction.call
  File "pydantic/decorator.py", line 206, in pydantic.decorator.ValidatedFunction.execute
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/tasks/cellpose_segmentation.py", line 342, in cellpose_segmentation
    tmp_channel: OmeroChannel = get_channel_from_image_zarr(
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/channels.py", line 221, in get_channel_from_image_zarr
    omero_channels = get_omero_channel_list(image_zarr_path=image_zarr_path)
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/channels.py", line 243, in get_omero_channel_list
    channels = [OmeroChannel(**c) for c in channels_dicts]
  File "/somewhere/Fractal/fractal-server/tests/data/example_server_startup/Tasks/.fractal/fractal-tasks-core1.0.0a0/venv/lib/python3.10/site-packages/fractal_tasks_core/channels.py", line 243, in <listcomp>
    channels = [OmeroChannel(**c) for c in channels_dicts]
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for OmeroChannel
wavelength_id
  field required (type=value_error.missing)
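
The failure can be reproduced in isolation with a minimal sketch. The `OmeroChannel` below is a simplified stand-in (an assumption, not the actual model from `fractal_tasks_core/channels.py`) that keeps only the required `wavelength_id` and two optional fields; the input dict is the channel from the CRITICAL log line above, with `window` omitted for brevity.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class OmeroChannel(BaseModel):
    # Simplified stand-in for the Fractal channel model (assumption):
    # wavelength_id is required, the other fields are optional.
    wavelength_id: str
    label: Optional[str] = None
    color: Optional[str] = None


# Channel metadata as found in the MIP image zattrs (no wavelength_id).
channel_from_zattrs = {"color": "00FFFF", "label": "DAPI"}

try:
    OmeroChannel(**channel_from_zattrs)
    error = None
except ValidationError as exc:
    error = exc

# Collect the names of the fields that failed validation.
missing_fields = [e["loc"][0] for e in error.errors()] if error else []
print(missing_fields)
```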

tcompa commented 2 months ago

I think the reason is that our NGFF models include:

class Channel(BaseModel):
    """
    Model for an element of `Omero.channels`.

    See https://ngff.openmicroscopy.org/0.4/#omero-md.
    """

    window: Optional[Window] = None
    label: Optional[str] = None
    family: Optional[str] = None
    color: str
    active: Optional[bool] = None

class Omero(BaseModel):
    """
    Model for `NgffImageMeta.omero`.

    See https://ngff.openmicroscopy.org/0.4/#omero-md.
    """

    channels: list[Channel]

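where wavelength_id is not present. Since pydantic ignores unknown fields by default, wavelength_id is silently dropped when the zattrs are parsed into these models.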
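
A minimal sketch of how the field gets lost in a parse-and-dump round trip. The `Channel` model here is a trimmed copy of the one quoted above, the `dump` helper is hypothetical (it only papers over the pydantic v1/v2 API difference), and the `wavelength_id` value `A01_C01` is an illustrative assumption.

```python
from typing import Optional

from pydantic import BaseModel


class Channel(BaseModel):
    # Trimmed copy of the NGFF channel model quoted above: no wavelength_id.
    label: Optional[str] = None
    color: str


def dump(model: BaseModel) -> dict:
    # Hypothetical helper: pydantic v2 exposes model_dump(), v1 exposes dict().
    if hasattr(model, "model_dump"):
        return model.model_dump(exclude_none=True)
    return model.dict(exclude_none=True)


# Channel metadata as Fractal would write it (wavelength_id value is made up).
raw_channel = {"label": "DAPI", "color": "00FFFF", "wavelength_id": "A01_C01"}

# Unknown fields are ignored on parsing, so they are absent from the dump.
round_tripped = dump(Channel(**raw_channel))
print(round_tripped)
```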

This was not an issue in previous versions because we copied the raw attrs of the old image, without going through the models:

# Replicate image attrs
old_image_group = zarr.open_group(
    f"{zarrurl_old}/{well_path}/{image_path}", mode="r"
)
new_image_group = zarr.group(
    f"{zarrurl_new}/{well_path}/{image_path}"
)
new_image_group.attrs.put(old_image_group.attrs.asdict())

while we now have the following, which writes only the fields defined in the NGFF models:

new_image_group.attrs.put(ngff_image.dict(exclude_none=True))

tcompa commented 2 months ago

Ref https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/540

jluethi commented 2 months ago

Let's review whether we can make our models more consistent here (e.g. having FractalChannels inherit from Channels, etc.)
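
One way this inheritance idea could look, as a rough sketch rather than a proposal for the actual models. `FractalChannel` is a hypothetical name, `Channel` is a trimmed copy of the NGFF model quoted earlier in the thread, and making `wavelength_id` required is an assumption based on the validation error above.

```python
from typing import Optional

from pydantic import BaseModel


class Channel(BaseModel):
    # Trimmed copy of the generic NGFF channel model.
    label: Optional[str] = None
    color: str


class FractalChannel(Channel):
    # Hypothetical Fractal-specific subclass: adds the extra field on top
    # of everything inherited from the generic model.
    wavelength_id: str


def dump(model: BaseModel) -> dict:
    # Hypothetical helper: pydantic v2 exposes model_dump(), v1 exposes dict().
    if hasattr(model, "model_dump"):
        return model.model_dump(exclude_none=True)
    return model.dict(exclude_none=True)


# The wavelength_id value is made up for illustration.
raw_channel = {"label": "DAPI", "color": "00FFFF", "wavelength_id": "A01_C01"}

generic = dump(Channel(**raw_channel))         # drops wavelength_id
fractal = dump(FractalChannel(**raw_channel))  # keeps wavelength_id
print(generic)
print(fractal)
```

With such a subclass, dumping the model back to the zattrs would preserve the Fractal-specific field, while the generic model stays usable for plain NGFF images.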

jluethi commented 2 months ago

For the moment, I went back to simply copying the attrs of the old image, with https://github.com/fractal-analytics-platform/fractal-tasks-core/commit/4aac820bf45f7b56afbe33f872b7e486593edc78

It's slightly less elegant, as we have to load the metadata twice now (once for the NGFF model for things like number of pyramid levels, once for adding it to the new image). But with that, it's hopefully not a blocker for V2 anymore but something we can address carefully when we get a chance.
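
The "load twice" pattern described above can be sketched as follows. This is not the code from the linked commit: a plain dict stands in for the zarr group attrs, and the `Dataset`, `Multiscale` and `ImageMeta` models are hypothetical, heavily reduced stand-ins for the NGFF models.

```python
import copy
from typing import List

from pydantic import BaseModel


class Dataset(BaseModel):
    path: str


class Multiscale(BaseModel):
    datasets: List[Dataset]


class ImageMeta(BaseModel):
    # Reduced stand-in: only the part needed to count pyramid levels.
    multiscales: List[Multiscale]


# Stand-in for old_image_group.attrs.asdict() (values are made up).
raw_attrs = {
    "multiscales": [{"datasets": [{"path": "0"}, {"path": "1"}]}],
    "omero": {
        "channels": [
            {"color": "00FFFF", "label": "DAPI", "wavelength_id": "A01_C01"}
        ]
    },
}

# First use: the validated model, for derived quantities such as num_levels.
validated = ImageMeta(**raw_attrs)
num_levels = len(validated.multiscales[0].datasets)

# Second use: the untouched raw attrs, copied verbatim to the new image,
# so fields unknown to the model (like wavelength_id) survive.
new_attrs = copy.deepcopy(raw_attrs)
print(num_levels, new_attrs["omero"]["channels"][0]["wavelength_id"])
```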

[I still think we should find a good way of using those very useful Pydantic models for this purpose, e.g. having Fractal pydantic models we can use. Just wanted to make the registration testing work ;)]