fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License

Disable or improve multiple resources in input_paths functionality? #535

Closed jluethi closed 4 months ago

jluethi commented 10 months ago

Currently, the Create OME-Zarr task can run on datasets with multiple resources (i.e. multiple entries in input_paths). It creates an OME-Zarr folder for each.

This is not well tested in downstream processing, though. For example, here is a user-reported bug from running the Convert Yokogawa to OME-Zarr task with 3 resources:

```
[2:14 PM] TASK ERROR:Task id: 370 (Convert Yokogawa to OME-Zarr), e.workflow_task_order=1
TRACEBACK:
Traceback (most recent call last):
  File "/path/to/fractal-server-1.3.5/lib64/python3.9/site-packages/fractal_server/app/runner/_common.py", line 391, in call_single_parallel_task
    raise e
  File "/path/to/fractal-server-1.3.5/lib64/python3.9/site-packages/fractal_server/app/runner/_common.py", line 384, in call_single_parallel_task
    _call_command_wrapper(
  File "/path/to/fractal-server-1.3.5/lib64/python3.9/site-packages/fractal_server/app/runner/_common.py", line 183, in _call_command_wrapper
    raise TaskExecutionError(err)
fractal_server.app.runner.common.TaskExecutionError: 2023-09-22 13:23:53,776; INFO; START yokogawa_to_ome_zarr task
2023-09-22 13:23:53,976; INFO; [glob_with_multiple_patterns] patterns=['*_B02_*.tif']
2023-09-22 13:23:53,978; INFO; [glob_with_multiple_patterns] Found 0 items
Traceback (most recent call last):
  File "/path/to/.fractal/fractal-tasks-core0.11.0/venv/lib/python3.9/site-packages/fractal_tasks_core/tasks/yokogawa_to_ome_zarr.py", line 260, in <module>
    run_fractal_task(
  File "/path/to/.fractal/fractal-tasks-core0.11.0/venv/lib64/python3.9/site-packages/fractal_tasks_core/tasks/_utils.py", line 79, in run_fractal_task
    metadata_update = task_function(**pars)
  File "pydantic/decorator.py", line 40, in pydantic.decorator.validate_arguments.validate.wrapper_function
    from contextlib import _GeneratorContextManager
  File "pydantic/decorator.py", line 134, in pydantic.decorator.ValidatedFunction.call
  File "pydantic/decorator.py", line 206, in pydantic.decorator.ValidatedFunction.execute
  File "/path/to/.fractal/fractal-tasks-core0.11.0/venv/lib/python3.9/site-packages/fractal_tasks_core/tasks/yokogawa_to_ome_zarr.py", line 173, in yokogawa_to_ome_zarr
    sample = imread(tmp_images.pop())
KeyError: 'pop from an empty set'
```

This could have been a user error with one of the paths, but I would not be surprised if the setup doesn't work well and there are, e.g., assumptions in there about the metadata being consistent.
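For context, `KeyError: 'pop from an empty set'` is what Python raises when `set.pop()` is called on an empty set, so the failure does not say which folder or pattern was responsible. A minimal sketch of a more explicit guard follows; the helper `pick_sample_image` and its arguments are hypothetical and not part of fractal-tasks-core:

```python
def pick_sample_image(image_paths: set, patterns: list, folder: str) -> str:
    """Return one image path, or fail with a message naming folder and patterns.

    Hypothetical helper for illustration only; not part of fractal-tasks-core.
    """
    if not image_paths:
        # Report what was searched for, instead of a bare KeyError from set.pop()
        raise FileNotFoundError(
            f"No images matching patterns={patterns} found in {folder!r}"
        )
    # sorted() makes the choice deterministic, unlike set.pop()
    return sorted(image_paths)[0]
```

With a check like this, an empty match would surface as an error that names the image folder, which should make it easier to tell a wrong path apart from a real multi-resource bug.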

We should test whether this is working reasonably for simple cases (e.g. run the multiplexing dataset of example 03, but with the normal create-ome-zarr task). Depending on the outcome, we can consider using the multiple plates as an additional potential filter option in our refactor of the parallelization levels / components approach. Alternatively, we could disable multi-resource inputs.

tcompa commented 10 months ago

The relevant log here seems to be

2023-09-22 13:23:53,976; INFO; [glob_with_multiple_patterns] patterns=['*_B02_*.tif']
2023-09-22 13:23:53,978; INFO; [glob_with_multiple_patterns] Found 0 items

Maybe the B02 glob pattern is only valid for one of the three image folders, and not for the others?
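One way to check this hypothesis outside of Fractal would be to count the matches for the pattern in each image folder. This is only a sketch: the function names are made up for illustration, and it uses plain `pathlib` globbing rather than the task's own `glob_with_multiple_patterns`.

```python
from pathlib import Path


def count_matches_per_folder(folders: list, pattern: str) -> dict:
    """Map each folder to the number of files matching the glob pattern."""
    return {
        str(folder): len(list(Path(folder).glob(pattern)))
        for folder in folders
    }


def folders_without_matches(folders: list, pattern: str) -> list:
    """Return the folders in which the pattern matches no file at all."""
    counts = count_matches_per_folder(folders, pattern)
    return [folder for folder, n_matches in counts.items() if n_matches == 0]
```

Calling `folders_without_matches(image_folders, "*_B02_*.tif")` on the three image folders (here `image_folders` is a placeholder for the actual paths) would list any folder where well B02 has no images.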


Apart from the specific example, there are for sure some homogeneity constraints. A common issue is that channels may be different across folders, which is not supported (I just tested it again, by using two image folders with a different channel each). In general I would expect these issues to appear during the create-ome-zarr task, rather than yokogawa-to-zarr, but I cannot be sure.
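To illustrate what making such a constraint explicit could look like, here is a rough sketch of a channel-homogeneity check. It assumes that the channel identifiers of each image folder have already been collected into a dictionary; the function name and the data layout are hypothetical and do not reflect how fractal-tasks-core stores this information.

```python
def find_channel_mismatches(channels_per_folder: dict) -> dict:
    """Compare each folder's channels against those of the first folder.

    Returns a dict mapping every mismatching folder to the channels it is
    missing and the channels it has in addition to the reference folder.
    Hypothetical helper for illustration only.
    """
    folders = list(channels_per_folder)
    if not folders:
        return {}
    # The first folder acts as the reference set of channels
    reference = set(channels_per_folder[folders[0]])
    mismatches = {}
    for folder in folders[1:]:
        current = set(channels_per_folder[folder])
        if current != reference:
            mismatches[folder] = {
                "missing": sorted(reference - current),
                "unexpected": sorted(current - reference),
            }
    return mismatches
```

An empty result would mean that all folders share the same channels. A non-empty one could be turned into an early, explicit error in the create-ome-zarr step instead of a failure further downstream.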

I will try to review these constraints and make them explicit, so that we can take a decision on how to proceed.

Let's keep in mind that as part of #457 we may be moving towards not supporting multiple plates (the issue is not yet well-defined, but I think the main blocker was related to copy-ome-zarr, and not to the image parsing). That's of course for a different case (the case where images for different plates are all in the same folder, rather than in multiple image folders), but somewhat related to this one.