jluethi closed this 1 year ago
Let's also keep multiplexing scenarios in mind when we refactor things here, i.e. cases where there are multiple images (components) per well. Sometimes we need to process them separately, but in other scenarios we need to process them together in a single task execution.
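To make the two multiplexing modes concrete, here is a minimal sketch. It assumes components are strings of the form `plate.zarr/B/03/0` (plate, well row, well column, image); the helper name `group_components_by_well` is hypothetical and not part of fractal-tasks-core.

```python
from collections import defaultdict


def group_components_by_well(components):
    """Group image components (assumed form 'plate.zarr/B/03/0') by well."""
    by_well = defaultdict(list)
    for component in components:
        # Everything before the last '/' identifies the well.
        well, _image = component.rsplit("/", 1)
        by_well[well].append(component)
    return dict(by_well)


components = [
    "plate.zarr/B/03/0",
    "plate.zarr/B/03/1",
    "plate.zarr/B/05/0",
]

# Mode 1: process images separately, one task execution per component.
per_image_units = [[c] for c in components]

# Mode 2: process all images of a well together in one task execution.
per_well_units = list(group_components_by_well(components).values())
```

Whatever the metadata ends up looking like, it would need to let the server build either kind of unit from the same component list.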
Given that there are a few things to consider more deeply here, let's collect them here and come up with the requirements for the metadata, not start a refactor of it right now :)
Redundant given https://github.com/fractal-analytics-platform/fractal-server/issues/802 and discussions on more flexible component handling in https://github.com/fractal-analytics-platform/fractal-server/issues/792
I think we should tackle another round of refactoring the metadata handling & how things are passed between tasks. Specifically: do tasks always need to write metadata, and can tasks start without metadata? I think we already have a mix anyway: some things are saved to metadata, others are read from the .zattrs files when needed.
Relevant issues: It would be useful if parallel tasks didn't have to write metadata (we're not using it anyway), see: https://github.com/fractal-analytics-platform/fractal-server/issues/474#issuecomment-1506621022 It is also an open question how best to start from an existing OME-Zarr file: https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/351
Also, an important principle: Metadata should be something we can get again from the OME-Zarr file: https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/212
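As a rough illustration of that principle, the component list could be rebuilt from the OME-Zarr file alone. This is a sketch under assumptions: it expects a Zarr v2 HCS plate on local disk whose `.zattrs` files follow the OME-NGFF `plate.wells[].path` and `well.images[].path` layout, and the function name `list_image_components` is made up for this example.

```python
import json
from pathlib import Path


def list_image_components(plate_path):
    """Rebuild the image component list from the plate and well .zattrs files."""
    plate_path = Path(plate_path)
    plate_attrs = json.loads((plate_path / ".zattrs").read_text())
    components = []
    for well in plate_attrs["plate"]["wells"]:
        # Each well group has its own .zattrs listing the images it contains.
        well_attrs = json.loads((plate_path / well["path"] / ".zattrs").read_text())
        for image in well_attrs["well"]["images"]:
            components.append(f"{plate_path.name}/{well['path']}/{image['path']}")
    return components
```

If something like this works for everything a task needs, a workflow could start from an existing OME-Zarr file without any metadata being passed in.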
Let's differentiate between input to plate-level & parallel tasks:
To me this looks like: plate-level tasks should produce some metadata (potentially minimal, e.g. just the components), which is then used in all downstream tasks. But plate-level tasks don't need to take metadata as an input. In that setting, I'm not sure metadata is something that's passed from task to task; it is rather something the first task creates and downstream tasks use. Parallel tasks shouldn't write any new components (right?)
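A minimal sketch of that contract, just to have something to point at. All names here (`plate_level_task`, `parallel_task`, `run_workflow`, the `component_list` key) are hypothetical and not the current fractal API; the read-only wrapper is one possible way to enforce that parallel tasks don't write metadata.

```python
from types import MappingProxyType


def plate_level_task(zarr_name, wells):
    """Hypothetical plate-level task: takes no metadata, returns minimal metadata."""
    # `wells` maps a well path to its number of images, e.g. {"B/03": 2}.
    return {
        "component_list": [
            f"{zarr_name}/{well}/{i}"
            for well, n_images in wells.items()
            for i in range(n_images)
        ]
    }


def parallel_task(component, metadata):
    """Hypothetical parallel task: may read metadata, returns no metadata update."""
    assert component in metadata["component_list"]
    return f"processed {component}"


def run_workflow(zarr_name, wells):
    # The first task creates the metadata; downstream tasks get a read-only view.
    metadata = MappingProxyType(plate_level_task(zarr_name, wells))
    return [parallel_task(c, metadata) for c in metadata["component_list"]]
```

For example, `run_workflow("plate.zarr", {"B/03": 2})` runs the parallel task once per image in well B/03, and any attempt to assign to `metadata` inside a parallel task raises a `TypeError`.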
Open questions:
Let's keep in mind that we may soon move the MIP projections into the same Zarr file (see https://github.com/ome/ngff/issues/187), which could mean having different components. That could make the logic above a bit more complex: what if we no longer need the plate-level copy-ome-zarr, because each projection can just be run within the image and produces a new projection? I think we could have the projections as subgroups within the existing components, but we should think through how this interacts with metadata handling and with loading specific data.