fractal-analytics-platform / tinkering-flexibility-and-task-api

Tinkering about image subsets/collections and tasks API

Task API #1

Closed tcompa closed 6 months ago

tcompa commented 7 months ago

Based on our discussion today with @jluethi, here are some open questions about the task API, starting from this kind of preliminary example:

from typing import Any, Optional


def task_function_new(
    *,
    zarr_path: str,
    metadata: dict[str, Any],
    output_path: Optional[str] = None,
    T_index: Optional[int] = None,
    C_index: Optional[int] = None,
    Z_index: Optional[int] = None,
):
    # Placeholder body: only report the arguments the task was called with.
    print(
        "Now running task with:\n"
        f"  {zarr_path=}\n"
        f"  {output_path=}\n"
        f"  {T_index=}\n"
        f"  {C_index=}\n"
        f"  {Z_index=}\n"
    )
  1. What does a well-level task look like? -> maybe just get a list of images, and proceed from there
  2. How do we write a task that acts on multiple zarrs (e.g. registering cycle N against cycle 0)? -> this could be a custom init task, which takes the list of all valid images (e.g. illumination-corrected) and then creates a new parallelization list pairing 0-1, 0-2, 0-3, ...
  3. How do we handle output_zarr_path?
    • Can we write the MIP of a 3D zarr into a different zarr? How?
    • Can we set overwrite_input=False in illumination correction? How?
  4. How do we write a 3D->2D->3D workflow?
  5. Should we address channels by name or index?
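
For question 2, the custom init task could be sketched roughly as below. This is only an illustration under stated assumptions: the function name `init_registration_pairs`, the `ref_component` key, and the convention that the first image in the list is the reference cycle are all hypothetical and not part of any existing API.

```python
from typing import Any


def init_registration_pairs(image_list: list[str]) -> list[dict[str, Any]]:
    """Pair every non-reference image with the reference image.

    Assumption: the first entry of image_list is the reference (cycle 0).
    """
    if len(image_list) < 2:
        # Nothing to register against.
        return []
    reference = image_list[0]
    return [
        dict(component=image, ref_component=reference)
        for image in image_list[1:]
    ]


# Hypothetical paths: three acquisition cycles of the same well.
pairs = init_registration_pairs(
    ["plate.zarr/A/01/0", "plate.zarr/A/01/1", "plate.zarr/A/01/2"]
)
```

Each entry of `pairs` would then be handed to one parallel run of the registration task, giving the 0-1, 0-2 pairing described above.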
tcompa commented 6 months ago

Three default parallelization modes:

  1. Non-parallel
  2. Fully parallel: runs one task per image (in the filtered list)
  3. Combined component: provides the full filtered list as a single argument to the single task (typical use case: for "init" tasks, like copy-ome-zarr)
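
A minimal sketch of how these three modes might translate into task calls, assuming each call is described by a dict of keyword arguments. The function name `plan_calls`, the mode strings, and the `component`/`components` keys are placeholders chosen for illustration, not an agreed interface.

```python
from typing import Any


def plan_calls(mode: str, filtered_images: list[str]) -> list[dict[str, Any]]:
    """Return one kwargs dict per task call for the given mode."""
    if mode == "non_parallel":
        # A single call with no image-specific argument.
        return [{}]
    if mode == "parallel":
        # One call per image in the filtered list.
        return [dict(component=image) for image in filtered_images]
    if mode == "combined":
        # A single call that receives the whole filtered list.
        return [dict(components=list(filtered_images))]
    raise ValueError(f"Unknown parallelization mode: {mode!r}")


images = ["plate.zarr/A/01/0", "plate.zarr/A/02/0"]
parallel_calls = plan_calls("parallel", images)
combined_calls = plan_calls("combined", images)
```

With two images, the fully parallel mode yields two calls, while the non-parallel and combined modes each yield exactly one.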

Non-default use cases need custom init tasks:

  1. Find the filtered list, and split it into N subgroups (use case: register all images against a reference one, as part of a calculate-registration-init-task)
  2. (maybe same as 4) Parallelize over TCZ/ROIs, after reading the actual zarr (use case: this goes through a custom init task)
tcompa commented 6 months ago
# CURRENT
[
    "plate.zarr/A/01/0",
    "plate.zarr/A/02/0",
]

# CURRENT, DIFFERENT
[
    dict(component="plate.zarr/A/01/0"),
    dict(component="plate.zarr/A/02/0"),
]

# FUTURE
[
    dict(component="plate.zarr/A/01/0", T_slice=1),
    dict(component="plate.zarr/A/02/0", T_slice=1),
    dict(component="plate.zarr/A/01/0", T_slice=2),
    dict(component="plate.zarr/A/02/0", T_slice=2),
]
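
The FUTURE list above could be generated by a custom init task along these lines. This is a sketch only: the function name `build_parallelization_list` is hypothetical, and it assumes the T slices are already known rather than read from the zarr metadata.

```python
from itertools import product
from typing import Any


def build_parallelization_list(
    components: list[str],
    t_slices: list[int],
) -> list[dict[str, Any]]:
    """Create one entry per (T slice, component) combination.

    T slices vary slowest, so all components for T_slice=1 come first.
    """
    return [
        dict(component=component, T_slice=t_slice)
        for t_slice, component in product(t_slices, components)
    ]


parallelization_list = build_parallelization_list(
    ["plate.zarr/A/01/0", "plate.zarr/A/02/0"],
    [1, 2],
)
```

Keeping each entry a plain dict means further keys (for instance a Z or C selection) could be added later without changing how the list is consumed.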