fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License
12 stars 5 forks source link

`load_NgffImageMeta` should have a more descriptive error when an image doesn't exist #621

Closed jluethi closed 7 months ago

jluethi commented 7 months ago

Currently, when a user tries to load an image that isn't actually available, the error can be quite confusing.

For example, someone tries to load an intensity image or a label image using our load_NgffImageMetafunction:

ngff_image_meta = load_NgffImageMeta(zarrurl)
ngff_label_image_meta = load_NgffImageMeta(zarrurl_label)

Then they get an error like this:

pydantic/decorator.py:40: in pydantic.decorator.validate_arguments.validate.wrapper_function
    ???
pydantic/decorator.py:134: in pydantic.decorator.ValidatedFunction.call
    ???
pydantic/decorator.py:206: in pydantic.decorator.ValidatedFunction.execute
    ???
src/scmultiplex/fractal/scmultiplex_feature_measurements.py:141: in scmultiplex_feature_measurements
    ngff_label_image_meta = load_NgffImageMeta(f"{zarrurl}/labels/{label_image}")
/Users/joel/mambaforge/envs/scmultiplex-fractal/lib/python3.9/site-packages/fractal_tasks_core/lib_ngff.py:391: in load_NgffImageMeta
    zarr_group = zarr.open_group(zarr_path, mode="r")
    if mode in ["r", "r+"]:
        if not contains_group(store, path=path):
            if contains_array(store, path=path):
                raise ContainsArrayError(path)
          raise GroupNotFoundError(path)

E zarr.errors.GroupNotFoundError: group not found at path ''

/Users/joel/mambaforge/envs/scmultiplex-fractal/lib/python3.9/site-packages/zarr/hierarchy.py:1532: GroupNotFoundError

I've had a few support request where people got that error and couldn't make sense of it. We should define a specific error for when a specified channel can't be found (if we don't already have this) and then raise this error with the actual path that was provided (currently those error messages always say zarr.errors.GroupNotFoundError: group not found at path '', which is not helpful).

In the cases above, it would be either zarrurl_label or zarrurl and it would be good to see this in the error message (with their actual content, e.g. a path to a zarr_file)

tcompa commented 7 months ago

then raise this error with the actual path that was provided

Agreed, that message is not very clear. As of https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/622, the message would look like

$ python -c 'from fractal_tasks_core.lib_ngff import load_NgffImageMeta; load_NgffImageMeta("/tmp/something.zarr/missing/path")'
2023-12-05 08:34:37,494; ERROR; 
Original error: group not found at path ''.
Zarr group not found at /tmp/something.zarr/missing/path.
Traceback (most recent call last):
  File "/home/tommaso/Fractal/fractal-tasks-core/fractal_tasks_core/lib_ngff.py", line 403, in load_NgffImageMeta
    zarr_group = zarr.open_group(zarr_path, mode="r")
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-tasks-core-UoMDyr20-py3.10/lib/python3.10/site-packages/zarr/hierarchy.py", line 1532, in open_group
    raise GroupNotFoundError(path)
zarr.errors.GroupNotFoundError: group not found at path ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/tommaso/Fractal/fractal-tasks-core/fractal_tasks_core/lib_ngff.py", line 410, in load_NgffImageMeta
    raise ZarrGroupNotFoundError(error_msg)
fractal_tasks_core.lib_ngff.ZarrGroupNotFoundError: 
Original error: group not found at path ''.
Zarr group not found at /tmp/something.zarr/missing/path.

With the last line including the path of the missing group. Is this improvement enough?

It's true that the traceback looks complex, but this is because we have to handle a zarr-python error and re-raise our own error. In order to fully avoid the first zarr-python error, we would need to first check if the zarr group exists, but there is not API for this (that is, I am not aware of anything like zarr.group_exists("/tmp/something.zarr")) and I'd rather not rely on checking the existence of files or folders.


We should define a specific error for when a specified channel can't be found

I would rather stay close to the original low-level error, which has to do with a missing Zarr group and not to anything related to OME-Zarr. Is there any specific reason why we need to mention channels here?

jluethi commented 7 months ago

With the last line including the path of the missing group. Is this improvement enough?

Given how much these errors are still exposed to our users, I'd want to at least reword this a bit. e.g.: "Could not load the requested image, because no Zarr image was found at /tmp/something.zarr/missing/path"

If we can't easily catch that error before, then so be it. I agree with not doing it based on file existence.


The very separate discussion to this is if we could highlight the last line of the error messages more in Fractal web then to have users focus on the message that should state what actually went wrong and how sufficient just seeing that last line would be.