Closed tcompa closed 2 months ago
Just for my understanding: zarr_url is still the full path to the zarr file. But it should always contain the zarr_dir base?
e.g. a valid zarr_url is:
/path/to/zarr_dir/myzarr.zarr/B/03/0
in the zarr_dir:
/path/to/zarr_dir/
Or does this mean something different?
Big picture, let's not put this constraint too broadly. I see it as useful, but there may be a future (especially with OME-Zarr collections) where some zarr files (e.g. labels) live in different folders than the other zarr files.
> Just for my understanding: zarr_url is still the full path to the zarr file. But it should always contain the zarr_dir base?
> e.g. a valid zarr_url is:
> /path/to/zarr_dir/myzarr.zarr/B/03/0
> in the zarr_dir:
> /path/to/zarr_dir/
Yes, this is correct.
More details:
- [...] zarr_dir (absolute-path) input argument, to determine where new images would be created.
- [...] zarr_dir argument would not fit very well with portability, and it would also introduce a small overwrite risk (if I export a workflow and then re-import it, the zarr_dir value is the same as in the original workflow).
- [...] zarr_dir to the dataset attributes DatasetV2.zarr_dir.
- [...] zarr_dir (e.g. the init tasks of converters and the init task of MIP). Right now, we are passing this argument only to non-parallel tasks, while parallel tasks do not get it. If we want to be even more granular, we would need to add a task attribute such as task.requires_zarr_dir = True/False.
- [...] zarr_url(s) (which must belong to zarr_dir), then it can still load/write data from any arbitrary path.

> Big picture, let's not put this constraint too broadly. I see it as useful, but there may be a future (especially with OME-Zarr collections) where some zarr files (e.g. labels) live in different folders than the other zarr files.
I'd say that the scope threshold here is set as in point 7 above - does it sound right?
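For reference, the more granular option mentioned in the details above (a per-task flag such as task.requires_zarr_dir) could be sketched roughly as follows. This is only an illustration: the Task class, the build_task_kwargs helper, and the zarr_urls keyword are hypothetical names made up for the example, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Task:
    """Hypothetical task description; not the real task model."""
    name: str
    requires_zarr_dir: bool = False


def build_task_kwargs(
    task: Task, zarr_urls: list[str], dataset_zarr_dir: str
) -> dict[str, Any]:
    """Assemble task arguments, adding zarr_dir only for tasks that ask for it."""
    kwargs: dict[str, Any] = {"zarr_urls": list(zarr_urls)}
    if task.requires_zarr_dir:
        # The value comes from the dataset, so it is not stored in the workflow.
        kwargs["zarr_dir"] = dataset_zarr_dir
    return kwargs


converter_init = Task(name="converter_init", requires_zarr_dir=True)
measurement = Task(name="measurement")

urls = ["/path/to/zarr_dir/myzarr.zarr/B/03/0"]
print(build_task_kwargs(converter_init, urls, "/path/to/zarr_dir/"))
print(build_task_kwargs(measurement, urls, "/path/to/zarr_dir/"))
```

With a flag like this, the decision would no longer depend on whether a task is parallel or non-parallel, only on what the task declares.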
> I'd say that the scope threshold here is set as in point 7 above - does it sound right?
Sounds good. Yes, let's go with it this way then. That scope is narrow enough that we can change it again if in 6-12 months, collections with more complex behaviors come up :)
My main point here:
For what we're currently planning, zarr_dir will always be the base of the zarr_url. And it makes sense to only show the second part in some interfaces (e.g. in the image list): myzarr.zarr/B/03/0, given that we have a common base_dir.
But we shouldn't have the assumption that entries of the image list share a zarr_dir too deep in our architecture, because that could change with future OME-Zarr versions.
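To make the relationship concrete, here is a minimal sketch of how an interface could derive the short display path from zarr_url and zarr_dir, while rejecting URLs that lie outside zarr_dir. The function name relative_zarr_path is an assumption for illustration only; the real validation may look different.

```python
from pathlib import PurePosixPath


def relative_zarr_path(zarr_url: str, zarr_dir: str) -> str:
    """Return zarr_url relative to zarr_dir, or raise if it lies outside."""
    url = PurePosixPath(zarr_url)
    base = PurePosixPath(zarr_dir)  # a trailing slash is normalized away
    try:
        # The comparison is per path component, so a sibling folder such as
        # /path/to/zarr_dir_other is not treated as part of /path/to/zarr_dir.
        return str(url.relative_to(base))
    except ValueError as err:
        raise ValueError(
            f"{zarr_url!r} is not inside zarr_dir {zarr_dir!r}"
        ) from err


print(relative_zarr_path("/path/to/zarr_dir/myzarr.zarr/B/03/0", "/path/to/zarr_dir/"))
# prints: myzarr.zarr/B/03/0
```

Keeping this logic in one small helper, rather than assuming a shared zarr_dir throughout the code, would make it easier to relax the constraint later. Note that this sketch does not resolve ".." segments or symlinks, so it is a display/consistency check rather than a security check.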
- [...] .append to the image list
- [...] path/zarr_url attribute
- [...] path/zarr_url attribute
- [...] zarr_dir