Open will-moore opened 1 year ago
I have reverted the removal of .pattern from image names above.
Just looking at how to handle various image names... I previously thought that names containing whitespace were causing errors, but it seems this is not always the case, since this works OK...
$ omero zarr export Image:5025553 --name_by name
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Exporting to Tonsil 3.ome.zarr (0.4)
...
But exporting an Image with a more complex name, e.g. JL_120731_S6A [Well A-1; Field #1] (https://idr.openmicroscopy.org/webclient/?show=image-1229801), fails with an exception that can be reproduced as follows:
from zarr.storage import FSStore
from zarr.hierarchy import open_group
open_group(FSStore("JL_120731_S6A [Well A-1; Field #1]", mode="w"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/hierarchy.py", line 1465, in open_group
return Group(store, read_only=read_only, cache_attrs=cache_attrs,
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/hierarchy.py", line 164, in __init__
meta_bytes = store[mkey]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/storage.py", line 1393, in __getitem__
return self.map[key]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/mapping.py", line 143, in __getitem__
result = self.fs.cat(k)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 826, in cat
paths = self.expand_path(path, recursive=recursive)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 1005, in expand_path
out = self.expand_path([path], recursive, maxdepth)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 1011, in expand_path
bit = set(self.glob(p))
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/implementations/local.py", line 70, in glob
return super().glob(path, **kwargs)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 591, in glob
pattern = re.compile(pattern.replace("=PLACEHOLDER=", ".*"))
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/re.py", line 252, in compile
return _compile(pattern, flags)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/re.py", line 304, in _compile
p = sre_compile.compile(pattern, flags)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/sre_compile.py", line 788, in compile
p = sre_parse.parse(p, flags)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/sre_parse.py", line 955, in parse
p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/sre_parse.py", line 444, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/sre_parse.py", line 599, in _parse
raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range A-1 at position 58
I don't know a good way to make all names "safe" from these types of errors.
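The final `re.error` can be reproduced with the standard library alone. This is a minimal sketch: it assumes, as the traceback suggests, that the name reaches `re.compile` with its brackets intact. The string fsspec actually builds is longer (the error is reported at position 58), and `compiles_as_regex` is a hypothetical helper used only for illustration.

```python
import re


def compiles_as_regex(text):
    """Return True if `text` is accepted by re.compile, False otherwise."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False


# "[Well A-1; Field #1]" is parsed as a character class, and "A-1" as a
# range whose start ("A", code point 65) sorts after its end ("1", 49).
print(compiles_as_regex("JL_120731_S6A [Well A-1; Field #1]"))  # False

# Without the brackets there is no character class, so "A-1" is literal.
print(compiles_as_regex("JL_120731_S6A Well A-1; Field #1"))  # True

# Whitespace alone is harmless, which matches the "Tonsil 3" export above.
print(compiles_as_regex("Tonsil 3"))  # True
```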
Another example, which fails with a different error, has the Image name plate1_1_013 [Well 1, Field 1 (Spot 1)]:
$ omero zarr export Image:179693 --name_by name
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Exporting to plate1_1_013 [Well 1, Field 1 (Spot 1)].ome.zarr (0.4)
Traceback (most recent call last):
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/mapping.py", line 143, in __getitem__
result = self.fs.cat(k)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 826, in cat
paths = self.expand_path(path, recursive=recursive)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 1005, in expand_path
out = self.expand_path([path], recursive, maxdepth)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/spec.py", line 1031, in expand_path
raise FileNotFoundError(path)
FileNotFoundError: ['/Users/wmoore/Desktop/ZARR/data/TEMO/plate1_1_013 [Well 1, Field 1 (Spot 1)].ome.zarr/.zgroup']
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/storage.py", line 1393, in __getitem__
return self.map[key]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/mapping.py", line 147, in __getitem__
raise KeyError(key)
KeyError: '.zgroup'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/hierarchy.py", line 164, in __init__
meta_bytes = store[mkey]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/storage.py", line 1395, in __getitem__
raise KeyError(key) from e
KeyError: '.zgroup'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/bin/omero", line 8, in <module>
sys.exit(main())
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/omero/main.py", line 125, in main
rv = omero.cli.argv()
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/omero/cli.py", line 1784, in argv
cli.invoke(args[1:])
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/omero/cli.py", line 1222, in invoke
stop = self.onecmd(line, previous_args)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/omero/cli.py", line 1299, in onecmd
self.execute(line, previous_args)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/omero/cli.py", line 1381, in execute
args.func(args)
File "/Users/wmoore/Desktop/ZARR/omero-cli-zarr/src/omero_zarr/cli.py", line 125, in _wrapper
return func(self, *args, **kwargs)
File "/Users/wmoore/Desktop/ZARR/omero-cli-zarr/src/omero_zarr/cli.py", line 342, in export
image_to_zarr(image, args)
File "/Users/wmoore/Desktop/ZARR/omero-cli-zarr/src/omero_zarr/raw_pixels.py", line 56, in image_to_zarr
root = open_group(store)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/hierarchy.py", line 1465, in open_group
return Group(store, read_only=read_only, cache_attrs=cache_attrs,
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/hierarchy.py", line 167, in __init__
raise GroupNotFoundError(path)
zarr.errors.GroupNotFoundError: group not found at path ''
Same error with:
$ omero zarr export Image:3414011 --name_by name
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Exporting to 10percent-Wt1-GFP-spheroid-MV.czi [0].ome.zarr (0.4)
...
zarr.errors.GroupNotFoundError: group not found at path ''
and
$ omero zarr export Image:9022301 --name_by name
Exporting to subpool-1_run-1_EXP-19-BQ3550 [Pos101].ome.zarr (0.4)
...
zarr.errors.GroupNotFoundError: group not found at path ''
But removing the [ character from this name fixed the error, so it looks like it's being recognised as a regex, which is causing the errors above, particularly the re.error: bad character range A-1 at position 58 ?!
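One possible reason the other names fail with GroupNotFoundError rather than re.error: their brackets form a valid character class, so the pattern compiles but no longer matches the literal path. The sketch below uses the standard library's `fnmatch` as a stand-in for glob-style matching. fsspec has its own glob translation, so this illustrates the mechanism and is not a trace of what fsspec does.

```python
from fnmatch import fnmatchcase

name = "plate1_1_013 [Well 1, Field 1 (Spot 1)]"

# Used as a pattern, "[...]" matches exactly ONE character from the set,
# so the name does not match itself and a lookup would find nothing.
print(fnmatchcase(name, name))  # False

# A name without glob metacharacters matches itself as expected.
plain = "Tonsil 3"
print(fnmatchcase(plain, plain))  # True
```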
Other examples that work OK:
$ omero zarr export Image:1884807 --name_by name
Exporting to Centrin_PCNT_Cep215_20110506_Fri-1545_0_SIR_PRJ.dv.ome.zarr (0.4)
...
$ omero zarr export Image:4995043 --name_by name
Exporting to ExperimentB_No05_DMSO_11_10min__010.czi.ome.zarr (0.4)
...
So it looks like all names are OK except for those with [ and ] in them.
Not sure how to avoid those being recognised as broken regex without actually changing the name we want to write?
Replacing [] with () in names now.
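The replacement can be sketched as follows. `build_zarr_name` and its parameters are hypothetical and only illustrate the idea; the actual code in the branch may differ.

```python
def build_zarr_name(obj_id, obj_name, name_by="id"):
    """Hypothetical helper: choose the directory name for an export.

    name_by="id" uses the object ID; name_by="name" uses the object
    name with square brackets swapped for parentheses.
    """
    if name_by == "name":
        base = obj_name.replace("[", "(").replace("]", ")")
    else:
        base = str(obj_id)
    return f"{base}.ome.zarr"


print(build_zarr_name(1229801, "JL_120731_S6A [Well A-1; Field #1]", "name"))
# JL_120731_S6A (Well A-1; Field #1).ome.zarr
print(build_zarr_name(1229801, "JL_120731_S6A [Well A-1; Field #1]"))
# 1229801.ome.zarr
```

Swapping the brackets keeps the name readable while removing the characters that trigger the failures above.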
👍 Looks good to me. I've used the build from this branch a few times already for the NGFF conversion/export work.
@joshmoore I reduced duplication by creating def get_zarr_name(obj, args)
I also noticed that we need the name for polygon/masks export, but supporting the --name_by name argument there could be quite a bit of work, so probably not worth it until we know it's needed. I fixed the .zarr -> .ome.zarr name at least and updated the README.
Anything else needed here?
Nothing outstanding from my side.
This PR adds an optional name_by argument with options of id (default behaviour) and name. It is needed for batch exporting many Images or Plates where we want the exported OME-Zarr image to have a useful name.
This PR has been used for all the omero-cli-zarr exports for ongoing IDR NGFF upgrade work:
When exporting from OMERO, we now adopt the naming convention of ID.ome.zarr or PlateName.ome.zarr instead of the previous ID.zarr.
If names contain square brackets [ ] then this can break writing to zarr (see the errors reported in the comments), so these are replaced by ( ).
To test: