Open yarikoptic opened 1 year ago
@yarikoptic
- What makes you say that ome-zarr-py PR was merged? It was clearly closed without being merged.
heh, not sure why I read "Closed" as "Merged" ;) I left a question on that PR asking what the plan is there in terms of validation.
- It appears that the given validation method loads all(?) of the Zarr data into memory, which will be a problem for arbitrarily large Zarrs.
that would really make it unlikely to be usable by default... where do they do it?
From glancing over https://github.com/ome/ome-zarr-py/pull/142/files#diff-b50d9715cc6e4017cfc055fd0ed73ecb5d9158e17f4d58ca5b3ba08b89c46657R206 I thought it would just validate the structure/metadata against some jsonschema.
@yarikoptic ome_zarr.utils.validate() calls visit(), which iterates over the return values of Reader.__call__(), which either descends through the node (I haven't yet found what's populating the "descend" structures) or (line 698) calls ZarrLocation.load(), which calls out to a third-party library that I haven't looked at yet, but the name sure sounds like it's loading data.
I followed https://github.com/ome/ome-zarr-py/pull/142#issuecomment-1517024760 and ran
check-jsonschema --schemafile /home/dandi/proj/ngff/0.4/schemas/image.schema <(curl --silent "$url")
and got the following list of failures: http://www.oneukrainian.com/tmp/dandizarrs-jsonschema-checks.out. The majority of zarrs have
Schema validation errors were encountered.
/dev/fd/63::$.omero.channels[0].window: 'start' is a required property
/dev/fd/63::$.omero.channels[0].window: 'end' is a required property
In fact, only 137 zarrs pass validation and over 4k do not.
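For anyone who wants to triage these without pulling in the full schema, here is a minimal sketch that looks only for the two errors reported above, by parsing a .zattrs document and checking each omero.channels[i].window for 'start' and 'end'. It is not a substitute for check-jsonschema. The function name missing_window_keys and the sample document are made up for illustration, and skipping channels that have no window object at all is my assumption rather than something taken from the schema.

```python
import json


def missing_window_keys(zattrs_text):
    """Return (channel_index, key) pairs for absent 'start'/'end' window keys.

    Hypothetical helper: it covers only the two errors seen in this thread,
    not the whole NGFF image schema.
    """
    attrs = json.loads(zattrs_text)
    problems = []
    channels = attrs.get("omero", {}).get("channels", [])
    for index, channel in enumerate(channels):
        window = channel.get("window")
        if window is None:
            # Assumption: a channel without any window object is skipped here.
            continue
        for key in ("start", "end"):
            if key not in window:
                problems.append((index, key))
    return problems


# Made-up sample: a window that has only 'min' and 'max'.
sample = '{"omero": {"channels": [{"window": {"min": 0, "max": 65535}}]}}'
for index, key in missing_window_keys(sample):
    print(f"$.omero.channels[{index}].window: '{key}' is a required property")
```

Since this reads only the JSON metadata, it never touches chunk data, so the size of the Zarr does not matter.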
@slaytonmarx could you please run a similar command (check-jsonschema --schemafile https://raw.githubusercontent.com/ome/ngff/main/0.4/schemas/image.schema YOUR.zarr/.zattrs) on the zarr files you have?
I'll check tomorrow morning and let you know!
I received the same validation errors as Yarik:
smarx@leviathan:/mnt/beegfs/Lee/dandi/sub-MITU01/ses-20211001h11m49s01/micr$ check-jsonschema --schemafile https://raw.githubusercontent.com/ome/ngff/main/0.4/schemas/image.schema sub-MITU01_ses-20211001h11m49s01_sample-103_stain-LEC_run-1_chunk-10_SPIM.ome.zarr/.zattrs
Schema validation errors were encountered.
sub-MITU01_ses-20211001h11m49s01_sample-103_stain-LEC_run-1_chunk-10_SPIM.ome.zarr/.zattrs::$.omero.channels[0].window: 'start' is a required property
sub-MITU01_ses-20211001h11m49s01_sample-103_stain-LEC_run-1_chunk-10_SPIM.ome.zarr/.zattrs::$.omero.channels[0].window: 'end' is a required property
ATM I believe we are just testing whether we can open them, plus two custom checks (the group is not empty and the hierarchy is not too deep). Initial validate support, with a --strict option, was recently merged in ome-zarr-py, so we should make use of it for our OME .zarrs.
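To make the two custom checks concrete, here is a rough sketch of what "not an empty group" and "not too deep" could look like on a Zarr v2 directory store, using only the standard library. This is not the existing implementation. The function name check_tree and the MAX_DEPTH value are hypothetical (the real limit is not stated here), and "empty group" is taken to mean a directory that holds a .zgroup file but no subdirectories.

```python
import os

MAX_DEPTH = 5  # hypothetical limit; the actual value is not given in this thread


def check_tree(root, max_depth=MAX_DEPTH):
    """Return a list of problem strings for a Zarr v2 directory store.

    Only directory names and metadata file names are inspected, so no
    array data is read.
    """
    problems = []
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Assumption: a group is "empty" if it has .zgroup but no children.
        if ".zgroup" in filenames and not dirnames:
            problems.append(f"empty group: {rel}")
        if depth > max_depth:
            problems.append(f"too deep ({depth} > {max_depth}): {rel}")
            dirnames[:] = []  # stop descending below an over-deep directory
    return problems
```

Calling check_tree("YOUR.zarr") returns an empty list when neither problem is found.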