ome / ngff

Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
https://ngff.openmicroscopy.org

Support zarr v3 #249

Open d-v-b opened 3 months ago

d-v-b commented 3 months ago

removes the zarr 2 specific language from the spec, and adds a section recommending against mixing zarr 2 and zarr 3 hierarchies
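To make the "don't mix zarr 2 and zarr 3 hierarchies" recommendation concrete, here is a rough sketch of how a tool could detect a mixed hierarchy on a local filesystem. It relies only on the metadata file names each format uses (.zgroup / .zarray / .zattrs for zarr 2, zarr.json for zarr 3). The function names are made up for illustration and are not part of the spec or of any zarr library.

```python
import os

# Metadata file names that identify each Zarr format on disk.
V2_MARKERS = {".zgroup", ".zarray", ".zattrs"}
V3_MARKER = "zarr.json"


def zarr_formats_in_tree(root):
    """Return the set of Zarr format versions (2 and/or 3) found under root.

    Hypothetical helper: it only inspects file names in a local directory
    tree and does not parse or validate any metadata.
    """
    found = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        if names & V2_MARKERS:
            found.add(2)
        if V3_MARKER in names:
            found.add(3)
    return found


def is_mixed(root):
    """True if both zarr 2 and zarr 3 metadata files occur under root."""
    return zarr_formats_in_tree(root) == {2, 3}
```

A writer or validator could call is_mixed(path) and warn before adding zarr 3 nodes to an existing zarr 2 hierarchy (or the reverse).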

d-v-b commented 3 months ago

this is a lighter version of https://github.com/ome/ngff/pull/227, without any metadata changes.

will-moore commented 3 months ago

Would a group zarr.json containing multiscales then look like this?

{
  "attributes": {
    "multiscales": [
      {
        "version": "0.5",
        ...
      }
    ]
  },
  "zarr_format": 3,
  "node_type": "group"
}

d-v-b commented 3 months ago

@will-moore yes, everything that was previously in .zattrs would be stored instead under the attributes key in zarr.json. But zarr implementations should make this change invisible, since the basic model of a group has not changed.
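As a rough illustration of that mapping for groups only, the sketch below wraps the contents of a zarr 2 .zattrs file in a zarr 3 group zarr.json document. The helper name is hypothetical, the sample attributes are a stripped-down placeholder, and array metadata (which changes much more between zarr 2 and zarr 3) is deliberately not covered.

```python
import json


def v2_group_to_v3_metadata(zattrs):
    """Build the contents of a Zarr v3 group zarr.json from v2 .zattrs content.

    Hypothetical helper for illustration: user attributes that lived in
    .zattrs move unchanged under the "attributes" key.
    """
    return {
        "zarr_format": 3,
        "node_type": "group",
        "attributes": dict(zattrs),
    }


# Placeholder attributes; a real multiscales entry carries more fields.
zattrs = {"multiscales": [{"version": "0.5"}]}
print(json.dumps(v2_group_to_v3_metadata(zattrs), indent=2))
```

In practice a zarr implementation does this bookkeeping itself, so code that reads group attributes through the library should not need to care which file they came from.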

joshmoore commented 3 months ago

@d-v-b: would you be up for moving this change from latest to a new copy under 0.5-dev1?

d-v-b commented 3 months ago

@joshmoore I guess that's fine, but then don't I have to manually track changes to latest?

joshmoore commented 3 months ago

I was thinking these would become the fast moving location, e.g. from the meetings yesterday:

The problem I'm concerned about when using latest for everything is that there won't be a way to track the intermediate stages in the (more or less) ephemeral data.

d-v-b commented 3 months ago

* dev1: v2 -> v3
* dev2: RFC-2
* dev3: ro-crate (RFC-4, whatever)

In this scheme, doesn't dev2 depend on dev1, and similarly for dev3 / dev2? Which means separate folders might get clunky. Or is the idea that these are all independent changes?

joshmoore commented 3 months ago

They may depend on each other, but there may also be rollbacks in the changes we need. So it definitely might get clunky, but it will let us move quickly for a period of time.

d-v-b commented 3 months ago

@joshmoore I made the requested changes, but the diff is now unreadable; not sure if there is a way to avoid that as long as we are doing copy + paste

joshmoore commented 3 months ago

Thanks, @d-v-b.

not sure if there is a way to avoid that as long as we are doing copy + paste

It's a good question, and one I've wondered about independently of the dev challenge. The only fairly wacky idea I had was to have the version be a branch rather than directory. 🤷🏽

d-v-b commented 3 months ago

Alternatively, the repo only ever has 1 version, the latest one, and old versions are accessible via git history (and as github releases), and proposed versions are branches which get merged via PRs. It seems like the only reason to keep old versions like 0.3 around would be if we are planning on changing them, but I don't think that's the case?

d-v-b commented 3 months ago

The CF conventions use a normal GitHub release workflow, which seems a bit simpler than keeping old versions around as folders.

will-moore commented 3 months ago

I'm using this branch's schemas in the updated ome-ngff-validator at https://github.com/ome/ome-ngff-validator/pull/35 (with sample data - see link on PR). All looking good. The only trivial issue I had was that the new directories containing the schemas etc. are under 0.5-dev1, whereas the version in the schemas is 0.5-dev.
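A mismatch like this is easy to catch mechanically. The sketch below is only an illustration, not how ome-ngff-validator does it: it assumes the group metadata has already been loaded from zarr.json into a dict, and the function names are hypothetical.

```python
def multiscales_versions(group_metadata):
    """Collect the "version" value of every multiscales entry in a v3 group document."""
    attrs = group_metadata.get("attributes", {})
    return [entry.get("version") for entry in attrs.get("multiscales", [])]


def mismatched_versions(group_metadata, expected):
    """Return the multiscales versions that differ from the expected version string."""
    return [v for v in multiscales_versions(group_metadata) if v != expected]


# Example: metadata says 0.5-dev while the schema directory is named 0.5-dev1.
metadata = {
    "zarr_format": 3,
    "node_type": "group",
    "attributes": {"multiscales": [{"version": "0.5-dev"}]},
}
print(mismatched_versions(metadata, "0.5-dev1"))
```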

normanrz commented 3 months ago

I like the idea of maintaining the versions on individual branches.

I think this PR is a useful base for #242 but I don't think it is worth having on its own. If we are fine with using non-finalized versions, why not use the current (or next) iteration of RFC-2? Why introduce yet another way for supporting Zarr v3?

Because it came up in the meeting. I definitely think that this is a breaking change, because all 0.4-conforming OME-Zarr impls would suddenly no longer be conforming. While loosening restrictions in libraries or applications can be considered a non-breaking change, the same logic does not apply to file format specifications.

d-v-b commented 3 months ago

Because it came up in the meeting. I definitely think that this is a breaking change, because all 0.4-conforming OME-Zarr impls would suddenly no longer be conforming. While loosening restrictions in libraries or applications can be considered a non-breaking change, the same logic does not apply to file format specifications.

Where does the spec state what an implementation must do in order to be considered conformant? I don't think we have ever been strict about this.

joshmoore commented 3 months ago

It's definitely a breaking change, as all of the devs will be. As I mentioned by email, in my mind, what you're referring to is dev2.

imagesc-bot commented 3 months ago

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome2024-ngff-challenge/97363/10

d-v-b commented 3 months ago

@will-moore all the versions should be correct, let me know if I missed anything

will-moore commented 3 months ago

@d-v-b Great, thanks. Updated https://github.com/ome/ome-ngff-validator/pull/35 accordingly and looks good 👍

normanrz commented 3 months ago

I mentioned this PR in the RFC-2 revision as an alternative approach.