ome / ZarrReader

Other
5 stars 8 forks source link

Reorder group keys #53

Closed dgault closed 11 months ago

dgault commented 1 year ago

This is an initial attempt to fix the issues seen in https://github.com/ome/ZarrReader/issues/52 This first commit still requires some further testing but I believe it should help to solve the main issue of the reordered wells.

I would like to make some further follow ups to this initial commit to fully bring the ZarrReader metadata inline with, particularly for the flat option which needs more work to keep the series ordering consistent.

--exclude

will-moore commented 1 year ago

Fix deployed and tested - looks good! See https://github.com/IDR/idr-metadata/issues/652#issuecomment-1529610906

joshmoore commented 1 year ago

:+1: for this as improved behavior over the alphabetical sorting.

Is there still a use for following exactly the the OME-XML ordering, e.g. if present then reorderGroupKeys() uses that as the basis for the re-ordering?

dgault commented 1 year ago

Yeah, that would be the goal to eventually have it match 100%, though at that point this becomes very specifically an intermediate between bf2raw and IDR/OMERO.

will-moore commented 1 year ago

I think I'm seeing an issue with an NGFF Plate that I imported without this PR (and viewed Images + thumbnails etc), then with the updated ZarrReader I'm seeing a ResourceError when trying to view Images.

will-moore commented 1 year ago

Importing a Plate 108-24 from idr0010 with chunks into idr0125-pilot resulted in correct ordering of Wells with this PR.

will-moore commented 1 year ago

Seeing error and failing tests coinciding with this being included in merge build.

2023-05-05 01:17:07,187 DEBUG [                   loci.formats.Memoizer] (l.Server-6) Kryo Exception: Unable to find class: loci.formats.in.ZarrReader$$Lambda$69/1037463541
com.esotericsoftware.kryo.KryoException: Unable to find class: loci.formats.in.ZarrReader$$Lambda$69/1037463541
Caused by: java.lang.ClassNotFoundException: loci.formats.in.ZarrReader$$Lambda$69/1037463541

Excluding for now...

EDIT: with this excluded the build is passing, but with fix below, removing...

will-moore commented 1 year ago

There may be some issues observed on idr0125-pilot where ZarrReader was updated to a version that included this PR: It's possible that previously-imported NGFF Plates have issues when the keys are reordered after import, but it's hard to be sure since that workflow hasn't been tested specifically and lots of other moving parts. Unfortunately I can't find NGFF Plates previously imported into merge-ci server to test with the updated ZarrReader.

I'll exclude this PR temporarily, then I can import some NGFF Plates into merge-ci to check their appearance after upgrade...

and then running https://merge-ci.openmicroscopy.org/jenkins/job/Trigger/ ....

EDIT: removed exclude flag after importing https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-69301

This plate looks fine with this PR merged

joshmoore commented 1 year ago

So it sounds like the new behavior is as intended, and the only remaining issue is how to ... warn/correct/replace the existing datasets that may have been imported with the previous behavior?

will-moore commented 1 year ago

If my test above is correct, then we don't need to replace or correct previously imported Filesets.

I don't quite understand how reordering can fix previously imported Filesets (which appear incorrect because we've swapped the fileset with a pattern file plate) and not also break previously imported Filesets that haven't been swapped.

will-moore commented 1 year ago

Testing fresh NGFF Plate import with this PR fix at https://github.com/IDR/idr-metadata/issues/643#issuecomment-1543728125

will-moore commented 1 year ago

Testing Fileset swap for idr0012 with this PR looks good. Validated by checking all pixel values for the whole study.

However, I'm seeing a different re-ordering behaviour when swapping Fileset for idr0010 pattern file Plates: https://github.com/IDR/idr-metadata/issues/641#issuecomment-1554721800

joshmoore commented 1 year ago

Excluding to allow testing of the canonical path work (#57)

dgault commented 1 year ago

The issue discovered with idr0010 was due to the original reader generating series in the unusual order described in https://github.com/IDR/idr-metadata/issues/641#issuecomment-1554721800

The original reordering was relying on a preset order 0f A1, A2, A3 ... B1, B2, B3 ... This may not be the case for each original format, as such we should try and retain the same series ordering as the original OME-XML.

Latest commit modifies the reordering behaviour to maintain the original series ordering using the below logic:

will-moore commented 1 year ago

EDIT - apologies - previous message was wrong. This is looking good on idr0012 data after mkngff workflow: https://github.com/IDR/idr-metadata/issues/643#issuecomment-1697383363

will-moore commented 11 months ago

This has been excluded for a while, and it looks like the issues are "fixed" in current workflows & testing, so I think we can close this PR - thanks