ome / ZarrReader


Memo file generation - performance #63

Closed will-moore closed 3 months ago

will-moore commented 1 year ago

Following the update of several historical Plates in IDR to use NGFF (using mkngff), there is often a long delay before the Images can be viewed (e.g. in webclient). In 3 cases, memo file generation never completed (it may have run out of memory).

Plates from some other studies are not so bad. E.g. idr0010, idr0025, idr0035.

Seb (https://openmicroscopy.slack.com/archives/C0K5WAD8A/p1693406859045599): "I remembered that you don't need to fully reimport a file to measure what is happening in Bio-Formats. Using omero import -f --debug=DEBUG on the XML file directly in the managed repository or copied under /tmp, I have:

[sbesson@pilot-idr0125-omeroreadwrite ~]$ cp /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr/OME/METADATA.ome.xml /tmp/
[sbesson@pilot-idr0125-omeroreadwrite ~]$ /opt/omero/server/OMERO.server/bin/omero import -f /tmp/METADATA.ome.xml --debug=DEBUG
...
2023-08-30 13:58:32,934 882        [      main] INFO                   loci.formats.ImageReader - OMEXMLReader initializing /tmp/METADATA.ome.xml
...
2023-08-30 13:58:33,567 1515       [      main] DEBUG                     loci.formats.Memoizer - start[1693403912882] time[684] tag[loci.formats.Memoizer.setId]

vs

[sbesson@pilot-idr0125-omeroreadwrite ~]$ /opt/omero/server/OMERO.server/bin/omero import -f /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr/OME/METADATA.ome.xml --debug=DEBUG | grep setId
...
2023-08-30 13:58:56,385 1153       [      main] INFO                   loci.formats.ImageReader - ZarrReader initializing /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr/OME/METADATA.ome.xml
...
2023-08-30 14:10:00,846 665614     [      main] DEBUG                     loci.formats.Memoizer - start[1693403936359] time[664486] tag[loci.formats.Memoizer.setId]

"sounds like a critical performance issue which needs to be addressed before starting a large memo file regeneration of all these upgrade filesets"
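
To put a number on the gap between the two runs above, the `time[...]` field of the `loci.formats.Memoizer.setId` lines can be compared directly. The following is only a rough sketch: the regular expression and the helper name `set_id_millis` are illustrative assumptions, not part of OMERO or Bio-Formats, and the field is taken to be elapsed milliseconds because it matches the millisecond counter in the second log column.

```python
import re

# Matches the timing suffix of the Memoizer debug lines quoted above:
#   start[<epoch ms>] time[<elapsed ms>] tag[<name>]
# The pattern is an assumption based on those two lines only.
TIMING = re.compile(r"start\[(\d+)\] time\[(\d+)\] tag\[([\w.]+)\]")


def set_id_millis(log_line):
    """Return the elapsed time of a Memoizer.setId line, or None if absent.

    Hypothetical helper for illustration.
    """
    match = TIMING.search(log_line)
    if match is None or match.group(3) != "loci.formats.Memoizer.setId":
        return None
    return int(match.group(2))


# Log lines copied from the two imports above.
copied_xml = (
    "2023-08-30 13:58:33,567 1515       [      main] DEBUG"
    "                     loci.formats.Memoizer - "
    "start[1693403912882] time[684] tag[loci.formats.Memoizer.setId]"
)
in_place_zarr = (
    "2023-08-30 14:10:00,846 665614     [      main] DEBUG"
    "                     loci.formats.Memoizer - "
    "start[1693403936359] time[664486] tag[loci.formats.Memoizer.setId]"
)

fast = set_id_millis(copied_xml)     # /tmp copy, picked up by OMEXMLReader
slow = set_id_millis(in_place_zarr)  # in-place file, picked up by ZarrReader

print(f"OMEXMLReader setId: {fast} ms")
print(f"ZarrReader setId:   {slow / 1000:.1f} s ({slow / 60000:.1f} min)")
print(f"Slowdown:           ~{slow / fast:.0f}x")
```

On these two lines, that is roughly 0.7 seconds for the copied XML against about 11 minutes for the same file read in place inside the `.ome.zarr` directory.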

joshmoore commented 1 year ago

Not sure if this got communicated elsewhere, @dgault, but this is currently the primary blocker for the IDR/NGFF migration.