NGFF Plate import - slow and OOM due to many files

will-moore commented 2 years ago

Tried downloading from https://cellpainting-gallery.s3.amazonaws.com/cpg0004-lincs/broad/images/2016_04_01_a549_48hr_batch1/images_zarr/SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr/ and importing into idr0125-pilot server:

cd /nfs/bioimage/drop/idr0125-way-cellpainting/s3-20220725/
omero import --transfer=ln_s --skip=all --depth=100 SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr --file /tmp/idr0125.log  --errs /tmp/idr0125.err

Summary of import logs:

2022-08-25 14:52:27 - started
2022-08-25 16:07:24 - 166675 file(s) parsed (1 hour 15 mins)
2022-08-25 16:07:26 - FILESET_UPLOAD_PREPARATION (1 hour 15 mins)
2022-08-31 09:10:55 - FILE_UPLOAD_COMPLETE (5 days 18 hours)
2022-09-01 13:47:03 - FILESET_UPLOAD_END (~7 days)
2022-09-01 13:47:13 - Error on import java.lang.OutOfMemoryError: Java heap space (~7 days)

Start...

2022-08-25 14:52:27,072 365        [      main] INFO          ome.formats.importer.ImportConfig - OMERO.blitz Version: 5.5.12
2022-08-25 14:52:27,094 387        [      main] INFO          ome.formats.importer.ImportConfig - Bioformats version: 6.10.0 revision: f8b46c2458c43cffdf5bc67cc4bf9dfc6e93167b date: 31 May 2022
2022-08-25 14:52:27,155 448        [      main] INFO   formats.importer.cli.CommandLineImporter - Setting checksum algorithm to File-Size-64
2022-08-25 14:52:27,157 450        [      main] INFO   formats.importer.cli.CommandLineImporter - Skipping thumbnails creation
2022-08-25 14:52:27,157 450        [      main] INFO   formats.importer.cli.CommandLineImporter - Skipping minimum/maximum computation
2022-08-25 14:52:27,157 450        [      main] INFO   formats.importer.cli.CommandLineImporter - Disabling upgrade check
2022-08-25 14:52:27,157 450        [      main] INFO   formats.importer.cli.CommandLineImporter - Setting transfer to ln_s
2022-08-25 14:52:27,160 453        [      main] INFO   formats.importer.cli.CommandLineImporter - Log levels -- Bio-Formats: ERROR OMERO.importer: INFO
2022-08-25 14:52:27,593 886        [      main] INFO      ome.formats.importer.ImportCandidates - Depth: 100 Metadata Level: MINIMUM
2022-08-25 16:07:24,886 4498179    [      main] INFO      ome.formats.importer.ImportCandidates - 166675 file(s) parsed into 1 group(s) with 1 call(s) to setId in 1520633ms. (4497291ms total) [0 unknowns]
2022-08-25 16:07:24,963 4498256    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Attempting initial SSL connection to localhost:4064
2022-08-25 16:07:25,483 4498776    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Insecure connection requested, falling back
2022-08-25 16:07:26,038 4499331    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Pinging session every 300s.
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Server: 5.6.5
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Client: 5.5.12
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - Java Version: 11.0.13
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Name: Linux
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Arch: amd64
2022-08-25 16:07:26,049 4499342    [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Version: 3.10.0-1160.45.1.el7.x86_64
2022-08-25 16:07:26,195 4499488    [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_PREPARATION

Seb: "Looking at the import logs, the issue is that the time to save OriginalFile dramatically increases from a few milliseconds at the beginning of the import"

$ head -n 30 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2022-08/25/16-07-29.318.log 
2022-08-27 16:00:43,728 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661616043680] time[48] tag[omero.import.process.init]
2022-08-27 16:00:46,434 INFO  [         ome.security.basic.EventHandler] (l.Server-3)  Auth:    user=2,group=3,event=306438177(User),sess=0cb0a418-f37f-451d-8897-790578dce2cb
2022-08-27 16:00:46,462 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.jobs.UploadJob,21503315
2022-08-27 16:00:46,464 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.jobs.JobOriginalFileLink,5303765
2022-08-27 16:00:46,721 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.jobs.MetadataImportJob,21503316
2022-08-27 16:00:46,781 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.jobs.PixelDataJob,21503317
2022-08-27 16:00:46,793 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.jobs.IndexingJob,21503318
2022-08-27 16:00:46,910 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661616046909] time[0] tag[omero.repo.internal_register.find_repo_files]
2022-08-27 16:00:46,922 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,41995189
2022-08-27 16:00:46,922 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,41995190
2022-08-27 16:00:46,923 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,41995191

to ~5s / file currently

$ tail -n 20 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2022-08/25/16-07-29.318.log 
2022-08-30 08:10:54,960 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159408
2022-08-30 08:10:54,960 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159409
2022-08-30 08:10:54,961 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159410
2022-08-30 08:10:55,610 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847049992] time[5618] tag[omero.repo.create_original_file.save]
2022-08-30 08:10:55,610 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847055610] time[0] tag[omero.repo.create_original_file.internal_mkdir]
2022-08-30 08:10:55,610 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847055610] time[0] tag[omero.repo.internal_register.load]
2022-08-30 08:10:55,611 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847055610] time[0] tag[omero.repo.internal_register.find_repo_files]
2022-08-30 08:11:03,043 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159411
2022-08-30 08:11:03,044 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159412
2022-08-30 08:11:03,725 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847058123] time[5601] tag[omero.repo.create_original_file.save]
2022-08-30 08:11:03,725 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847063725] time[0] tag[omero.repo.create_original_file.internal_mkdir]
2022-08-30 08:11:03,725 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847063725] time[0] tag[omero.repo.internal_register.load]
2022-08-30 08:11:03,725 INFO  [                 org.perf4j.TimingLogger] (l.Server-3) start[1661847063725] time[0] tag[omero.repo.internal_register.find_repo_files]
2022-08-30 08:11:11,013 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159413
2022-08-30 08:11:11,014 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159414
2022-08-30 08:11:11,014 INFO  [       ome.security.basic.CurrentDetails] (l.Server-3) Adding log:INSERT,class ome.model.core.OriginalFile,42159415

Finally...

2022-08-31 09:10:55,720 497909013  [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_COMPLETE: /nfs/bioimage/drop/idr0125-way-cellpainting/s3-20220725/SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr/P/9/.zgroup
2022-09-01 13:47:03,773 600877066  [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END
2022-09-01 13:47:13,342 600886635  [2-thread-1] ERROR        ome.formats.importer.ImportLibrary - Error on import
java.lang.OutOfMemoryError: Java heap space
        at omero.model.PermissionsI$1.create(PermissionsI.java:38)
        at IceInternal.BasicStream$EncapsDecoder.newInstance(BasicStream.java:3044)
        at IceInternal.BasicStream$EncapsDecoder11.readInstance(BasicStream.java:4013)
        at IceInternal.BasicStream$EncapsDecoder11.readObject(BasicStream.java:3582)
        at IceInternal.BasicStream.readObject(BasicStream.java:2262)
        at omero.sys.EventContext.__readImpl(EventContext.java:217)
        at Ice.ObjectImpl.__read(ObjectImpl.java:368)
        at IceInternal.BasicStream$EncapsDecoder.unmarshal(BasicStream.java:3138)
        at IceInternal.BasicStream$EncapsDecoder11.readInstance(BasicStream.java:4070)
        at IceInternal.BasicStream$EncapsDecoder11.readObject(BasicStream.java:3582)
        at IceInternal.BasicStream.readObject(BasicStream.java:2262)
        at omero.model.Details.__readImpl(Details.java:525)
        at Ice.ObjectImpl.__read(ObjectImpl.java:368)
        at IceInternal.BasicStream$EncapsDecoder.unmarshal(BasicStream.java:3138)
        at IceInternal.BasicStream$EncapsDecoder11.readInstance(BasicStream.java:4070)
        at IceInternal.BasicStream$EncapsDecoder11.readObject(BasicStream.java:3582)
        at IceInternal.BasicStream.readObject(BasicStream.java:2262)
        at omero.model.IObject.__readImpl(IObject.java:521)
        at omero.model.FilesetEntry.__readImpl(FilesetEntry.java:456)
        at Ice.ObjectImpl.__read(ObjectImpl.java:368)
        at IceInternal.BasicStream$EncapsDecoder.unmarshal(BasicStream.java:3138)
        at IceInternal.BasicStream$EncapsDecoder11.readInstance(BasicStream.java:4070)
        at IceInternal.BasicStream$EncapsDecoder11.readObject(BasicStream.java:3582)
        at IceInternal.BasicStream.readObject(BasicStream.java:2262)
        at omero.model.FilesetUsedFilesSeqHelper.read(FilesetUsedFilesSeqHelper.java:52)
        at omero.model.Fileset.__readImpl(Fileset.java:1678)
        at Ice.ObjectImpl.__read(ObjectImpl.java:368)
        at IceInternal.BasicStream$EncapsDecoder.unmarshal(BasicStream.java:3138)
        at IceInternal.BasicStream$EncapsDecoder11.readInstance(BasicStream.java:4070)
        at IceInternal.BasicStream$EncapsDecoder11.readObject(BasicStream.java:3582)
        at IceInternal.BasicStream.readObject(BasicStream.java:2262)
        at omero.model.FilesetJobLink.__readImpl(FilesetJobLink.java:419)
2022-09-01 13:47:13,344 600886637  [2-thread-1] INFO         ome.formats.importer.ImportLibrary - Exiting on error

will-moore commented 2 years ago

Using an alternative approach as suggested by @sbesson and @joshmoore, I tried importing a "metadata only" NGFF plate (where all the binary chunks were missing but all other files were present. This took about 3 hours. Then I sync'd the chunks from s3 "in place" and also manually created symlinks in the OMERO Managed repo (since the data was imported "in-place" but OMERO only creates symlinks at import time for the files it know about. Instead of creating symlinks for each chunk, I created links at the 'row' level (probably could have done this with a single symlink at the plate level?):

e.g.

$ cd /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-8/2022-09/28/09-25-53.378/SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr/

$ rm -rf A/
$ ln -s /uod/idr/filesets/idr0125-way-cellpainting/s3-20220927/SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr/A A

This allowed me to view images in webclient, set rendering settings and regenerate thumbnails.

However, this process is not suitable for production, so it would be useful to emulate these steps as an option in Bio-Formats and OMERO import options.

e.g. $ omero import --metadata_only --transfer=ln_s --skip=all --depth=100 /path/to/SQ00015098__2016-06-08T18_43_42-Measurement1.ome.zarr/

joshmoore commented 2 years ago

Thanks for the validation, @will-moore!

will-moore commented 1 year ago

As discussed on IDR call today... We are now converting various submissions with custom formats (mostly .pattern files) into NGFF data. We are facing a similar issue to that reported above for idr0125 - too many chunk files are causing import of idr0013 and idr0015 See https://github.com/IDR/idr-metadata/issues/644 and https://github.com/IDR/idr-metadata/issues/645

This current workaround is to create a 'metadata-only' copy of the Plate (no chunks), in-place import it and then replace the Plate in ManagedRepo with a symlink to the original plate (with chunks). This is kinda painful, so it would be nice to be able to directly import the original Plate, but for OMERO to NOT create an OriginalFile for every chunk (ignore chunks until we actually want to load pixel data).

cc @jburel

will-moore commented 1 year ago

This should be fixed by https://github.com/ome/ZarrReader/pull/41

ome / ZarrReader

NGFF Plate import - slow and OOM due to many files #40