ome / ZarrReader

Reduce number of fields in memo file #78

Closed dgault closed 9 months ago

dgault commented 9 months ago

This PR is for testing whether the size of the memo file can be reduced.

will-moore commented 9 months ago

Going to test today's ZarrReader build on idr0138-pilot with idr0090 NGFF data. No memo files generated yet - NGFF data was created but not viewed yet.

Update ZarrReader... as omero-server...

wget https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-build/694/default/artifact/bio-formats-build/ZarrReader/target/OMEZarrReader-0.4.1-SNAPSHOT-jar-with-dependencies.jar
mv OMEZarrReader-0.4.1-SNAPSHOT-jar-with-dependencies.jar OMEZarrReader_p772_b694.jar
rm OMERO.server/lib/server/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar
rm: remove write-protected regular file ‘OMERO.server/lib/server/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar’? y
rm OMERO.server/lib/client/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar
rm: remove write-protected regular file ‘OMERO.server/lib/client/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar’? y
cp OMEZarrReader_p772_b694.jar OMERO.server/lib/server/
cp OMEZarrReader_p772_b694.jar OMERO.server/lib/client/

restart..

Then view idr0090 first plate... - http://localhost:1080/webclient/?show=image-12539701

EDIT: First plate isn't a great choice as it's from omero-cli-zarr so won't have any StructuredAnnotations to be excluded from memo file.

14:38 - view image from 2nd plate to start memo generation...

Previous memo file generation time for idr0090 was 111 minutes ...

will-moore commented 9 months ago

On idr-testing, compare sizes of memo from omero-cli-zarr data (Plate 1)... (10M)

[wmoore@test120-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-5/2021-02/18/20-50-17.861_mkngff/a5a2714b-bfbf-4251-95ac-5319fda4bf69.zarr/
total 10M
drwxrwxr-x. 2 omero-server omero-server  29 Dec 16 16:51 .
drwxrwxr-x. 3 omero-server omero-server  55 Dec 15 14:52 ..
-rw-rw-r--. 1 omero-server omero-server 10M Dec 16 16:51 ..zattrs.bfmemo

compared with the memo from bioformats2raw data (Plate 2 of idr0090)... (85M)

[wmoore@test120-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/.METADATA.ome.xml.bfmemo 
-rw-rw-r--. 1 omero-server omero-server 85M Dec 15 16:30 /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/.METADATA.ome.xml.bfmemo

Compare that with idr0138-pilot, using this PR: (Plate 2 of idr0090)... (16M)

(base) [wmoore@pilot-idr0138-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/
total 16M
drwxrwxr-x. 2 omero-server omero-server  38 Feb 12 15:19 .
drwxrwxr-x. 3 omero-server omero-server  17 Feb 12 14:32 ..
-rw-rw-r--. 1 omero-server omero-server 16M Feb 12 15:19 .METADATA.ome.xml.bfmemo
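
Going by the rounded `ls -h` values above, 85M down to 16M is roughly an 81% reduction for Plate 2. For a less hand-picked comparison, a minimal sketch along these lines could list every memo file under a cache directory with its exact size in bytes. The function names `list_memo_sizes` and `percent_reduction` are hypothetical helpers, not part of OMERO or Bio-Formats.

```shell
# List every *.bfmemo file under the directory given as $1,
# one "size-in-bytes<TAB>path" line per file, largest first.
list_memo_sizes() {
  find "$1" -type f -name '*.bfmemo' | while IFS= read -r f; do
    printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
  done | sort -rn
}

# Percentage reduction from $1 (before) to $2 (after), rounded to a whole number.
# Both arguments must be in the same unit.
percent_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}
```

For example, `list_memo_sizes /data/OMERO/BioFormatsCache` would cover the cache path shown in the listings, and `percent_reduction 85 16` gives the figure quoted here.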

Comparing speeds of rendering images for that plate, it feels a bit faster on idr0138 with this PR than on idr-next. E.g. 1-2 seconds to render an image on idr0138 compared with 4-5 seconds on idr-next (both have quite a bit of variation).

Sampling loadMemo times for idr0090 plate2 on idr-next:

[wmoore@prod120-omeroreadonly-2 ~]$ grep -A 2 -B 2 "07f2244a-0fae-4a06-b0e8" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep "loadMemo"
2024-02-12 15:35:10,013 DEBUG [                   loci.formats.Memoizer] (l.Server-0) start[1707752108452] time[1560] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:35:12,946 DEBUG [                   loci.formats.Memoizer] (l.Server-1) start[1707752110895] time[2051] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:35:15,269 DEBUG [                   loci.formats.Memoizer] (l.Server-9) start[1707752113852] time[1417] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:35:29,611 DEBUG [                   loci.formats.Memoizer] (.Server-18) start[1707752128258] time[1353] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:35:44,695 DEBUG [                   loci.formats.Memoizer] (.Server-15) start[1707752143249] time[1446] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:35:51,152 DEBUG [                   loci.formats.Memoizer] (.Server-16) start[1707752149793] time[1359] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:42:36,578 DEBUG [                   loci.formats.Memoizer] (.Server-22) start[1707752555020] time[1558] tag[loci.formats.Memoizer.loadMemo]

and the same on idr0138 (quite a bit faster) 👍

2024-02-12 15:39:00,767 DEBUG [                   loci.formats.Memoizer] (l.Server-5) start[1707752340331] time[436] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:39:09,584 DEBUG [                   loci.formats.Memoizer] (l.Server-6) start[1707752349320] time[263] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:39:14,092 DEBUG [                   loci.formats.Memoizer] (l.Server-6) start[1707752353825] time[267] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:41:55,212 DEBUG [                   loci.formats.Memoizer] (l.Server-0) start[1707752514937] time[275] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:42:08,850 DEBUG [                   loci.formats.Memoizer] (l.Server-5) start[1707752528472] time[378] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:42:17,913 DEBUG [                   loci.formats.Memoizer] (l.Server-3) start[1707752537490] time[422] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:42:22,790 DEBUG [                   loci.formats.Memoizer] (l.Server-3) start[1707752542530] time[260] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:44:14,196 DEBUG [                   loci.formats.Memoizer] (l.Server-7) start[1707752653908] time[287] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:44:15,373 DEBUG [                   loci.formats.Memoizer] (l.Server-3) start[1707752655078] time[294] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:44:15,832 DEBUG [                   loci.formats.Memoizer] (l.Server-8) start[1707752655570] time[261] tag[loci.formats.Memoizer.loadMemo]
2024-02-12 15:44:16,459 DEBUG [                   loci.formats.Memoizer] (l.Server-2) start[1707752656190] time[269] tag[loci.formats.Memoizer.loadMemo]
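
To put numbers on "quite a bit faster", the `time[...]` fields can be summarised rather than eyeballed. This is a minimal sketch: `summarise_loadmemo` is a hypothetical helper name, and it assumes the `time[...]` value is in milliseconds. For the samples above the mean works out to about 1535 on idr-next (7 samples) and about 310 on idr0138 (11 samples).

```shell
# Read Blitz log lines on stdin and summarise the time[...] field of
# loci.formats.Memoizer.loadMemo entries: sample count, min, max and mean.
summarise_loadmemo() {
  grep -F 'tag[loci.formats.Memoizer.loadMemo]' |
    sed -n 's/.*time\[\([0-9][0-9]*\)].*/\1/p' |
    awk 'NR == 1 { min = $1; max = $1 }
         { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
         END { if (NR) printf "n=%d min=%d max=%d mean=%.0f\n", NR, min, max, sum / NR }'
}
```

It can be fed from the same pipeline as above, e.g. `grep "07f2244a-0fae-4a06-b0e8" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | summarise_loadmemo`.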

will-moore commented 9 months ago

Memo file from last week's testing is still on idr0138-pilot, for 2nd Plate of idr0090...

(base) [wmoore@pilot-idr0138-omeroreadwrite ~]$ ls -alh /data/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/
total 16M
drwxrwxr-x. 2 omero-server omero-server  38 Feb 12 15:19 .
drwxrwxr-x. 3 omero-server omero-server  17 Feb 12 14:32 ..
-rw-rw-r--. 1 omero-server omero-server 16M Feb 12 15:19 .METADATA.ome.xml.bfmemo

but when I view with webclient it gets deleted...

(base) [wmoore@pilot-idr0138-omeroreadwrite ~]$ ls -alh /data/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/
total 0
drwxrwxr-x. 2 omero-server omero-server  6 Feb 19 14:18 .
drwxrwxr-x. 3 omero-server omero-server 17 Feb 12 14:32 ..

will-moore commented 9 months ago

On idr-testing, progress on idr0090's memo generation:

[wmoore@test120-proxy ~]$ for i in $(cat idr0090_ids.txt); do echo $i && grep ok /tmp/cache/1/$i/* ; done
Image:12544757
Image:12550005
/tmp/cache/1/Image:12550005/stdout:ok: 12550005 14.283611059188843 
Image:12551317
Image:12546037
/tmp/cache/1/Image:12546037/stdout:ok: 12546037 5.984617710113525 
Image:12550677
Image:12553269
/tmp/cache/1/Image:12553269/stdout:ok: 12553269 11.751329183578491 
Image:12554229
/tmp/cache/1/Image:12554229/stdout:ok: 12554229 9.055531740188599 
Image:12547509
Image:12554709
Image:12541269
/tmp/cache/1/Image:12541269/stdout:ok: 12541269 12.111469507217407 
Image:12553749
Image:12545749
/tmp/cache/1/Image:12545749/stdout:ok: 12545749 6.291058540344238 
Image:12549141
/tmp/cache/1/Image:12549141/stdout:ok: 12549141 16.807228803634644 
Image:12552053
Image:12554997
Image:12539701
/tmp/cache/1/Image:12539701/stdout:ok: 12539701 11.105220079421997 
Image:12552789
/tmp/cache/1/Image:12552789/stdout:ok: 12552789 14.844104051589966 
Image:12543765
Image:12548245
Image:12542037
/tmp/cache/1/Image:12542037/stdout:ok: 12542037 15.375039339065552 
Image:12546773
Image:12543029
/tmp/cache/1/Image:12543029/stdout:ok: 12543029 8.997201204299927 
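
In the listing above, 11 of the 22 image IDs have an `ok` line so far. A small sketch like the following could print that tally directly instead of counting by eye. `memo_progress` is a hypothetical helper, and it assumes the `ok:` line is written to a file named `stdout` in each image's cache directory, as the output above suggests.

```shell
# Count how many image IDs listed in $1 have an "ok" line in
# $2/<id>/stdout, and print "<done> of <total> images done".
memo_progress() {
  ids_file=$1
  cache_dir=$2
  total=0
  done_count=0
  while IFS= read -r id; do
    [ -n "$id" ] || continue            # skip blank lines in the ID list
    total=$((total + 1))
    if grep -q '^ok' "$cache_dir/$id/stdout" 2>/dev/null; then
      done_count=$((done_count + 1))
    fi
  done < "$ids_file"
  echo "$done_count of $total images done"
}
```

With the files used in the loop above, the call would be `memo_progress idr0090_ids.txt /tmp/cache/1`.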

jburel commented 9 months ago

Tested on the pilot. Merging

will-moore commented 9 months ago

Noting down some memo file sizes before this PR so we can compare after upgrade... SPW data only...

[wmoore@test120-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/04-15-50.380_mkngff/07f2244a-0fae-4a06-b0e8-6bfa586793d0.zarr/OME/
-rw-rw-r--. 1 omero-server omero-server 85M Feb 19 16:54 .METADATA.ome.xml.bfmemo

[wmoore@test120-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2021-02/19/09-36-19.689_mkngff/e62717ea-b060-48e5-8cea-7e4b82f009f4.zarr/OME/
total 109M
-rw-rw-r--. 1 omero-server omero-server 109M Feb 19 22:47 .METADATA.ome.xml.bfmemo

[wmoore@test120-omeroreadwrite ~]$ ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2021-02/19/12-14-48.182_mkngff/a666e078-3417-4fa6-a391-c2d056c8c6e2.zarr/
-rw-rw-r--. 1 omero-server omero-server 6.4M Feb 19 22:44 ..zattrs.bfmemo

will-moore commented 9 months ago

idr0004: 162K -> 147K

ls -alh /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-25-30.185_mkngff/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr
total 148K
-rw-rw-r--. 1 omero-server omero-server 147K Feb 20 14:00 ..zattrs.bfmemo