Open will-moore opened 8 months ago
$ ssh pilot-zarr1-dev
screen -r idr0001
cd /data/idr0003
conda activate bioformats2raw
~/bioformats2raw-0.7.0/bin/bioformats2raw --memo-directory ../memo /uod/idr/filesets/idr0003-breker-plasticity/201301120/Images/DTT/p1/experiment_descriptor.xml p1.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp3176581939484032263/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.esotericsoftware.reflectasm.AccessClassLoader (file:/home/wmoore/bioformats2raw-0.7.0/lib/reflectasm-1.11.9.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.esotericsoftware.reflectasm.AccessClassLoader
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
The warnings above look like an error, but the conversion seems to have worked OK:
(bioformats2raw) [wmoore@pilot-zarr1-dev idr0003]$ find ./ -name .zattrs
...
./p1.ome.zarr/P/24/2/.zattrs
./p1.ome.zarr/P/24/.zattrs
./p1.ome.zarr/.zattrs
(bioformats2raw) [wmoore@pilot-zarr1-dev idr0003]$ find ./ -name .zattrs | wc
1538 1538 43248
$ zip -r p1.ome.zarr.zip p1.ome.zarr
Download (1.4 G) and upload to idr-testing...
$ rsync -rvP pilot-zarr1-dev:/data/idr0003/p1.ome.zarr.zip ./
$ rsync -rvP p1.ome.zarr.zip idr-testing.openmicroscopy.org:/home/wmoore/
$ ssh -A idr-testing.openmicroscopy.org
$ rsync -rvP p1.ome.zarr.zip omeroreadwrite:/home/wmoore/
Import..
$ ssh omeroreadwrite
$ unzip p1.ome.zarr.zip
(venv3) [wmoore@test120-omeroreadwrite ~]$ omero import --depth 20 p1.ome.zarr
2024-03-01 11:55:41,047 889 [ main] INFO ome.formats.importer.ImportConfig - OMERO.blitz Version: 5.7.2
2024-03-01 11:55:41,070 912 [ main] INFO ome.formats.importer.ImportConfig - Bioformats version: 7.1.0 revision: 05c7b2413cfad19a73b619c61ddf77ca2d038ce7 date: 11 December 2023
2024-03-01 11:55:41,391 1233 [ main] INFO formats.importer.cli.CommandLineImporter - Log levels -- Bio-Formats: ERROR OMERO.importer: INFO
2024-03-01 11:55:42,125 1967 [ main] INFO ome.formats.importer.ImportCandidates - Depth: 20 Metadata Level: MINIMUM
2024-03-01 11:55:58,163 18005 [ main] INFO ome.formats.importer.ImportCandidates - 16917 file(s) parsed into 1 group(s) with 1 call(s) to setId in 11022ms. (16037ms total) [0 unknowns]
2024-03-01 11:55:59,202 19044 [ main] INFO ome.formats.OMEROMetadataStoreClient - Attempting initial SSL connection to localhost:4064
2024-03-01 11:56:01,233 21075 [ main] INFO ome.formats.OMEROMetadataStoreClient - Insecure connection requested, falling back
2024-03-01 11:56:02,035 21877 [ main] INFO ome.formats.OMEROMetadataStoreClient - Pinging session every 300s.
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - Server: 5.6.10
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - Client: 5.7.2
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - Java Version: 1.8.0_402
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - OS Name: Linux
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - OS Arch: amd64
2024-03-01 11:56:02,055 21897 [ main] INFO ome.formats.OMEROMetadataStoreClient - OS Version: 3.10.0-1160.108.1.el7.x86_64
2024-03-01 11:56:02,604 22446 [2-thread-1] INFO ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_PREPARATION
...
2024-03-02 01:38:53,782 49393624 [3-thread-1] INFO ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_COMPLETE: /home/wmoore/p1.ome.zarr/.zattrs
2024-03-02 02:01:42,222 50762064 [2-thread-1] INFO ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END
2024-03-02 02:01:43,318 50763160 [2-thread-1] INFO ormats.importer.cli.LoggingImportMonitor - IMPORT_STARTED Logfile: 64420961
2024-03-02 02:05:00,964 50960806 [l.Client-0] INFO ormats.importer.cli.LoggingImportMonitor - METADATA_IMPORTED Step: 1 of 5 Logfile: 64420961
2024-03-02 02:08:22,203 51162045 [l.Client-4] INFO ormats.importer.cli.LoggingImportMonitor - PIXELDATA_PROCESSED Step: 2 of 5 Logfile: 64420961
2024-03-02 02:12:29,399 51409241 [l.Client-5] INFO ormats.importer.cli.LoggingImportMonitor - THUMBNAILS_GENERATED Step: 3 of 5 Logfile: 64420961
2024-03-02 02:12:29,689 51409531 [l.Client-6] INFO ormats.importer.cli.LoggingImportMonitor - METADATA_PROCESSED Step: 4 of 5 Logfile: 64420961
2024-03-02 02:12:29,769 51409611 [l.Client-5] INFO ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5 Logfile: 64420961
2024-03-02 02:12:31,373 51411215 [l.Client-6] INFO ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /home/wmoore/p1.ome.zarr/OME/METADATA.ome.xml
Plate:10551
Other imported objects:
Fileset:6317541
==> Summary
16917 files uploaded, 1 fileset, 1 plate created, 1152 images imported, 0 errors in 14:16:28.893
Wow - took 14 hours to import!
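To see where the 14 hours went, the log lines above can be split into phases. This is a rough sketch that assumes the second column of each log line is milliseconds elapsed since the importer started (it tracks the timestamps, so that seems likely). The values are copied from the log; the variable names are just for illustration.

```shell
# Millisecond offsets copied from the second column of the import log above.
upload_start_ms=22446        # FILESET_UPLOAD_PREPARATION
upload_end_ms=50762064       # FILESET_UPLOAD_END
import_start_ms=50763160     # IMPORT_STARTED
import_done_ms=51411215      # IMPORT_DONE
files=16917                  # "16917 files uploaded" in the summary

# Integer arithmetic only, so sub-second precision is dropped.
upload_s=$(( (upload_end_ms - upload_start_ms) / 1000 ))
import_s=$(( (import_done_ms - import_start_ms) / 1000 ))
ms_per_file=$(( (upload_end_ms - upload_start_ms) / files ))

printf 'upload: %dh %dm %ds\n' $((upload_s / 3600)) $((upload_s % 3600 / 60)) $((upload_s % 60))
printf 'server-side import steps: %dm %ds\n' $((import_s / 60)) $((import_s % 60))
printf 'upload time per file: ~%d ms\n' "$ms_per_file"
```

By this reckoning the upload phase accounts for about 14h 5m (roughly 3 seconds per file), and the five server-side import steps for under 11 minutes.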
After the IDR meeting today, 5 of us spent 20 minutes opening many images from that plate without seeing any errors, and performance was good/acceptable. cc @francesw @jburel
Downloaded 3 plate .zip files from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0010
Uploaded to idr-testing:omeroreadwrite, placed in a new dir at /data/ngff, unzipped and owned by omero-server:
$ pwd
/data/ngff
$ ls -lh
total 1.4G
drwxrwxr-x. 15 omero-server omero-server 219 Jul 10 2023 101-24.ome.zarr
-rw-r--r--. 1 omero-server wmoore 457M Mar 4 12:22 101-24.ome.zarr.zip
drwxrwxr-x. 15 omero-server omero-server 219 Jul 10 2023 10-34.ome.zarr
-rw-r--r--. 1 omero-server wmoore 455M Mar 4 12:21 10-34.ome.zarr.zip
drwxrwxr-x. 15 omero-server omero-server 219 Jul 10 2023 103.ome.zarr
-rw-r--r--. 1 omero-server wmoore 461M Mar 4 12:22 103.ome.zarr.zip
For plate 10-34, find the location in the ManagedRepo from the webclient... Can see the symlink to s3:
bash-4.2$ ls -lh /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/
total 4.0K
lrwxrwxrwx. 1 omero-server omero-server 109 Dec 6 11:35 2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr -> /bia-integrator-data/S-BIAD885/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr
-rw-r--r--. 1 omero-server omero-server 49 Dec 6 11:35 2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr.bfoptions
Update symlink (as omero-server):
rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr
ln -s /data/ngff/10-34.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr
Looks good:
$ ls -lh /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/
total 4.0K
lrwxrwxrwx. 1 omero-server omero-server 25 Mar 4 12:40 2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr -> /data/ngff/10-34.ome.zarr
-rw-r--r--. 1 omero-server omero-server 49 Dec 6 11:35 2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr.bfoptions
$ ls /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr/
A B C D E F G H I J K L OME
Repeating for the other 2 plates downloaded above...
Plate 101-24:
bash-4.2$ rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-06-31.113_mkngff/49150a5d-8fc2-499a-bbc6-4a3eed2d44b1.zarr
bash-4.2$ ln -s /data/ngff/101-24.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-06-31.113_mkngff/49150a5d-8fc2-499a-bbc6-4a3eed2d44b1.zarr
bash-4.2$ ls /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-06-31.113_mkngff/49150a5d-8fc2-499a-bbc6-4a3eed2d44b1.zarr
A B C D E F G H I J K L OME
Plate 103:
bash-4.2$ rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-26-08.432_mkngff/1fab1705-9561-4689-891d-e039c4ec3076.zarr
bash-4.2$ ln -s /data/ngff/103.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-26-08.432_mkngff/1fab1705-9561-4689-891d-e039c4ec3076.zarr
bash-4.2$ ls /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-26-08.432_mkngff/1fab1705-9561-4689-891d-e039c4ec3076.zarr
A B C D E F G H I J K L OME
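The rm + ln -s pair used for each plate could be wrapped in a small helper. A minimal sketch, assuming a hypothetical function name relink; it uses ln -sfn to replace the link in one step instead of the separate rm, and refuses to act unless the path is already a symlink and the new target exists.

```shell
# relink TARGET LINK: repoint an existing symlink at a new directory.
# "relink" is a hypothetical helper name, not part of any OMERO tooling.
relink() {
  local target=$1 link=$2
  # Only ever replace something that is already a symlink, so a real
  # data directory cannot be clobbered by a typo.
  [ -L "$link" ] || { echo "not a symlink: $link" >&2; return 1; }
  # The new target must exist before the old link is replaced.
  [ -d "$target" ] || { echo "missing target: $target" >&2; return 1; }
  # -s: symbolic, -f: replace the existing link,
  # -n: treat LINK itself as the destination instead of following it
  #     into the directory it currently points at.
  ln -sfn "$target" "$link"
  # Print where the link now points, for a visual check.
  readlink "$link"
}
```

Usage would then be one call per plate, e.g. relink /data/ngff/103.ome.zarr followed by the fileset path shown above, run as omero-server.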
Since /data/ngff isn't accessible on the omeroreadonly servers, we need a different location, and need to copy the data to all servers...
E.g.
for server in omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do rsync -rvP 101-24.ome.zarr.zip $server:/home/wmoore ; done;
ssh omeroreadonly-1
for z in 101-24.ome.zarr.zip 10-34.ome.zarr.zip 103.ome.zarr.zip; do sudo chown omero-server $z; done
sudo mkdir /ngff && sudo chown -R omero-server /ngff
for z in 101-24.ome.zarr.zip 10-34.ome.zarr.zip 103.ome.zarr.zip; do sudo mv $z /ngff; done
sudo -u omero-server -s
cd /ngff/
for z in 101-24.ome.zarr.zip 10-34.ome.zarr.zip 103.ome.zarr.zip; do unzip $z; done
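The rsync loop above only shows one archive as an example, while the later steps expect all three zips on each server. A sketch of a dry-run listing of every server/archive pair, assuming the same server names and home directory as above; plan_copies is a hypothetical name and it only prints the commands.

```shell
# Server and archive names as used in the commands above.
servers="omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4"
zips="101-24.ome.zarr.zip 10-34.ome.zarr.zip 103.ome.zarr.zip"

# plan_copies: print (do not run) one rsync command per server and archive.
plan_copies() {
  for server in $servers; do
    for z in $zips; do
      echo "rsync -rvP $z $server:/home/wmoore"
    done
  done
}

plan_copies
```

Once the printed list looks right, it could be executed with plan_copies | sh.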
On omeroreadwrite, move data to /ngff and update symlinks...
bash-4.2$ rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr
bash-4.2$ ln -s /ngff/10-34.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/00-27-54.591_mkngff/2726d2ef-2f45-45b6-9d73-68ea1d57c1b6.zarr
bash-4.2$ rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-06-31.113_mkngff/49150a5d-8fc2-499a-bbc6-4a3eed2d44b1.zarr
bash-4.2$ ln -s /ngff/101-24.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-06-31.113_mkngff/49150a5d-8fc2-499a-bbc6-4a3eed2d44b1.zarr
bash-4.2$ rm /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-26-08.432_mkngff/1fab1705-9561-4689-891d-e039c4ec3076.zarr
bash-4.2$ ln -s /ngff/103.ome.zarr /data/OMERO/ManagedRepository/demo_2/2016-05/21/02-26-08.432_mkngff/1fab1705-9561-4689-891d-e039c4ec3076.zarr
Looks good - images are viewable under idr-testing.openmicroscopy.org
Compare formats (on disk)
To compare the performance of NGFF data (ZarrReader) with other formats (both on disk), we want to compare the NGFF version of the data alongside the same data in its original format on the same server.
Choose some data to work with: idr0003 is not too big at 2.3G for a plate. Summary (more details below):
Use render_image to load the initial plane. Plot the average of 25 wells. Times are in milliseconds; error bars are 1 std dev.
Conclusion: NGFF is no slower (maybe faster)?
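For reference, the mean and 1 std dev error bar over a set of per-well timings can be computed along these lines. A sketch only: mean_std is a hypothetical helper, it uses the population standard deviation (divide by n; the plot may have used n-1), and the sample numbers are made up for illustration, not measured data.

```shell
# mean_std: read one timing (ms) per line on stdin, print "mean std".
mean_std() {
  awk '{ n++; sum += $1; sumsq += $1 * $1 }
       END {
         if (n == 0) exit 1              # no input: signal failure
         mean = sum / n
         var = sumsq / n - mean * mean   # population variance
         if (var < 0) var = 0            # guard against rounding error
         printf "%.1f %.1f\n", mean, sqrt(var)
       }'
}

# Hypothetical timings in ms, for illustration only.
printf '%s\n' 200 400 400 400 500 500 700 900 | mean_std
```

In practice the input would be one render_image timing per well (25 lines per format), collected however the requests were timed.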
Compare disk vs s3
We want to test the performance of loading data from s3 compared with loading the same data from local disk. Use idr0010 data since all plates are identical in terms of size etc:
/ngff dir on each idr-testing server
Conclusion: Data access via S3 is slower than on disk: