IDR / omero-mkngff

Plugin to swap OMERO filesets with NGFF
GNU General Public License v2.0

Add --symlink_repo option for creating symlinks from sql command #4

Closed will-moore closed 11 months ago

will-moore commented 1 year ago

Fixes #3.

Usage:

omero mkngff sql --symlink_repo data/OMERO/ManagedRepository 601 /path/to/fileset.zarr

I tried using the OMERO API to look up omero.data.dir, but that requires being logged in as root, which limits its usefulness.

>>> cs = conn.c.sf.getConfigService()
>>> cs.getConfigValues("omero.data.dir")
{}
>>> cs.getConfigDefaults()["omero.data.dir"]
serverExceptionClass = ome.conditions.SecurityViolation
    message = No matching roles found in [Public, user] for session fabc9b94-c6cd-4799-863b-fb1a8f5785cc (allowed: [system])
will-moore commented 1 year ago

Testing on idr0138-pilot... We need to work as the omero-server user so that we have the correct permissions to create symlinks...

sudo -u omero-server -s

# create `mkngff` conda env etc...
pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@symlinks'
$ omero mkngff sql --secret=$SECRET --symlink_repo=/data/OMERO/ManagedRepository 1591302 "/idr0054/zarr/Tonsil 2.ome.zarr/" > idr0054_2.sql
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15 // 15-28-44.081 for fileset 1591302
Using args.symlink_repo: /data/OMERO/ManagedRepository
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15/15-28-44.081
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15/15-28-44.081_converted/idr0054/zarr symlink_container: idr0054/zarr
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15/15-28-44.081_converted/idr0054/zarr/Tonsil 2.ome.zarr -> /idr0054/zarr/Tonsil 2.ome.zarr/
$ psql -U omero -d idr -h 192.168.10.231 -f setup.sql
CREATE FUNCTION
(mkngff) bash-4.2$ psql -U omero -d idr -h 192.168.10.231 -f idr0054_2.sql 
BEGIN
 mkngff_fileset 
----------------
        5811533
(1 row)

COMMIT

Rendering that Image gives:

    serverExceptionClass = ome.conditions.ResourceError
    message = Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15/15-28-44.081_converted/idr0054/zarr/Tonsil 2.ome.zarr/OME/METADATA.ome.xml

which is to be expected, but at least that path (symlink) has been correctly created above...

ls "/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15/15-28-44.081_converted/idr0054/zarr/Tonsil 2.ome.zarr/"
0  OME
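
The directory layout in the log above can be sketched in Python. This is only an illustration inferred from the log lines, not the plugin's actual code: the function names `plan_symlink` and `create_symlink` are hypothetical, and the prefix is passed as the two parts that the log prints separated by ` // `. Unlike the log, the sketch strips the trailing slash from the target before linking.

```python
import os
from pathlib import Path


def plan_symlink(symlink_repo, prefix_path, prefix_name, target):
    """Return (container_dir, link_path) following the layout seen in the log.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    target = target.rstrip("/")
    # "/idr0054/zarr/Tonsil 2.ome.zarr" -> container "idr0054/zarr"
    container = os.path.dirname(target).lstrip("/")
    # The new fileset directory is the old prefix dir with "_converted" appended
    converted = Path(symlink_repo) / prefix_path / (prefix_name + "_converted")
    container_dir = converted / container
    link_path = container_dir / os.path.basename(target)
    return container_dir, link_path


def create_symlink(symlink_repo, prefix_path, prefix_name, target):
    """Create the container directory and the symlink, if not already present."""
    container_dir, link_path = plan_symlink(
        symlink_repo, prefix_path, prefix_name, target
    )
    container_dir.mkdir(parents=True, exist_ok=True)
    if not link_path.is_symlink():
        os.symlink(target.rstrip("/"), link_path)
    return link_path


if __name__ == "__main__":
    _, link = plan_symlink(
        "/data/OMERO/ManagedRepository",
        "demo_2/Blitz-0-Ice.ThreadPool.Server-10/2019-03/15",
        "15-28-44.081",
        "/idr0054/zarr/Tonsil 2.ome.zarr/",
    )
    print(link)
```

Creating the symlink requires write access to the ManagedRepository, which is why the commands above are run as omero-server.
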
will-moore commented 1 year ago

With those fixes, tested on the `Tonsil 3` image...

$ omero mkngff sql --secret=$SECRET --symlink_repo=/data/OMERO/ManagedRepository 1591303 "/idr0054/zarr/Tonsil 3.ome.zarr/" > idr0054_3.sql
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-7/2019-03/15 // 15-28-58.030 for fileset 1591303
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2019-03/15/15-28-58.030
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2019-03/15/15-28-58.030_converted/idr0054/zarr
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2019-03/15/15-28-58.030_converted/idr0054/zarr/Tonsil 3.ome.zarr -> /idr0054/zarr/Tonsil 3.ome.zarr/

$ psql -U omero -d idr -h 192.168.10.231 -f idr0054_3.sql 
BEGIN
 mkngff_fileset 
----------------
        5811535
(1 row)

COMMIT

This works!! Image renders without errors! 👍

will-moore commented 1 year ago

Testing on a Plate....

Find Fileset ID for idr0012 HT01 plate...

$ psql -U omero -d idr -h 192.168.10.231
idr=> select fileset from Image where id=14058307;
 fileset 
---------
 5808582
(1 row)
$ omero mkngff sql --secret=$SECRET --symlink_repo=/data/OMERO/ManagedRepository 5808582 "/idr0012/ngff/HT01.ome.zarr/" > idr0012_HT01.sql
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/03 // 12-15-10.880 for fileset 5808582
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/03/12-15-10.880
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/03/12-15-10.880_converted/idr0012/ngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/03/12-15-10.880_converted/idr0012/ngff/HT01.ome.zarr -> /idr0012/ngff/HT01.ome.zarr/

$ psql -U omero -d idr -h 192.168.10.231 -f idr0012_HT01.sql 
BEGIN
 mkngff_fileset 
----------------
        5811537
(1 row)
COMMIT

Fileset now has 45000 files. Rendering the image is taking a very long time....

UPDATE...

Came back a while later and managed to get images to render and to set rendering settings for the Plate. However, this looks different than on IDR itself as the Well positions are still not correct:

Screenshot 2023-08-17 at 15 05 18

https://github.com/ome/ZarrReader/pull/53

EDIT: I think this issue was due to running mkngff on a Fileset created by Importing OME-NGFF. Issue looks fixed at https://github.com/IDR/idr-metadata/issues/643#issuecomment-1697383363

will-moore commented 1 year ago

Try with idr0036 plate... named 20585

idr=> select fileset from Image where id=1895787;
 fileset 
---------
   20253
$ omero mkngff sql --secret=$SECRET --symlink_repo=/data/OMERO/ManagedRepository 20253 "/idr0036/zarr/20585.zarr/" > idr0036_20585.sql

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/19 // 00-15-38.492 for fileset 20253
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/19/00-15-38.492
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/19/00-15-38.492_converted/idr0036/zarr
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/19/00-15-38.492_converted/idr0036/zarr/20585.zarr -> /idr0036/zarr/20585.zarr/

But something is wrong with the SECRET, even though I checked that it's still valid...

(mkngff) bash-4.2$ psql -U omero -d idr -h 192.168.10.231 -f idr0036_20585.sql 
BEGIN
psql:idr0036_20585.sql:250032: ERROR:  cannot set original repo property without secret key
CONTEXT:  PL/pgSQL function _protect_originalfile_repo_insert() line 28 at RAISE
SQL statement "insert into originalfile
          (id, permissions, creation_id, group_id, owner_id, update_id, mimetype, repo, path, name)
          values (nextval('seq_originalfile'), old_perms, new_event, old_group, old_owner, new_event,
            info[i][3], repo, info[i][1], uuid || info[i][2])
          returning id"
PL/pgSQL function mkngff_fileset(bigint,character varying,character varying,character varying,text[]) line 42 at SQL statement
ROLLBACK
(mkngff) bash-4.2$ psql -U omero -d idr -h 192.168.10.102
psql (11.16, server 11.14)
Type "help" for help.

idr=> select uuid from (select * from session where node = 0 and owner = 0 and defaulteventtype = 'Sessions' order by id desc limit 1) x order by x.id asc limit 1;
                 uuid                 
--------------------------------------
 4b358149-af39-49f0-882d-10884fab7133
(1 row)

(mkngff) bash-4.2$ cat idr0036_20585.sql | grep 4b358149-af39-49f0-882d-10884fab7133
      '4b358149-af39-49f0-882d-10884fab7133',
will-moore commented 1 year ago

When mounted via goofys, zarrs will be at path like: /bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr/

With scripts at https://github.com/IDR/idr-utils/pull/56 we can generate a csv like idr0051.csv:

idr0051/180712_H2B_22ss_Courtney_p00_c00_reg_preview.klb.ome.zarr,S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97,604306
idr0051/embryo_dmso_2_new_17-00-44_p00_c00_reg_preview.klb.ome.zarr,S-BIAD815/b2633930-86b0-489e-a845-d2a7afe6ff15,604309
idr0051/180712_H2B_22ss_Courtney1_20180712-163837_p00_c00_preview.ome.zarr,S-BIAD815/c49efcfd-e767-4ae5-adbf-299cafd92120,604305
idr0051/2018-06-28_21ss_DMSO_TF_20180628-185945_p00_c00_reg_preview.ome.zarr,S-BIAD815/e12a8e2a-4fce-4579-a78b-b0c4597c3ada,604307

parse this and run mkngff...

$ for r in $(cat idr0051.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3)
>   omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
> done

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26 // 10-39-49.639 for fileset 604306
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639_converted/bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639_converted/bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr -> /bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-18/2018-11/26 // 10-44-37.527 for fileset 604309
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-18/2018-11/26/10-44-37.527
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-18/2018-11/26/10-44-37.527_converted/bia-integrator-data/S-BIAD815/b2633930-86b0-489e-a845-d2a7afe6ff15
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-18/2018-11/26/10-44-37.527_converted/bia-integrator-data/S-BIAD815/b2633930-86b0-489e-a845-d2a7afe6ff15/b2633930-86b0-489e-a845-d2a7afe6ff15.zarr -> /bia-integrator-data/S-BIAD815/b2633930-86b0-489e-a845-d2a7afe6ff15/b2633930-86b0-489e-a845-d2a7afe6ff15.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-24/2018-11/26 // 10-39-10.551 for fileset 604305
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-24/2018-11/26/10-39-10.551
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-24/2018-11/26/10-39-10.551_converted/bia-integrator-data/S-BIAD815/c49efcfd-e767-4ae5-adbf-299cafd92120
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-24/2018-11/26/10-39-10.551_converted/bia-integrator-data/S-BIAD815/c49efcfd-e767-4ae5-adbf-299cafd92120/c49efcfd-e767-4ae5-adbf-299cafd92120.zarr -> /bia-integrator-data/S-BIAD815/c49efcfd-e767-4ae5-adbf-299cafd92120/c49efcfd-e767-4ae5-adbf-299cafd92120.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-11/2018-11/26 // 10-40-42.186 for fileset 604307
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2018-11/26/10-40-42.186
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2018-11/26/10-40-42.186_converted/bia-integrator-data/S-BIAD815/e12a8e2a-4fce-4579-a78b-b0c4597c3ada
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2018-11/26/10-40-42.186_converted/bia-integrator-data/S-BIAD815/e12a8e2a-4fce-4579-a78b-b0c4597c3ada/e12a8e2a-4fce-4579-a78b-b0c4597c3ada.zarr -> /bia-integrator-data/S-BIAD815/e12a8e2a-4fce-4579-a78b-b0c4597c3ada/e12a8e2a-4fce-4579-a78b-b0c4597c3ada.zarr
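
The same CSV-to-command mapping as the shell loop above could be sketched in Python. This is a rough equivalent under assumptions: `build_commands` is a hypothetical helper, it only builds the argument lists rather than running them, and it uses just the `omero mkngff sql` options already shown in this thread. Parsing with the `csv` module avoids relying on shell word splitting.

```python
import csv
import io
import shlex


def build_commands(csv_text, secret,
                   symlink_repo="/data/OMERO/ManagedRepository",
                   mount="/bia-integrator-data"):
    """Return a list of (argv, output_sql_name), one per CSV row.

    Rows are expected as: idr_path,bia_path,fileset_id (as in idr0051.csv).
    """
    commands = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        _idr_path, biapath, fsid = row
        # "S-BIAD815/<uuid>" -> "<uuid>"
        uuid = biapath.split("/")[1]
        zarr = f"{mount}/{biapath}/{uuid}.zarr"
        argv = [
            "omero", "mkngff", "sql",
            "--symlink_repo", symlink_repo,
            f"--secret={secret}",
            fsid, zarr,
        ]
        commands.append((argv, f"{fsid}.sql"))
    return commands


if __name__ == "__main__":
    sample = (
        "idr0051/180712_H2B_22ss_Courtney_p00_c00_reg_preview.klb.ome.zarr,"
        "S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97,604306\n"
    )
    for argv, out in build_commands(sample, secret="SECRET"):
        # Print the command for review instead of executing it
        print(shlex.join(argv), ">", out)
```

Each argument list could then be passed to `subprocess.run` with stdout redirected to the named `.sql` file, once the printed commands have been checked.
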
will-moore commented 1 year ago

This actually worked OK!

But we have very large numbers of OriginalFiles in new Filesets.

http://localhost:1080/webclient/?show=image-4007821 has 398383 Files!

Screenshot 2023-08-18 at 17 46 57
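
To gauge in advance how large a new Fileset might become, one option is to count the entries under a Zarr directory before running mkngff. The sketch below is a hypothetical helper (`count_entries` is not part of the plugin), and whether each file or directory ends up as exactly one OriginalFile row is an assumption that this thread does not confirm.

```python
import os


def count_entries(root):
    """Count files and sub-directories beneath root (root itself excluded)."""
    n_files = 0
    n_dirs = 0
    # os.walk does not descend into symlinked sub-directories by default
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_files, n_dirs


if __name__ == "__main__":
    import sys

    files, dirs = count_entries(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"{files} files, {dirs} directories")
```
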

joshmoore commented 12 months ago

Looks like this is now failing with:

      ERROR: Could not find a version that satisfies the requirement hatchling>=1.8.0 (from versions: 0.8.0, 0.8.1, 0.8.2, 0.9.0, 0.10.0, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.12.0, 0.13.0, 0.14.0, 0.15.0, 0.16.0, 0.17.0, 0.18.0, 0.19.0, 0.20.0, 0.20.1, 0.21.0, 0.21.1, 0.22.0, 0.23.0, 0.24.0, 0.25.0, 0.25.1)
      ERROR: No matching distribution found for hatchling>=1.8.0

"modern hatchling does not support python3.6" (https://stackoverflow.com/questions/74748995/pre-commit-hook-throws-error-on-hatchling-requirement)

will-moore commented 11 months ago

@joshmoore Josh - pushed a fix. Can you run the workflow for me? Thanks

joshmoore commented 11 months ago

Approved the workflow (but also migrated to the IDR org)

joshmoore commented 11 months ago

Still ERROR: No matching distribution found for hatchling>=1.8.0

will-moore commented 11 months ago

Ah - I see it's actually failing on the OMERO integration tests build, so my last commit has no effect. Do we need Python 3.8 or higher there?

joshmoore commented 11 months ago

I've reverted your change to the pre-commit config for Python 3.8. I'm going to go ahead and get this in so we can see the updated diffs in the other PRs. We can come back to test-infra: that's going to be an issue across many repos.