will-moore opened this issue 1 year ago
cc @dgault
The error is the same as in the idr0011 case; see details below.
The import of the METADATA.ome.xml file was successful, see https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-66557 (user-3)
Imported the idr0011 plate on pilot-idrtesting - all looks fine: thumbnails and full images are generated, no errors. OMEZarrReader 0.3.1 was used, see https://github.com/IDR/deployment/pull/380
Time to import that one plate was 5h 28min
This is a good candidate for conversion of a whole study.
68 plates - approx 500 GB total
Create a bash script with a bioformats2raw command for each Plate
Run on pilot-zarr2-dev
NB: Monitor timestamps - start/end. Can investigate memory usage afterwards to decide on future conversion strategies. See https://github.com/glencoesoftware/bioformats2raw#configuring-logging
Then upload to a new temp idr0012 bucket on EBI uk1 s3.
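The conversion plan above could be scripted roughly as follows (a sketch only; the bioformats2raw path and screens directory are taken from the command below, while the per-plate timestamp logging and the helper names are assumptions):

```python
# Sketch: convert each plate with bioformats2raw, recording start/end
# timestamps per plate so conversion times can be compared afterwards.
import subprocess
from datetime import datetime
from pathlib import Path

B2R = "/home/dlindner/bioformats2raw-0.7.0-SNAPSHOT/bin/bioformats2raw"
SCREENS = Path("/uod/idr/metadata/idr0012-fuchs-cellmorph/primary/raw/screens")

def b2r_command(src: Path, memo_dir: str = "../memo") -> list:
    """Build the bioformats2raw command for one plate; the output name drops
    the original extension, matching ${i%.*}.ome.zarr in the shell loop."""
    return [B2R, "--memo-directory", memo_dir, str(src), src.stem + ".ome.zarr"]

def convert_all():
    # Hypothetical driver: run plates sequentially, printing timestamps.
    for src in sorted(SCREENS.iterdir()):
        start = datetime.now()
        subprocess.run(b2r_command(src), check=True)
        print(f"{src.name}: started {start.isoformat()}, "
              f"finished {datetime.now().isoformat()}")
```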
Dom:
for i in `ls /uod/idr/metadata/idr0012-fuchs-cellmorph/primary/raw/screens/`; do /home/dlindner/bioformats2raw-0.7.0-SNAPSHOT/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0012-fuchs-cellmorph/primary/raw/screens/$i ${i%.*}.ome.zarr; done
"conversion of idr0012 is finished, the zarrs are on pilot-zarr1-dev in /data/idr0012"
Need to set up the minio client: see https://hackmd.io/Cy5RMIKST_uBXNUfQmeXsA#upload-to-uk1-s3
$ ssh pilot-zarr1-dev
$ wget https://dl.min.io/client/mc/release/linux-amd64/mc
$ ./mc config host add uk1s3 https://uk1s3.embassy.ebi.ac.uk
Enter Access Key: X8GE11ZKP71A8529XFAE
Enter Secret Key:
Added `uk1s3` successfully.
$ ./mc ls uk1s3/idr0012
[2023-04-06 14:29:41 UTC] 0B STANDARD test
$ cd /data
$ ls idr0012
HT01.ome.zarr HT06.ome.zarr...
$ /home/wmoore/mc cp -r idr0012/ uk1s3/idr0012/ngff
See e.g. the whole plate, which takes a long time to view but gets there in the end:
Following https://github.com/IDR/idr-metadata/issues/656 with idr0012...
We have already copied the metadata-only plates to /ngff/idr0012/:
$ ls -alh /ngff/idr0012/HT01.ome.zarr/A/4/0/0
total 4.0K
drwxrwxr-x. 2 omero-server wmoore 21 Apr 7 08:25 .
drwxrwxr-x. 6 omero-server wmoore 72 Apr 7 08:25 ..
-rw-rw-r--. 1 omero-server wmoore 327 Apr 6 16:43 .zarray
And the data is mounted at /idr0012/ with chunks:
$ ls -alh /idr0012/ngff/HT01.ome.zarr/A/4/0/0
total 13K
drwxr-xr-x. 2 root root 4.0K Apr 6 16:43 .
drwxr-xr-x. 2 root root 4.0K Apr 6 14:30 ..
drwxr-xr-x. 2 root root 4.0K Apr 6 14:30 0
-rw-r--r--. 1 root root 327 Apr 6 16:43 .zarray
We want to test import and viewing with ZarrReader fix at https://github.com/ome/ZarrReader/pull/53, get link from lastSuccessfulBuild...
$ sudo -u omero-server -s
$ wget https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-build/lastSuccessfulBuild/jdk=JDK8,label=testintegration/artifact/bio-formats-build/ZarrReader/target/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar
$ mv OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar OMEZarrReader.jar
$ rm /opt/omero/server/OMERO.server/lib/client/OMEZarrReader.jar
$ cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/client/
$ rm /opt/omero/server/OMERO.server/lib/server/OMEZarrReader.jar
$ cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/server/
$ sudo service omero-server restart
Import...
Created Screen: 3204 in webclient...
for dir in *; do
omero import -r 3204 --transfer=ln_s --depth=100 --name=${dir/.ome.zarr/} --skip=all $dir --file /tmp/$dir.log --errs /tmp/$dir.err;
done
Sample import times for first 3 plates:
4725 files uploaded, 1 fileset, 1 plate created, 672 images imported, 0 errors in 0:40:32.612
4725 files uploaded, 1 fileset, 1 plate created, 672 images imported, 0 errors in 0:39:48.270
4725 files uploaded, 1 fileset, 1 plate created, 672 images imported, 0 errors in 0:40:44.291
Updated symlinks for just the first Plate without waiting for all...
$ python idr-utils/scripts/managed_repo_symlinks.py Plate:10451 /idr0012/ngff/ --report
Fileset: 5287051 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-9/2023-05/11/10-30-57.116/
Render Image 14789072
fs_contents ['HT01.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-9/2023-05/11/10-30-57.116/HT01.ome.zarr to /idr0012/ngff/HT01.ome.zarr
Then viewed the Plate in webclient - and thumbnails were eventually generated correctly!
Once ALL 68 Plates were imported into the Screen, I ran the same command to update symlinks for ALL Plates:
$ python idr-utils/scripts/managed_repo_symlinks.py Screen:3204 /idr0012/ngff/ --report
...
Fileset: 5287118 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/13/07-26-04.503/
Render Image 14834092
fs_contents ['HT68.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/13/07-26-04.503/HT68.ome.zarr to /idr0012/ngff/HT68.ome.zarr
Updated just the first Plate - testing the unlinking of new Images from Fileset.
$ python swap_filesets.py Plate:4287 Plate:10451 > /tmp/swap_fileset_idr0012.psql
$ cat /tmp/swap_fileset_idr0012.psql
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-9/2023-05/11/10-30-57.116/HT01.ome.zarr/OME' where image in (select id from Image where fileset = 5287051);
Ran the psql:
idr=> UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-9/2023-05/11/10-30-57.116/HT01.ome.zarr/OME' where image in (select id from Image where fileset = 5287051);
UPDATE 672
Then deleted New Plate without deleting any Filesets:
$ omero delete Plate:10451 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:10451 ok
Steps: 6
Elapsed time: 7.702 secs.
Flags: []
Deleted objects
Channel:43999051-44001066
Image:14788401-14789072
LogicalChannel:16074751-16074753
Pixels:14788401-14789072
ChannelBinding:36853561-36854592,36855805-36856797
QuantumDef:13133206-13133549,13133954-13134284
RenderingDef:13133206-13133549,13133954-13134284
Thumbnail:14978256-14978597,14978936-14979266
Plate:10451
ScreenPlateLink:10751
Well:2311551-2311934
WellSample:9188851-9189522
So then ran the same command over the whole Screen...
$ python swap_filesets.py Screen:1202 Screen:3204 > /tmp/swap_fileset_idr0012.psql
Ended up stopping this early because tail -f /tmp/swap_fileset_idr0012.psql
was showing nothing (I realise this was because the output file hadn't been flushed or closed at all). Turns out that it had been working...
$ cat /tmp/swap_fileset_idr0012.psql
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/11-13-23.790/HT02.ome.zarr/OME' where image in (select id from Image where fileset = 5287052);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-05/11/11-55-03.800/HT03.ome.zarr/OME' where image in (select id from Image where fileset = 5287053);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-1/2023-05/11/12-37-41.346/HT04.ome.zarr/OME' where image in (select id from Image where fileset = 5287054);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-0/2023-05/11/13-21-16.369/HT05.ome.zarr/OME' where image in (select id from Image where fileset = 5287055);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-05/11/14-01-11.554/HT06.ome.zarr/OME' where image in (select id from Image where fileset = 5287056);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/11/14-38-59.564/HT07.ome.zarr/OME' where image in (select id from Image where fileset = 5287057);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-9/2023-05/11/15-17-26.806/HT08.ome.zarr/OME' where image in (select id from Image where fileset = 5287058);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/11/15-56-45.133/HT09.ome.zarr/OME' where image in (select id from Image where fileset = 5287059);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-05/11/16-34-34.252/HT10.ome.zarr/OME' where image in (select id from Image where fileset = 5287060);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/11/17-13-30.122/HT11.ome.zarr/OME' where image in (select id from Image where fileset = 5287061);
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-05/11/17-50-19.114/HT12.ome.zarr/OME' where image in (select id from Image where fileset = 5287062);
Then ran
$ PGPASSWORD=*** psql -U omero -d idr -h 192.168.10.102 -f /tmp/swap_fileset_idr0012.psql
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 671
UPDATE 672
UPDATE 599
All Plates are expected to have the same number of Images: 21 * 16 * 2 = 672.
This output suggests that for Plates HT10 and HT12 the Fileset swap was incomplete for some reason.
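A quick way to spot incomplete swaps is to compare each UPDATE count against the expected 672. This hypothetical helper assumes the psql output was saved to a string and that the statements ran in plate order:

```python
EXPECTED = 21 * 16 * 2  # columns * rows * fields per well = 672 images per plate

def incomplete_plates(psql_output: str, plate_names: list) -> list:
    """Pair each 'UPDATE n' line with its plate name and return
    any (plate, count) pairs where the count differs from EXPECTED."""
    counts = [int(line.split()[1]) for line in psql_output.splitlines()
              if line.startswith("UPDATE")]
    return [(name, n) for name, n in zip(plate_names, counts) if n != EXPECTED]
```

For example, feeding it the tail of the output above with the corresponding plate names would return the HT10 and HT12 entries with their short counts.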
The script was probably part-way through Plate HT13 when interrupted.
It would be nice to separate logging (print statements) from the generation of the SQL file... Will update the script.
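A minimal sketch of that separation (hypothetical; the real swap_filesets.py generates the statements): write the SQL to the file and flush after each statement so tail -f shows progress, while log messages go to stderr and never into the SQL:

```python
import sys

def write_sql(statements, sql_path):
    """Write one SQL statement per line, flushing so 'tail -f' sees
    partial output; progress logging goes to stderr instead of the file."""
    with open(sql_path, "w") as f:
        for i, stmt in enumerate(statements, start=1):
            f.write(stmt + "\n")
            f.flush()  # make the statement visible immediately
            print(f"{i}/{len(statements)} statements written", file=sys.stderr)
```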
Move NGFF Plates HT02 - HT12 to a temp Screen and delete:
$ omero delete Screen:3251 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Screen:3251 ok
Steps: 6
Elapsed time: 51.142 secs.
Flags: []
Deleted objects
Channel:44001067-44023239
Image:14789073-14796463
LogicalChannel:16074754-16074786
Pixels:14789073-14796463
ChannelBinding:36854593-36854625,36854794-36855804
QuantumDef:13133550-13133560,13133617-13133953
RenderingDef:13133550-13133560,13133617-13133953
Thumbnail:14978598-14978935
Plate:10452-10462
Screen:3251
ScreenPlateLink:10819-10829
Well:2311935-2316158
WellSample:9189523-9196913
Ran the script again - it failed to write to a file in the same dir.
python swap_filesets.py Screen:1202 Screen:3204 --report
Permission denied: 'fileset_swap_Screen:1202.sql'
Need to manually create the sql for HT14:
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/19-08-10.740/HT14.ome.zarr/OME' where image in (select id from Image where fileset = 5287064);
UPDATE 672
$ omero delete Plate:10464 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:10464 ok
Steps: 6
Elapsed time: 5.126 secs.
Flags: []
Deleted objects
Channel:44025256-44027271
Image:14797136-14797807
LogicalChannel:16074790-16074792
Pixels:14797136-14797807
ChannelBinding:36854629-36854631
QuantumDef:13133562
RenderingDef:13133562
Plate:10464
ScreenPlateLink:10764
Well:2316543-2316926
WellSample:9197586-9198257
$ python swap_filesets.py Screen:1202 Screen:3204 /tmp/idr0012_filesetswap.sql --report
$ PGPASSWORD=*** psql -U omero -d idr -h 192.168.10.102 -f /tmp/idr0012_filesetswap.sql
could not change directory to "/home/wmoore": Permission denied
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 671
UPDATE 670
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
UPDATE 672
Seems to be Plates HT47 and HT48 where the UPDATE count is too low.
$ omero delete Screen:3204 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Screen:3204 ok
Steps: 6
Elapsed time: 241.21 secs.
Flags: []
Deleted objects
Channel:44027272-44136126
Image:14797808-14834092
LogicalChannel:16074793-16074954
Pixels:14797808-14834092
ChannelBinding:36854632-36854793
QuantumDef:13133563-13133616
RenderingDef:13133563-13133616
Plate:10465-10518
Screen:3204
ScreenPlateLink:10765-10818
Well:2316927-2337662
WellSample:9198258-9234542
For the HT13 Plate above, some of the Images in the NGFF Plate have no Fileset, indicating that the script failed when updating the NGFF Plate (after updating the old Plate to use the NGFF Fileset).
So, we can manually run the psql to complete Fileset swap for old plate...
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-5/2023-05/11/18-29-55.318/HT13.ome.zarr/OME' where image in (select id from Image where fileset = 5287063);
UPDATE 1204
NB: the UPDATE count includes the NGFF Images that weren't processed (Fileset not yet set to None). Need to unset the Fileset from all Images in the Plate:
UPDATE image SET fileset = null where id in (select image from WellSample where well in (select id from Well where plate = 10463));
UPDATE 672
$ omero delete Plate:10463 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:10463 ok
Steps: 6
Elapsed time: 6.273 secs.
Flags: []
Deleted objects
Channel:44023240-44025255
Image:14796464-14797135
LogicalChannel:16074787-16074789
Pixels:14796464-14797135
ChannelBinding:36854626-36854628,36856990-36858000
QuantumDef:13133561,13134349-13134685
RenderingDef:13133561,13134349-13134685
Thumbnail:14979267-14979603
Plate:10463
Well:2316159-2316542
WellSample:9196914-9197585
Using the check_pixels.py script, identified various images in the HT12 Plate where pixels weren't loading (ResourceError).
These corresponded to the lower than expected number of Images updated by psql above: only 599 images (instead of 672) got updated.
Seems that some Images didn't have their Fileset updated in a previous step, so they weren't linked to the NGFF Fileset and weren't updated here. This might have been due to previous attempts to work on this data. All these images seemed to be linked to the same Fileset, so they were updated with...
UPDATE image SET fileset = 5287062 where id in (select id from Image where fileset = 5286862);
Then ran the above command again:
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-05/11/17-50-19.114/HT12.ome.zarr/OME' where image in (select id from Image where fileset = 5287062);
Validated with
$ python check_pixels.py Screen:1202 /tmp/check_pixels_idr0012.log
Check pixels took 44 hours to check 45692 images:
Start: 2023-05-16 16:39:22.139651
Checking Screen:1202
max_planes: 0
0/45692 Check Image:1811247 HT01 [Well A4, Field 1]
1/45692 Check Image:1811248 HT01 [Well A13, Field 1]
2/45692 Check Image:1811249 HT01 [Well A13, Field 2]
...
45688/45692 Check Image:3051564 HT10 [Well M22, Field 2]
45689/45692 Check Image:3051565 HT10 [Well M12, Field 1]
45690/45692 Check Image:3051566 HT10 [Well M12, Field 2]
45691/45692 Check Image:3051567 HT10 [Well A4, Field 2]
End: 2023-05-18 12:04:03.634375
$ cat /tmp/check_pixels_idr0012.log | grep Error
Error: Image:1811288 Z '1' greater than sizeZ '1'.
...
$ cat /tmp/check_pixels_idr0012.log | grep Error | wc
2436 19488 124236
All the C rows of most plates have that error: 21 Wells in row C (2 fields each) in 58 Plates, so 2436 / 2 / 21 = 58.
(No other Errors!)
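That arithmetic can be sanity-checked directly:

```python
# 2436 errors, 2 fields per well, 21 wells in row C of each plate
errors, fields_per_well, wells_per_row = 2436, 2, 21
plates_affected = errors // fields_per_well // wells_per_row
assert plates_affected == 58  # 58 of the 68 plates show the error
```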
Want to upload zip files to BioStudies, but we don't have the original data locally on zarr1-dev any more.
Let's try to create zip from the data mounted via goofys on idr0125-pilot...
$ zip -r HT01.ome.zarr.zip /idr0012/ngff/HT01.ome.zarr
Scanning files .
adding: idr0012/ngff/HT01.ome.zarr/ (stored 0%)
adding: idr0012/ngff/HT01.ome.zarr/.zattrs (deflated 92%)
adding: idr0012/ngff/HT01.ome.zarr/.zgroup (stored 0%)
adding: idr0012/ngff/HT01.ome.zarr/A/ (stored 0%)
...
EDIT: zip creation took > 45 minutes!
ls -lh
-rw-rw-r--. 1 wmoore wmoore 4.8G Jun 7 13:54 HT01.ome.zarr.zip
Uploaded 1 plate to BioStudies (wmoore account):
$ cd ~/.aspera/cli/bin/
$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/HT01.ome.zarr.zip bsaspera_w@hx-fasp-1.ebi.ac.uk:68/5450f4-d575-42df-8b5c-829b2b2e317a-a*****
HT01.ome.zarr.zip 100% 4896MB 123Mb/s 02:13
Completed: 5014454K bytes transferred in 134 seconds
(306272K bits/sec), in 1 file.
As discussed today at the IDR meeting: use the idr-ftp machine to generate zips for all plates and upload to BioStudies.
Use the /data mount on idr-ftp, which has 15 TB.
$ ssh idr-ftp.openmicroscopy.org
$ cd /data
$ sudo mkdir idr0012_zip_to_biostudies
$ cd idr0012_zip_to_biostudies/
# install goofys to mount s3 bucket...
$ sudo wget https://github.com/kahing/goofys/releases/latest/download/goofys
$ sudo chmod +x ./goofys
$ sudo mkdir ./s3idr0012 && sudo ./goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0012 ./s3idr0012
$ ls ./s3idr0012/ngff/
HT01.ome.zarr HT07.ome.zarr HT13.ome.zarr HT19.ome.zarr ...
$ df -h ./
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 15T 277G 15T 2% /data
$ sudo zip -r HT02.ome.zarr.zip ./s3idr0012/ngff/HT02.ome.zarr
Run zip for all the other plates in the Screen:
for i in $(seq -f "%02g" 3 68);
do sudo zip -r HT$i.ome.zarr.zip ./s3idr0012/ngff/HT$i.ome.zarr
done
Install Aspera...
$ sudo wget https://ak-delivery04-mul.dhe.ibm.com/sar/CMA/OSA/08q6g/0/ibm-aspera-cli-3.9.6.1467.159c5b1-linux-64-release.sh
$ sudo sh ./ibm-aspera-cli-3.9.6.1467.159c5b1-linux-64-release.sh
Installing IBM Aspera CLI
Installation into /root/.aspera/cli successful
Optional installation steps:
To include aspera in your PATH, run this command (or add it to .bash_profile):
export PATH=/root/.aspera/cli/bin:$PATH
To install the man page, run the following command:
export MANPATH=/root/.aspera/cli/share/man:$MANPATH
Uploaded 1 more plate to BioStudies (wmoore account). Working OK...
$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d HT02.ome.zarr.zip bsaspera_w@hx-fasp-1.ebi.ac.uk:68/5450f4-d575-42df-8b5c-829b2b2e317a-a*****
Completed: 4979572K bytes transferred in 85 seconds
(479052K bits/sec), in 1 file.
On idr-ftp.openmicroscopy.org I generated the missing HT01.ome.zarr.zip as above:
sudo zip -r HT01.ome.zarr.zip ./s3idr0012/ngff/HT01.ome.zarr
Uploading zips to BioStudies IDR account...
ssh idr-ftp.openmicroscopy.org
cd /data/idr0012_zip_to_biostudies/
mkdir idr0012
mv HT* idr0012
screen -S idr0012_ftp
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0012 bsaspera_w@hx-fasp-1.ebi.ac.uk:/5f/136e8d-xxxxxxxxxxxxx
Page available, but currently only 6 out of 68 plates are "viewable" https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD845.html
Working on idr0125-pilot, where we previously imported NGFF Plates and swapped Filesets...
Going to test the mkngff workflow with the 6 Plates available on BioStudies s3, following the https://github.com/joshmoore/omero-mkngff/issues/2 workflow...
With Fileset IDs in idr0012.csv:
idr0012/HT02.ome.zarr,S-BIAD845/00e3b790-29f4-4cfe-84b2-52ec1eae90d5,5287052
idr0012/HT29.ome.zarr,S-BIAD845/00ee63d5-5800-4ea6-aab2-8d23a9c352f6,5287079
idr0012/HT20.ome.zarr,S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2,5287070
idr0012/HT42.ome.zarr,S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f,5287092
idr0012/HT37.ome.zarr,S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9,5287087
idr0012/HT04.ome.zarr,S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff,5287054
NB: I had to manually update the Fileset IDs in csv above from idr0125 since the IDs from https://github.com/IDR/idr-utils/pull/56 are for vanilla IDR.
This ran in a few seconds...
for r in $(cat idr0012.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11 // 11-13-23.790 for fileset 5287052
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/11-13-23.790
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/11-13-23.790_converted/bia-integrator-data/S-BIAD845/00e3b790-29f4-4cfe-84b2-52ec1eae90d5
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/11-13-23.790_converted/bia-integrator-data/S-BIAD845/00e3b790-29f4-4cfe-84b2-52ec1eae90d5/00e3b790-29f4-4cfe-84b2-52ec1eae90d5.zarr -> /bia-integrator-data/S-BIAD845/00e3b790-29f4-4cfe-84b2-52ec1eae90d5/00e3b790-29f4-4cfe-84b2-52ec1eae90d5.zarr
...
for r in $(cat idr0012.csv); do
fsid=$(echo $r | cut -d',' -f3)
psql -U omero -d idr -h 192.168.10.102 -f "$fsid.sql"
done
However, this gave SQL errors as the SQL was invalid (no rows in array[]):
$ cat 5287052.sql
begin;
select mkngff_fileset(
5287052,
'4b358149-af39-49f0-882d-10884fab7133',
'cdf35825-def1-4580-8d0b-9c349b8f78d6',
'demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/11/11-13-23.790_converted/',
array[
]::text[][]
);
commit;
This is due to the bia paths having an extra ngff/HT02.ome.zarr added at the end. This was caused by the zipping from the s3-mounted directory above:
$ sudo zip -r HT02.ome.zarr.zip ./s3idr0012/ngff/HT02.ome.zarr
If we want to work with this data on BioStudies s3, we need to add /ngff/HT02.ome.zarr to the path...
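That path construction, mirroring the cut-based shell loop below, can be expressed as a hypothetical Python helper:

```python
def bia_zarr_path(csv_row: str) -> str:
    """Given a row like 'idr0012/HT02.ome.zarr,S-BIAD845/<uuid>,<fsid>',
    build the full path on the BioStudies s3 bucket, including the extra
    ngff/<plate> suffix introduced when the zips were created from the
    s3 mount."""
    idrpath, biapath, _fsid = csv_row.strip().split(",")
    uuid = biapath.split("/")[1]      # e.g. 00e3b790-29f4-...
    zarrname = idrpath.split("/")[1]  # e.g. HT02.ome.zarr
    return f"/bia-integrator-data/{biapath}/{uuid}.zarr/ngff/{zarrname}"
```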
Deleted some symlinks and tried again...
for r in $(cat idr0012.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
idrpath=$(echo $r | cut -d',' -f1)
zarrname=$(echo $idrpath | cut -d'/' -f2)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr/ngff/$zarrname" > "$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11 // 22-57-46.368 for fileset 5287070
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr -> /bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/12 // 12-54-57.325 for fileset 5287092
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/12/12-54-57.325
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/12/12-54-57.325_converted/bia-integrator-data/S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f/05d010ca-c155-479d-a3b3-51731e4aca3f.zarr/ngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-7/2023-05/12/12-54-57.325_converted/bia-integrator-data/S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f/05d010ca-c155-479d-a3b3-51731e4aca3f.zarr/ngff/HT42.ome.zarr -> /bia-integrator-data/S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f/05d010ca-c155-479d-a3b3-51731e4aca3f.zarr/ngff/HT42.ome.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/12 // 09-43-47.162 for fileset 5287087
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/12/09-43-47.162
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/12/09-43-47.162_converted/bia-integrator-data/S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9/09753a8e-259f-417d-aef1-bc1136349dc9.zarr/ngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-4/2023-05/12/09-43-47.162_converted/bia-integrator-data/S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9/09753a8e-259f-417d-aef1-bc1136349dc9.zarr/ngff/HT37.ome.zarr -> /bia-integrator-data/S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9/09753a8e-259f-417d-aef1-bc1136349dc9.zarr/ngff/HT37.ome.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-1/2023-05/11 // 12-37-41.346 for fileset 5287054
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-1/2023-05/11/12-37-41.346
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-1/2023-05/11/12-37-41.346_converted/bia-integrator-data/S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff/0add8964-a383-4c5f-a73a-f10dc02d49ff.zarr/ngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-1/2023-05/11/12-37-41.346_converted/bia-integrator-data/S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff/0add8964-a383-4c5f-a73a-f10dc02d49ff.zarr/ngff/HT04.ome.zarr -> /bia-integrator-data/S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff/0add8964-a383-4c5f-a73a-f10dc02d49ff.zarr/ngff/HT04.ome.zarr
for r in $(cat idr0012.csv); do
> fsid=$(echo $r | cut -d',' -f3)
> psql -U omero -d idr -h 192.168.10.102 -f "$fsid.sql"
> done
BEGIN
mkngff_fileset
----------------
5287442
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287443
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287444
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287445
(1 row)
COMMIT
Checked HT20 Plate:
Blitz log shows '.zgroup' expected but is not readable or missing in store. (see below), but don't know which .zgroup is causing this. They all seem to be there:
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at omero.cmd.CallContext.invoke(CallContext.java:85)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy123.load_async(Unknown Source)
at omero.api._RenderingEngineTie.load_async(_RenderingEngineTie.java:248)
at omero.api._RenderingEngineDisp.___load(_RenderingEngineDisp.java:1223)
at omero.api._RenderingEngineDisp.__dispatch(_RenderingEngineDisp.java:2405)
at IceInternal.Incoming.invoke(Incoming.java:221)
at Ice.ConnectionI.invokeAll(ConnectionI.java:2536)
at Ice.ConnectionI.dispatch(ConnectionI.java:1145)
at Ice.ConnectionI.message(ConnectionI.java:1056)
at IceInternal.ThreadPool.run(ThreadPool.java:395)
at IceInternal.ThreadPool.access$300(ThreadPool.java:12)
at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:832)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: '.zgroup' expected but is not readable or missing in store.
at com.bc.zarr.ZarrGroup.validateGroupToBeOpened(ZarrGroup.java:110)
at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:103)
at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:96)
at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:88)
at loci.formats.services.JZarrServiceImpl.getGroupAttr(JZarrServiceImpl.java:104)
at loci.formats.in.ZarrReader.initFile(ZarrReader.java:175)
at loci.formats.FormatReader.setId(FormatReader.java:1443)
at loci.formats.ImageReader.setId(ImageReader.java:849)
at ome.io.nio.PixelsService$3.setId(PixelsService.java:869)
at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:650)
at loci.formats.ChannelFiller.setId(ChannelFiller.java:234)
at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:650)
at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:293)
at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:650)
at loci.formats.Memoizer.setId(Memoizer.java:690)
at ome.io.bioformats.BfPixelsWrapper.<init>(BfPixelsWrapper.java:52)
at ome.io.bioformats.BfPixelBuffer.reader(BfPixelBuffer.java:73)
... 82 common frames omitted
2023-08-24 12:45:26,172 INFO [ org.perf4j.TimingLogger] (l.Server-2) start[1692881125796] time[376] tag[omero.call.exception]
2023-08-24 12:45:26,172 INFO [ ome.services.util.ServiceHandler] (l.Server-2) Excp: ome.conditions.ResourceError: Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr/OME/METADATA.ome.xml
E.g.
ls -alh /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr/OME/
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 .
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 ..
-rw-r--r--. 1 root root 473K Aug 24 09:20 METADATA.ome.xml
-rw-r--r--. 1 root root 6.4K Aug 24 09:20 .zattrs
-rw-r--r--. 1 root root 23 Aug 24 09:20 .zgroup
ls -alh /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr/
total 104K
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 .
drwxr-xr-x. 2 root root 4.0K May 29 11:11 ..
drwxr-xr-x. 2 root root 4.0K Aug 24 09:17 A
drwxr-xr-x. 2 root root 4.0K Aug 24 09:17 B
drwxr-xr-x. 2 root root 4.0K Aug 24 09:17 C
drwxr-xr-x. 2 root root 4.0K Aug 24 09:17 D
drwxr-xr-x. 2 root root 4.0K Aug 24 09:18 E
drwxr-xr-x. 2 root root 4.0K Aug 24 09:18 F
drwxr-xr-x. 2 root root 4.0K Aug 24 09:18 G
drwxr-xr-x. 2 root root 4.0K Aug 24 09:18 H
drwxr-xr-x. 2 root root 4.0K Aug 24 09:18 I
drwxr-xr-x. 2 root root 4.0K Aug 24 09:19 J
drwxr-xr-x. 2 root root 4.0K Aug 24 09:19 K
drwxr-xr-x. 2 root root 4.0K Aug 24 09:19 L
drwxr-xr-x. 2 root root 4.0K Aug 24 09:19 M
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 N
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 O
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 OME
drwxr-xr-x. 2 root root 4.0K Aug 24 09:20 P
-rw-r--r--. 1 root root 28K Aug 24 09:17 .zattrs
-rw-r--r--. 1 root root 23 Aug 24 09:17 .zgroup
Ahh - I wonder if this is caused by the path having two .zarrs in it:
/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr/
If so, and we want to fix that path, maybe we need to resubmit - unless EBI can do it for us?
/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted/bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr/
That's a deep file path. Have we established this is truly necessary, or could we truncate it up to the top-level .zarr directory? The regular import logic aims to build the shortest common denominator containing all the files (based on the Bio-Formats getRequiredDirectories API).
I don't think it's necessary to have such a long path. I could open a PR to try and shorten the path, although it will be on top of my other 2 PRs!
Installed the branch from that PR on idr0125-pilot...
sudo -u omero-server -s
conda activate mkngff
pip uninstall omero-mkngff
pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@shortern_paths_and_symlinks'
As above... with extra path within the zip...
for r in $(cat idr0012.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
idrpath=$(echo $r | cut -d',' -f1)
zarrname=$(echo $idrpath | cut -d'/' -f2)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr/ngff/$zarrname" > "$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11 // 22-57-46.368 for fileset 5287070
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_mkngff/HT20.ome.zarr -> /bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr
Ran the sql scripts...
$ for r in $(cat idr0012.csv); do
> fsid=$(echo $r | cut -d',' -f3)
> psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
> done
But this didn't seem to update Filesets on idr0125-pilot. May have out-of-date Fileset IDs.
Start again, setting these Fileset IDs from the webclient...
idr0012.csv:
idr0012/HT20.ome.zarr,S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2,5287442
idr0012/HT42.ome.zarr,S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f,5287443
idr0012/HT37.ome.zarr,S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9,5287444
idr0012/HT04.ome.zarr,S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff,5287445
for r in $(cat idr0012.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
idrpath=$(echo $r | cut -d',' -f1)
zarrname=$(echo $idrpath | cut -d'/' -f2)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr/ngff/$zarrname" > "$fsid.sql"
done
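To make the loop easier to follow, here is the cut-based field extraction run on the first idr0012.csv row shown above; the final echo is the path handed to omero mkngff sql, and it matches the symlink target in the log below:

```shell
# Walk through the field extraction for the first CSV row above
r="idr0012/HT20.ome.zarr,S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2,5287442"
biapath=$(echo $r | cut -d',' -f2)        # BIA accession/UUID pair
uuid=$(echo $biapath | cut -d'/' -f2)     # bare UUID
fsid=$(echo $r | cut -d',' -f3)           # Fileset ID: 5287442
idrpath=$(echo $r | cut -d',' -f1)        # idr0012/HT20.ome.zarr
zarrname=$(echo $idrpath | cut -d'/' -f2) # HT20.ome.zarr
# The path argument passed to "omero mkngff sql":
echo "/bia-integrator-data/$biapath/$uuid.zarr/ngff/$zarrname"
# -> /bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr
```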
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11 // 22-57-46.368_converted for fileset 5287442
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr -> /bia-integrator-data/S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2/05a03696-ccf5-4262-b5e3-092d9633d2e2.zarr/ngff/HT20.ome.zarr
...
for r in $(cat idr0012.csv); do
fsid=$(echo $r | cut -d',' -f3)
psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
done
BEGIN
mkngff_fileset
----------------
5811563
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5811564
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5811565
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5811566
(1 row)
COMMIT
Seems that $DBHOST got set wrong above, which is why some SQL updates had "no effect".
Start again... on idr0125-pilot...
Update all variables:
$ echo $DBHOST
192.168.10.102
$ echo $PGPASSWORD
2NGS5rrKNqsnR6sasDLa71CG+IpoBmIoQwEVXarc2cto
$ echo $SECRET
22c41bb8-36e5-4386-9825-179b180d8238
Check that we now get the correct Fileset ID from the Image ID on plate HT20:
$ psql -U omero -d idr -h $DBHOST -c "select fileset from image where id = 1824135"
fileset
---------
5287442
(1 row)
Unchanged idr0012.csv:
idr0012/HT20.ome.zarr,S-BIAD845/05a03696-ccf5-4262-b5e3-092d9633d2e2,5287442
idr0012/HT42.ome.zarr,S-BIAD845/05d010ca-c155-479d-a3b3-51731e4aca3f,5287443
idr0012/HT37.ome.zarr,S-BIAD845/09753a8e-259f-417d-aef1-bc1136349dc9,5287444
idr0012/HT04.ome.zarr,S-BIAD845/0add8964-a383-4c5f-a73a-f10dc02d49ff,5287445
Ran this without --symlink_repo /data/OMERO/ManagedRepository since the symlinks were created above:
for r in $(cat idr0012.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
idrpath=$(echo $r | cut -d',' -f1)
zarrname=$(echo $idrpath | cut -d'/' -f2)
omero mkngff sql --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr/ngff/$zarrname" > "$fsid.sql"
done
mkngff sql took about 6 minutes per Plate:
-rw-r--r--. 1 omero-server omero-server 1.2M Aug 29 11:15 5287442.sql
-rw-r--r--. 1 omero-server omero-server 1.2M Aug 29 11:21 5287443.sql
-rw-r--r--. 1 omero-server omero-server 1.2M Aug 29 11:27 5287444.sql
-rw-r--r--. 1 omero-server omero-server 1.2M Aug 29 11:32 5287445.sql
$ for r in $(cat idr0012.csv); do
> fsid=$(echo $r | cut -d',' -f3)
> psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
> done
BEGIN
mkngff_fileset
----------------
5287451
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287452
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287453
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287454
(1 row)
COMMIT
Check image above has new Fileset...
$ psql -U omero -d idr -h $DBHOST -c "select fileset from image where id = 1824135"
fileset
---------
5287451
(1 row)
This took the best part of an hour to allow Images to render, but looks good (no issues with Well ordering) on idr0125-pilot...
To see how long memo file regeneration took...
grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log
2023-08-29 12:21:51,993 DEBUG [ loci.formats.Memoizer] (l.Server-4) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr/OME/.METADATA.ome.xml.bfmemo (3838714 bytes)
2023-08-29 12:21:51,993 DEBUG [ loci.formats.Memoizer] (l.Server-4) start[1693309192879] time[2519114] tag[loci.formats.Memoizer.setId]
2023-08-29 12:21:51,995 INFO [ ome.io.nio.PixelsService] (l.Server-4) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_converted_mkngff/HT20.ome.zarr/OME/METADATA.ome.xml Series: 0
Time of 2519114 ms is 42 minutes.
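For reference, the time[...] value in the Memoizer log line is in milliseconds; a quick conversion:

```shell
# Convert the Memoizer time[...] value (milliseconds) to minutes
ms=2519114
awk -v ms="$ms" 'BEGIN { printf "%.1f minutes\n", ms / 60000 }'   # -> 42.0 minutes
```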
We want to recreate the zips as above (downloading from our own ebi s3 idr0012 bucket) but without the extra dirs introduced above.
Easiest is to download to idr-ftp first, then create the zips in the same directory.
screen -r idr0012_zip
cd /data/idr0012_zip_to_biostudies
for i in $(seq -f "%02g" 1 68); do
  sudo cp -r ./s3idr0012/ngff/HT$i.ome.zarr ./
done
for i in */; do zip -mr "${i%/}.zip" "$i"; done
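Two shell idioms used in the loops above, shown on toy values (1..3 instead of 1..68):

```shell
# Zero-padded plate numbers, as used for HT01..HT68
seq -f "HT%02g" 1 3    # -> HT01 HT02 HT03 (one per line)

# Strip the trailing slash from a directory name before naming the zip
i="HT01.ome.zarr/"
echo "${i%/}.zip"      # -> HT01.ome.zarr.zip
```

Note that zip -m moves the files into the archive, i.e. it deletes the source directory afterwards.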
...
Deleted all zips at https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0012
Started to upload replacements...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0012 bsaspera_w@hx-fasp-1.ebi.ac.uk:/5f/xxxxxxxxxx
Using latest updated omero-mkngff with the .zarray fix and default SECRETUUID...
On idr-testing... with idr0012.csv
(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do
> biapath=$(echo $r | cut -d',' -f2)
> uuid=$(echo $biapath | cut -d'/' -f2)
> fsid=$(echo $r | cut -d',' -f3)
> omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/16/21-48-10.531 for fileset: 19259
...
...all done!
mv idr0012.csv idr0012/
zip -r idr0012.zip idr0012
cd idr0012
for i in $(ls); do sed -i 's/SECRETUUID/fc5d3566-eea0-412c-849e-daa6d3c6bfcc/g' $i; done
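A quick way to confirm the placeholder substitution worked, using a throwaway file (the /tmp filename is hypothetical):

```shell
# Create a dummy SQL file containing the placeholder, substitute, and check
echo "SELECT 'SECRETUUID';" > /tmp/demo.sql
sed -i 's/SECRETUUID/fc5d3566-eea0-412c-849e-daa6d3c6bfcc/g' /tmp/demo.sql
grep -q SECRETUUID /tmp/demo.sql || echo "placeholder fully replaced"
```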
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr"
done
...
UPDATE 672
BEGIN
mkngff_fileset
----------------
6312207
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/17/01-03-42.769
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/17/01-03-42.769_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/17/01-03-42.769_mkngff/fa71a2b4-c90b-49fd-a18d-ce3afcb6927a.zarr -> /bia-integrator-data/S-BIAD845/fa71a2b4-c90b-49fd-a18d-ce3afcb6927a/fa71a2b4-c90b-49fd-a18d-ce3afcb6927a.zarr
Running on idr-testing as the wmoore user...
(venv3) [wmoore@test120-omeroreadwrite ~]$ for r in $(cat $IDRID.csv); do
> biapath=$(echo $r | cut -d',' -f2)
> uuid=$(echo $biapath | cut -d'/' -f2)
> fsid=$(echo $r | cut -d',' -f3)
> omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/16/21-48-10.531 for fileset: 19259
Had to remount goofys (didn't need server restart) for the last 5 filesets...
$ for r in $(cat $IDRID.csv); do biapath=$(echo $r | cut -d',' -f2); uuid=$(echo $biapath | cut -d'/' -f2); fsid=$(echo $r | cut -d',' -f3); omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"; done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/17/04-28-52.197 for fileset: 19279
idr0012-fuchs-cellmorph