Open will-moore opened 1 year ago
Issue with conversion:
(base) [dlindner@pilot-zarr2-dev idr0091]$ time /home/dlindner/bioformats2raw/bin/bioformats2raw --memo-directory ../memo /uod/idr/filesets/idr0091-julou-lacinduction/20200622-ftp/Julou_2020_lacInduction_RawImages/20170919/20170919_glyc_lac_1/20170919_glyc_lac_1_MMStack_metadata.txt 20170919_glyc_lac_1_MMStack.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp4590289654988610984/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2023-02-22 13:49:12,806 [main] ERROR loci.formats.Memoizer - deleting invalid memo file: ../memo/uod/idr/filesets/idr0091-julou-lacinduction/20200622-ftp/Julou_2020_lacInduction_RawImages/20170919/20170919_glyc_lac_1/.20170919_glyc_lac_1_MMStack_metadata.txt.bfmemo
java.lang.OutOfMemoryError: GC overhead limit exceeded
at ome.xml.model.Annotation.<init>(Annotation.java:123)
at ome.xml.model.TextAnnotation.<init>(TextAnnotation.java:91)
at ome.xml.model.XMLAnnotation.<init>(XMLAnnotation.java:97)
I even tried with export BF_MAX_MEM=56G, but watching the process, memory usage never got over 20G before crashing.
Pretty sure that BF_MAX_MEM is specific to the Bio-Formats command-line utilities and will not be recognized by bioformats2raw. Have you tried JAVA_OPTS="-Xmx<NN>G"?
👍 It finally worked with export JAVA_OPTS="-Xmx50G"!
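For reference, a minimal sketch of passing the heap size to bioformats2raw through JAVA_OPTS rather than BF_MAX_MEM. The helper name heap_opts is hypothetical and only builds the option string; the input and output paths in the commented invocation are placeholders.

```shell
# Hypothetical helper: build a JVM max-heap option for a size given in gigabytes.
heap_opts() {
  printf '%s' "-Xmx${1}G"
}

# Usage (not run here; input and output paths are placeholders):
#   JAVA_OPTS="$(heap_opts 50)" bioformats2raw input_metadata.txt output.ome.zarr
```

Setting the variable on the command line, as in the usage comment, keeps the large heap limit scoped to that one conversion instead of the whole shell session.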
50G definitely feels excessive. I recall some improvements were targeted at handling similar issues for large Micro-Manager metadata files in the past. One thing possibly worth testing independently is whether bioformats2raw 0.6.0 would handle the same data with lower memory requirements /cc @melissalinkert
Semi-related, I would expect this particular file format to work without issues with OMERO 5.6.6. What is our policy for these types of submissions of mixed file formats (probably only a handful of them)? Are we converting everything or only the minimal amount of data? /cc @jburel
Oh, I should test a different image then. Didn't notice that this submission had different file formats.
https://github.com/ome/bioformats/pull/3229 is the last time we addressed memory issues in Micro-Manager, so I'd be surprised if bioformats2raw 0.6.0 helps. Based on the partial stack trace, I'd guess it's original metadata annotations that are causing the problem.
Comparing memory usage for showinf -nopix -omexml /uod/idr/filesets/idr0091-julou-lacinduction/20200622-ftp/Julou_2020_lacInduction_RawImages/20170919/20170919_glyc_lac_1/20170919_glyc_lac_1_MMStack_metadata.txt
and showinf -nopix -omexml -no-sas /uod/idr/filesets/idr0091-julou-lacinduction/20200622-ftp/Julou_2020_lacInduction_RawImages/20170919/20170919_glyc_lac_1/20170919_glyc_lac_1_MMStack_metadata.txt
should confirm whether that is indeed the issue.
Also converted one of the pattern files, and re-imported. Worked fine. But the converted MMStack can't be re-imported, also memory issue:
2023-03-07 11:54:22,437 17151 [ main] ERROR ome.formats.importer.cli.ErrorHandler - FILE_EXCEPTION: /data/ngff/idr0091/20170920_glyc_lac_6h_1_MMStack.ome.zarr/OME/METADATA.ome.xml
java.lang.Exception: java.lang.OutOfMemoryError: GC overhead limit exceeded
@dominikl - are you able to try the -no-sas option suggested by @melissalinkert above and see if that affects memory usage?
@melissalinkert If that is the case, does it suggest a workaround for bioformats2raw or is a fix still a much bigger issue?
A possible option is to use omero-cli-zarr to export since it's only 342 Images (according to https://github.com/IDR/idr-utils/pull/56)
bioformats2raw does not have a direct equivalent to bfconvert's -no-sas. The closest workaround at the moment is bioformats2raw --no-ome-meta-export, which entirely prevents OME/METADATA.ome.xml from being written; that's likely not what you want. I'm not opposed to adding an equivalent to -no-sas in bioformats2raw, but would like to know if that actually would solve the problem first.
Going to start exporting with omero-cli-zarr since I can also do this on the idr-ftp machine which doesn't have the raw data mounted...
$ ssh -A idr-ftp.openmicroscopy.org
$ conda create -n omero_zarr_export -c ome python=3.9 zeroc-ice36-python
$ conda activate omero_zarr_export
$ conda install -c conda-forge omero-py
$ pip install git+https://github.com/will-moore/omero-cli-zarr.git@name_option
...
omero-cli-zarr-0.1.dev452+ge882a62
cd /data/ngff/
mkdir idr0091 && cd idr0091
Export 100 images
omero login
for id in 10648046 10648047 10648048 10648049 10648050 10648051 10648052 10648053 10648054 10648055 10648056 10648057 10648058 10648059 10648060 10648061 10648062 10648063 10648064 10648065 10648066 10648067 10648068 10648069 10648070 10648071 10648072 10648073 10648074 10648075 10648076 10648077 10648078 10648079 10648080 10648081 10648082 10648083 10648084 10648085 10648086 10648087 10648088 10648089 10648090 10648091 10648092 10648093 10648094 10648095 10648096 10648097 10648098 10648099 10648100 10648101 10648102 10648103 10648104 10648317 10648318 10648319 10648320 10648321 10648322 10648323 10648324 10648325 10648326 10648327 10648328 10648329 10648330 10648331 10648332 10648333 10648334 10648335 10648336 10648337 10648338 10648339 10648340 10648341 10648342 10648343 10648344 10648345 10648346 10648347 10648196 10648197 10648198 10648199 10648200 10648201 10648202 10648203 10648204 10648205; do
echo $id;
omero zarr export Image:$id --name_by name;
done
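A possible variant of the loop above that records which IDs failed, so they can be retried later. This is only a sketch: export_all and the EXPORT_CMD override are hypothetical names, and it only catches exports whose command exits with a non-zero status.

```shell
# Hypothetical wrapper: run the export once per image ID and record failures.
# EXPORT_CMD is an assumed override (useful for dry runs); it defaults to the
# command used in the loop above.
export_all() {
  failed_log=$1   # file that collects the IDs of failed exports
  shift
  for id in "$@"; do
    echo "Exporting Image:$id"
    # EXPORT_CMD is left unquoted on purpose so it splits into command and arguments
    if ! ${EXPORT_CMD:-omero zarr export} "Image:$id" --name_by name; then
      echo "$id" >> "$failed_log"
    fi
  done
}

# Usage (not run here): export_all failed_ids.txt 10648046 10648047
```

An export that is interrupted without an error status would still need a separate size or completeness check.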
After about 17 hours we have 50 images... (about 3 an hour):
(base) [wmoore@idrftp-ftp ~]$ ls -alh /data/ngff/idr0091
...
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 21:50 20151218_switch8h_pos2_GL02.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 22:16 20151218_switch8h_pos2_GL04.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 22:44 20151218_switch8h_pos2_GL05.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 23:07 20151218_switch8h_pos5_GL03.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 23:32 20151218_switch8h_pos5_GL05.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 11 23:52 20151218_switch8h_pos5_GL06.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 12 00:18 20151218_switch8h_pos5_GL08.pattern.ome.zarr
drwxrwxr-x. 3 wmoore wmoore 42 Jul 12 00:18 20151218_switch8h_pos5_GL09.pattern.ome.zarr
Moved 51 zarrs to batch1 and renamed each image.pattern.ome.zarr to image.ome.zarr
...
(base) [wmoore@idrftp-ftp batch1]$ for i in $(ls .); do mv $i `echo $i | sed 's/pattern.ome.zarr$/ome.zarr/'`; done
# zip, with -move
(base) [wmoore@idrftp-ftp batch1]$ for i in */; do zip -mr "${i%/}.zip" "$i"; done
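The rename loop above parses ls output and uses an unescaped "." in the sed pattern. A sketch of an alternative using shell parameter expansion, assuming every directory to rename ends in .pattern.ome.zarr; the function names strip_pattern and rename_all are hypothetical.

```shell
# Hypothetical helper: map NAME.pattern.ome.zarr to NAME.ome.zarr,
# leaving any other name unchanged.
strip_pattern() {
  case $1 in
    *.pattern.ome.zarr) printf '%s\n' "${1%.pattern.ome.zarr}.ome.zarr" ;;
    *) printf '%s\n' "$1" ;;
  esac
}

# Rename every *.pattern.ome.zarr entry directly inside directory $1.
rename_all() {
  for d in "$1"/*.pattern.ome.zarr; do
    [ -e "$d" ] || continue   # the glob matched nothing
    mv -- "$d" "$(dirname "$d")/$(strip_pattern "$(basename "$d")")"
  done
}

# Usage (not run here): rename_all /data/ngff/idr0091/batch1
```

Globbing instead of $(ls .) also keeps names with spaces intact, although none appear in this dataset.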
Created s3 bucket for testing...
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0091
make_bucket: idr0091
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0091 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0091 --cors-configuration file://cors.json
$ ./mc cp -r /data/ngff/idr0091/20151218_switch8h_pos6_GL01.pattern.ome.zarr uk1s3/idr0091/zarr
...pattern.ome.zarr/3/99/2/0/0: 574.64 MiB / 574.64 MiB ━━━━━━━━━━━━━━━━━━ 38.54 MiB/s 14s
Zipping of 51 images in batch1 above only took an hour.
Upload to BioStudies...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0091/batch1/idr0091 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx
...
20151218_switch8h_pos5_GL13.ome.zarr.zip 100% 433MB 487Mb/s 06:04
20151218_switch8h_pos5_GL14.ome.zarr.zip 100% 433MB 323Mb/s 06:12
Completed: 22054051K bytes transferred in 372 seconds
(484481K bits/sec), in 51 files, 1 directory.
# deleted
$ rm -rf batch1/
Other 49 images from batch 1 completed... Zipping..
Also starting to export ALL the remaining images...
for id in 10648206 10648207 10648208 10648209 10648210 10648211 10648212 10648213 10648214 10648215 10648216 10648217 10648218 10648219 10648220 10648221 10648222 10648223 10648224 10648225 10648226 10648227 10648228 10648229 10648230 10648231 10648232 10648233 10648234 10648235 10648236 10648237 10648238 10648239 10648240 10648241 10648242 10648243 10648244 10648245 10648246 10648247 10648248 10648249 10648250 10648251 10648252 10648253 10648254 10648255 10648256 10648257 10648258 10648259 10648260 10648261 10648262 10648263 10648264 10648265 10648266 10648267 10648268 10648269 10648270 10648271 10648272 10648273 10648274 10648275 10648276 10648277 10648278 10648279 10648280 10648281 10648282 10648283 10648284 10648285 10648286 10648287 10648288 10648289 10648290 10648291 10648292 10648293 10648294 10648295 10648296 10648297 10648298 10648299 10648300 10648301 10648302 10648303 10648304 10648305 10648306 10648307 10648348 10648349 10648350 10648351 10648352 10648353 10648354 10648355 10648356 10648357 10648358 10648359 10648360 10648361 10648362 10648363 10648364 10648365 10648366 10648367 10648368 10648369 10648370 10648371 10648372 10648373 10648374 10648375 10648376 10648377 10648378 10648379 10648380 10648381 10648382 10648383 10648384 10648385 10648386 10648387 10648388 10648389 10648390 10648391 10648392 10648393 10648394 10648395 10648396 10648397 10648398 10648399 10648400 10648401 10648402 10648403 10648404 10648405 10648406 10648407 10648408 10648409 10648410 10648411 10648412 10648413 10648414 10648699 10648700 10648701 10648702 10648703 10648704 10648705 10648706 10648707 10648708 10648709 10648710 10648711 10648712 10648713 10648714 10648715 10648716 10648717 10648718 10648719 10648720 10648721 10648722 10648723 10648724 10648725 10648726 10648727 10648728 10648729 10648730 10648731 10648732 10648733 10648734 10648735 10648736 10648737 10648738 10648739 10648740 10648741 10648742 10648743 10648744 10648745 10648746 10648747 10648748 10648749 10648750 
10648751 10648752 10648753 10648754 10648755 10648756 10648757 10648758 10648759 10648760 10648761 10648762 10648763 10648764 10648765 10648766 10648767 10648768 10648769 10648770 10648771; do
omero zarr export Image:$id --name_by name;
done
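The ID list above appears to consist of three contiguous ranges (10648206 to 10648307, 10648348 to 10648414 and 10648699 to 10648771). Assuming that is the case, the list could be generated rather than typed out; id_ranges is a hypothetical helper name.

```shell
# Hypothetical helper: print every integer in each FIRST LAST pair, one per line.
id_ranges() {
  while [ "$#" -ge 2 ]; do
    i=$1
    while [ "$i" -le "$2" ]; do
      echo "$i"
      i=$((i + 1))
    done
    shift 2
  done
}

# Usage (not run here):
#   for id in $(id_ranges 10648206 10648307 10648348 10648414 10648699 10648771); do
#     omero zarr export Image:$id --name_by name
#   done
```

Under that assumption the three ranges give 242 IDs, which matches the size of the second batch.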
Looks like the last 2 images here (batch1) didn't export properly - too small:
(base) [wmoore@idrftp-ftp idr0091]$ ls -alh
...
-rw-rw-r--. 1 wmoore wmoore 436M Jul 13 03:49 20160912_Pos0_GL11.pattern.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 437M Jul 13 03:49 20160912_Pos0_GL12.pattern.ome.zarr.zip
drwxrwxr-x. 3 wmoore wmoore 42 Jul 13 00:49 20160912_Pos0_GL14.pattern.ome.zarr
-rw-rw-r--. 1 wmoore wmoore 2.8M Jul 13 05:13 20160912_Pos0_GL14.pattern.ome.zarr.zip
drwxrwxr-x. 3 wmoore wmoore 42 Jul 13 00:49 20160912_Pos0_GL15.pattern.ome.zarr
-rw-rw-r--. 1 wmoore wmoore 427K Jul 13 05:14 20160912_Pos0_GL15.pattern.ome.zarr.zip
Deleted them.
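Spotting undersized zips by eye works for a short listing; for larger batches, a size threshold could flag candidates for review. A minimal sketch, assuming a threshold in bytes chosen by the user; small_zips is a hypothetical name.

```shell
# Hypothetical helper: list *.zip files directly inside directory $1
# that are smaller than $2 bytes.
small_zips() {
  find "$1" -maxdepth 1 -type f -name '*.zip' -size "-${2}c"
}

# Usage (not run here), flagging zips under 100 MB:
#   small_zips /data/ngff/idr0091 104857600
```

A small zip is only a hint: an image with few timepoints is legitimately smaller, so the output is meant for manual checking rather than automatic deletion.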
Rename 49 others (remove .pattern) and zip..
for i in $(ls .); do mv $i `echo $i | sed 's/pattern.ome.zarr$/ome.zarr/'`; done
(base) [wmoore@idrftp-ftp idr0091]$ ls
20151218_switch8h_pos6_GL01.ome.zarr 20160526_pos0_GL12.ome.zarr 20160526_pos0_GL26.ome.zarr 20160526_pos4_GL20.ome.zarr 20160912_Pos0_GL02.ome.zarr
20151218_switch8h_pos6_GL03.ome.zarr 20160526_pos0_GL13.ome.zarr 20160526_pos4_GL01.ome.zarr 20160526_pos4_GL21.ome.zarr 20160912_Pos0_GL03.ome.zarr
20151218_switch8h_pos6_GL04.ome.zarr 20160526_pos0_GL16.ome.zarr 20160526_pos4_GL03.ome.zarr 20160526_pos4_GL24.ome.zarr 20160912_Pos0_GL04.ome.zarr
20151218_switch8h_pos6_GL05.ome.zarr 20160526_pos0_GL17.ome.zarr 20160526_pos4_GL06.ome.zarr 20160526_pos4_GL25.ome.zarr 20160912_Pos0_GL05.ome.zarr
20151218_switch8h_pos6_GL06.ome.zarr 20160526_pos0_GL18.ome.zarr 20160526_pos4_GL09.ome.zarr 20160526_pos4_GL27.ome.zarr 20160912_Pos0_GL06.ome.zarr
20151218_switch8h_pos6_GL07.ome.zarr 20160526_pos0_GL19.ome.zarr 20160526_pos4_GL10.ome.zarr 20160526_pos5_GL03.ome.zarr 20160912_Pos0_GL07.ome.zarr
20151218_switch8h_pos6_GL09.ome.zarr 20160526_pos0_GL21.ome.zarr 20160526_pos4_GL11.ome.zarr 20160526_pos5_GL09.ome.zarr 20160912_Pos0_GL10.ome.zarr
20151218_switch8h_pos6_GL10.ome.zarr 20160526_pos0_GL22.ome.zarr 20160526_pos4_GL12.ome.zarr 20160526_pos5_GL12.ome.zarr 20160912_Pos0_GL11.ome.zarr
20160526_pos0_GL01.ome.zarr 20160526_pos0_GL23.ome.zarr 20160526_pos4_GL17.ome.zarr 20160526_pos5_GL13.ome.zarr 20160912_Pos0_GL12.ome.zarr
20160526_pos0_GL05.ome.zarr 20160526_pos0_GL24.ome.zarr 20160526_pos4_GL19.ome.zarr 20160912_Pos0_GL01.ome.zarr
Upload the 2nd lot of 49 images from batch1...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0091 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx
...
20160912_Pos0_GL03.ome.zarr.zip 100% 437MB 309Mb/s 07:53
20160912_Pos0_GL10.ome.zarr.zip 100% 436MB 171Mb/s 08:05
Completed: 19024387K bytes transferred in 485 seconds
(320849K bits/sec), in 49 files, 1 directory.
Current progress....
Exported 127 of 342 Images.
(342 - 127) / 3 = 72 hours.
First batch of 100 images (2 failed and need re-exporting).
2nd batch of 242 images is running on idr-ftp server, into:
(base) [wmoore@idrftp-ftp ngff]$ ls -alh /data/ngff/idr0091_batch2/
total 4.0K
drwxrwxr-x. 29 wmoore wmoore 4.0K Jul 13 09:10 .
drwxr-xr-x. 9 wmoore root 208 Jul 13 00:52 ..
drwxrwxr-x. 6 wmoore wmoore 100 Jul 13 01:11 20160912_Pos0_GL14.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 13 01:29 20160912_Pos0_GL15.pattern.ome.zarr
drwxrwxr-x. 6 wmoore wmoore 100 Jul 13 01:48 20160912_Pos0_GL16.pattern.ome.zarr
...
...and this should complete in 3 days.
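The estimate above is simply the remaining image count divided by the observed export rate. As a sketch, with the rate of about 3 images per hour taken from the first batch (eta_hours is a hypothetical name):

```shell
# Hypothetical helper: hours remaining, rounded up to a whole hour.
# $1: total images, $2: images already exported, $3: images exported per hour
eta_hours() {
  echo $(( ($1 - $2 + $3 - 1) / $3 ))
}

# Usage: eta_hours 342 127 3
```

With 342 images in total, 127 done and 3 per hour this gives 72 hours, i.e. the 3 days quoted above.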
Looks like all remaining zarrs exported OK...
$ ls /data/ngff/idr0091_batch2/ | wc
242 242 9327
Rename to remove .pattern and zip...
$ screen -r idr0091_zip
$ cd /data/ngff/idr0091_batch2/
$ for i in $(ls .); do mv $i `echo $i | sed 's/pattern.ome.zarr$/ome.zarr/'`; done
$ for i in */; do zip -mr "${i%/}.zip" "$i"; done
Started uploading 242 zips...
$ screen -r idr0091_aspera
$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0091_batch2/idr0091 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/****
Checked size of zips on BioStudies. 20160912_Pos4_GL06.ome.zarr.zip is smaller than others - as this is only single timepoint: https://idr.openmicroscopy.org/webclient/?show=image-10648217
Use JS to list files from submissions page:
let names = [];
[].forEach.call(document.querySelectorAll("div [role='row'] .ag-cell[col-id='name']"), function(div) {
names.push(div.innerHTML.trim());
});
console.log(names.map((name) => "idr0091/" + name).join("\n"));
console.log(names.length);
https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD852.html
Currently 11 out of 342 filesets "viewable"...
idr0091/20161212_Pos0_GL14.ome.zarr,S-BIAD852/0008e8fc-721f-4465-8ff2-bebcce8bca8a,4053448
idr0091/20161014_Pos3_GL05.ome.zarr,S-BIAD852/034b03ca-44b1-4751-9a25-2cafc666750f,4053364
idr0091/20151218_switch8h_pos5_GL12.ome.zarr,S-BIAD852/0a1ff011-a78f-4b11-b8f5-c24ffd0972f6,4053188
idr0091/20161014_Pos3_GL20.ome.zarr,S-BIAD852/1590ce8e-d80b-421d-a728-dc3701c17b93,4053368
idr0091/20160912_Pos8_GL14.ome.zarr,S-BIAD852/69c35c34-b0fb-44b5-8a9a-dbfc30434db3,4053332
idr0091/20160912_Pos0_GL25.ome.zarr,S-BIAD852/990a3f05-0b94-4b2e-add3-ce12c2cdc107,4053306
idr0091/20160912_Pos0_GL23.ome.zarr,S-BIAD852/aca1c44e-1ddc-4c27-9964-364fed5557ea,4053305
idr0091/20160912_Pos4_GL08.ome.zarr,S-BIAD852/b73dad00-b81e-4ac2-a6bf-9f88a6a68918,4053312
idr0091/20161207_Pos0_GL05.ome.zarr,S-BIAD852/bb423818-e307-43d4-8dff-52fe9c918f95,4053837
idr0091/20161212_Pos0_GL20.ome.zarr,S-BIAD852/d4df333b-225f-4e34-bcca-3645118177e2,4053452
idr0091/20161207_Pos1_GL06.ome.zarr,S-BIAD852/de82a935-3143-4ce3-9439-9ab986237b09,4053851
On idr0125-pilot...
...
BEGIN
mkngff_fileset
----------------
5287493
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287494
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287495
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287496
(1 row)
COMMIT
BEGIN
mkngff_fileset
----------------
5287497
(1 row)
COMMIT
Find and view updated image...
$ psql -U omero -d idr -h $DBHOST -c "select id from image where fileset = 5287497"
id
----------
10648757
(1 row)
Fileset file list in webclient looks good:
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/3/.zarray
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/3
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/2/.zarray
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/2
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/1/.zarray
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/1
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/0/.zarray
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/0
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/.zgroup
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/.zattrs
But viewing the image fails, with a reference to the previous Fileset .pattern file!:
message = Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837/20200817-pattern/20161207_Pos1_GL06.pattern
Looks like the pixels table hasn't been updated for this image:
idr=> select path, name from pixels where image = 10648757;
path | name
----------------------------------------------------------------------------------+----------------------------
demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837/20200817-pattern | 20161207_Pos1_GL06.pattern
The sql doesn't contain OME/METADATA.ome.xml...
4053851.sql
begin;
select mkngff_fileset(
4053851,
'22c41bb8-36e5-4386-9825-179b180d8238',
'cdf35825-def1-4580-8d0b-9c349b8f78d6',
'demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/',
array[
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '.zattrs', 'application/octet-stream'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '.zgroup', 'application/octet-stream'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '0', 'Directory'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/0/', '.zarray', 'application/octet-stream'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '1', 'Directory'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/1/', '.zarray', 'application/octet-stream'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '2', 'Directory'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/2/', '.zarray', 'application/octet-stream'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/', '3', 'Directory'],
['demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/3/', '.zarray', 'application/octet-stream']
]::text[][]
);
commit;
@joshmoore I see from https://github.com/IDR/omero-mkngff/blob/4c1e32bb32a7b92f427634630e6b552cbb186509/src/omero_mkngff/__init__.py#L108 that mkngff expects to find a METADATA.ome.xml with which to update the pixels table, but in the case of omero-cli-zarr-exported NGFF data, we don't have METADATA.ome.xml, so the pixels table won't get updated, leading to the errors above.
We'll need to pick another file to update the pixels table with.
I'll open an issue on the repo: https://github.com/IDR/omero-mkngff/issues/7
Running this sql fixes the image
UPDATE pixels SET name = '.zattrs', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr' where image in (select id from Image where fileset = 5287497);
Actually, it seems that Bio-Formats is not fussy about which file is referenced in the pixels table.
After this, the image is still viewable...
idr=> UPDATE pixels SET name = '.zarray', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-16/2020-10/03/18-15-40.837_mkngff/de82a935-3143-4ce3-9439-9ab986237b09.zarr/3' where image in (select id from Image where fileset = 5287497);
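Both UPDATE statements above have the same shape and differ only in the name and path they set. A small sketch that prints the statement for a given name, path and Fileset ID; pixels_update_sql is a hypothetical helper, and it performs no SQL quoting, so it assumes values without single quotes.

```shell
# Hypothetical helper: print the pixels UPDATE statement used above.
# $1: pixels.name (e.g. .zattrs), $2: pixels.path, $3: new Fileset ID
# No escaping is done, so the values must not contain single quotes.
pixels_update_sql() {
  printf "UPDATE pixels SET name = '%s', path = '%s' WHERE image IN (SELECT id FROM image WHERE fileset = %s);\n" "$1" "$2" "$3"
}

# Usage (not run here):
#   pixels_update_sql .zattrs "$zarr_path" 5287497 | psql -U omero -d idr -h $DBHOST
```

Printing the statement first makes it easy to review before piping it to psql.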
We now have all 342 Filesets available at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD852.html
Let's use the next batch (not the first 11 above) for testing https://github.com/IDR/omero-mkngff/pull/8
Testing on idr0138-pilot this time...
Update to branch
conda activate mkngff
pip uninstall omero-mkngff
pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@always_update_pixels'
idr0091/20160912_Pos8_GL21.ome.zarr,S-BIAD852/03e12e59-d0cd-456a-99fa-c55dba56b029,4053336
idr0091/20160526_pos5_GL03.ome.zarr,S-BIAD852/040d0262-cf47-4ddd-b5c7-cad13bf98ada,4053438
idr0091/20161130_switch_IPTG1uM_Pos0_GL06.ome.zarr,S-BIAD852/043c117e-1b42-4691-88e6-87f0bd67917d,4053797
idr0091/20161021_Pos5_GL04.ome.zarr,S-BIAD852/053569d5-6ca3-40ec-a1f0-ba163109cc0f,4053499
idr0091/20151218_switch8h_pos5_GL13.ome.zarr,S-BIAD852/057a0a1c-96d1-4cc5-8e4f-c63ce4961080,4053189
idr0091/20160526_pos4_GL21.ome.zarr,S-BIAD852/058b1fac-f751-48d1-8e54-65ce179e1bdb,4053434
idr0091/20161007_Pos0_GL05.ome.zarr,S-BIAD852/05eb785a-9989-4e93-a18d-adf6dd60615b,4053374
idr0091/20161212_Pos0_GL19.ome.zarr,S-BIAD852/07608c5c-ea6d-4e93-9443-efe56fc27ea0,4053451
idr0091/20151204_switch6h_pos0_GL10.ome.zarr,S-BIAD852/07cabbd4-5946-4cf7-ba0f-2b29b60f1184,4053146
idr0091/20160912_Pos4_GL12.ome.zarr,S-BIAD852/08f9303d-b58d-49f9-9655-b858d7218443,4053316
Took about 8 minutes to generate each sql file...
...
BEGIN
mkngff_fileset
----------------
5811622
(1 row)
COMMIT
UPDATE 0
BEGIN
mkngff_fileset
----------------
5811623
(1 row)
COMMIT
UPDATE 0
BEGIN
mkngff_fileset
----------------
5811624
(1 row)
COMMIT
UPDATE 0
BEGIN
mkngff_fileset
----------------
5811625
(1 row)
COMMIT
UPDATE 0
BEGIN
mkngff_fileset
----------------
5811626
(1 row)
COMMIT
UPDATE 0
Find image from last Fileset created and check pixels name, path...
idr=> select id from image where fileset =5811626;
id
----------
10648222
(1 row)
idr=> select name, path from pixels where image = 10648222;
name | path
----------------------------+---------------------------------------------------------------------------------
20160912_Pos4_GL12.pattern | demo_2/Blitz-0-Ice.ThreadPool.Server-2/2020-10/02/23-00-58.921/20200817-pattern
(1 row)
Realised that this didn't work because I used the OLD Fileset ID to update pixels after the new Fileset was created. Pushed a fix to https://github.com/IDR/omero-mkngff/pull/8/commits/231431164e3692298864fcc52fb3f0c663c8f595
Then re-installed...
Try with fresh filesets...
idr0091/20161014_Pos1_GL02.ome.zarr,S-BIAD852/09369079-50e6-486e-9e72-40e7a0eef8ec,4053346
idr0091/20151218_switch8h_pos5_GL12.ome.zarr,S-BIAD852/0a1ff011-a78f-4b11-b8f5-c24ffd0972f6,4053188
idr0091/20151204_switch6h_pos0_GL20.ome.zarr,S-BIAD852/0a812f66-99dd-4280-bb59-7d04f7e75b39,4053152
idr0091/20160912_Pos0_GL16.ome.zarr,S-BIAD852/0a858893-dcb1-40f7-ac4d-86cd80d1587d,4053302
After running sql commands, get Image IDs from Fileset IDs..
idr=> select id from image where fileset in (5811627, 5811628, 5811629, 5811630)
idr-> ;
id
----------
10648252
10648094
10648058
10648208
(4 rows)
Check pixels...
=> select path, name from pixels where image = 10648252;
path | name
------------------------------------------------------------------------------------------------------------------+---------
demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-10/03/02-02-34.667_mkngff/09369079-50e6-486e-9e72-40e7a0eef8ec.zarr | .zattrs
Image is directly viewable!
Going to generate mkngff sql on ALL Filesets on idr0125-pilot: https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD852.html
(Above tests were run on idr0138-pilot, so that DB doesn't have the original Fileset IDs now.)
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
done
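The loop above splits each CSV row with repeated echo | cut calls. A sketch of the same field extraction using IFS and read, which parses each row once; ngff_targets is a hypothetical name, and it only prints the Fileset ID and zarr path rather than calling omero mkngff.

```shell
# Hypothetical helper: for each row "name,S-BIADxxx/uuid,fileset_id" in CSV file $1,
# print "<fileset_id> /bia-integrator-data/<S-BIADxxx/uuid>/<uuid>.zarr".
# Rows must end with a newline, otherwise read skips the last one.
ngff_targets() {
  while IFS=, read -r name biapath fsid; do
    [ -n "$fsid" ] || continue      # skip blank or malformed rows
    uuid=${biapath##*/}             # text after the last "/"
    printf '%s /bia-integrator-data/%s/%s.zarr\n' "$fsid" "$biapath" "$uuid"
  done < "$1"
}

# Usage (not run here):
#   ngff_targets "$IDRID.csv" | while read -r fsid zarr; do
#     omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "$zarr" > "$IDRID/$fsid.sql"
#   done
```

The usage comment writes each sql file with > instead of >>, which is a deliberate difference from the loop above: it assumes a re-run should replace a previously generated file rather than append to it.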
NB: The first 10 sql files failed as they had already been run on idr0125-pilot above. Need to sort out...
... took 25 mins in total.
Also saw another random fail for
idr0091/20161207_Pos1_GL06.ome.zarr,S-BIAD852/de82a935-3143-4ce3-9439-9ab986237b09,4053851
just caught this...
ERROR: duplicate key value violates unique constraint "originalfile_repo_path_index"
DETAIL: Key (repo, regexp_split_to_array((('/'::text || path) || name) || '/'::text, '/+'::text))=(cdf35825-def1-4580-8d0b-9c349b8f78d6, {"",demo_2,Blitz-0-Ice.ThreadPool.Server-16,2020-10,03,18-15-40.837_mkngff,de82a935-3143-4ce3-9439-9ab986237b09.zarr,.zattrs,""}) already exists.
CONTEXT: SQL statement "insert into originalfile
(id, permissions, creation_id, group_id, owner_id, update_id, mimetype, repo, path, name)
values (nextval('seq_originalfile'), old_perms, new_event, old_group, old_owner, new_event,
info[i][3], repo, info[i][1], uuid || info[i][2])
returning id"
PL/pgSQL function mkngff_fileset(bigint,character varying,character varying,character varying,text[]) line 42 at SQL statement
ROLLBACK
Re-exporting on idr-ftp with pixels type fix as at https://github.com/ome/omero-cli-zarr/pull/157 with merge branch
pip install 'omero-cli-zarr @ git+https://github.com/will-moore/omero-cli-zarr@merge_prs'
omero login
for id in 10648047 10648048 10648049 10648050 10648051 10648052 10648053 10648054 10648055 10648056 10648057 10648058 10648059 10648060 10648061 10648062 10648063 10648064 10648065 10648066 10648067 10648068 10648069 10648070 10648071 10648072 10648073 10648074 10648075 10648076 10648077 10648078 10648079 10648080 10648081 10648082 10648083 10648084 10648085 10648086 10648087 10648088 10648089 10648090 10648091 10648092 10648093 10648094 10648095 10648096 10648097 10648098 10648099 10648100 10648101 10648102 10648103 10648104 10648317 10648318 10648319 10648320 10648321 10648322 10648323 10648324 10648325 10648326 10648327 10648328 10648329 10648330 10648331 10648332 10648333 10648334 10648335 10648336 10648337 10648338 10648339 10648340 10648341 10648342 10648343 10648344 10648345 10648346 10648347 10648196 10648197 10648198 10648199 10648200 10648201 10648202 10648203 10648204 10648205; do
echo $id;
omero zarr export Image:$id --name_by name;
done
Also exported "batch2" as above...
Renamed ALL 342 filesets to remove pattern
for i in $(ls .); do mv $i `echo $i | sed 's/pattern.ome.zarr$/ome.zarr/'`; done
Zip - not deleting...
$ for i in */; do zip -r "${i%/}.zip" "$i"; done
on idr-testing... (goofys is at /usr/bin/goofys)...
sudo mkdir /idr0091 && sudo /usr/bin/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0091 /idr0091
(base) [wmoore@test120-omeroreadwrite ~]$ ls /idr0091/zarr/
20151218_switch8h_pos6_GL01.pattern.ome.zarr
On idr-ftp, delete the existing (invalid) data and upload all images...
./mc rm --recursive uk1s3/idr0091/zarr/20151218_switch8h_pos6_GL01.pattern.ome.zarr
./mc cp -r /data/ngff/idr0091/idr0091/ uk1s3/idr0091/zarr
..._Pos1_GL26.ome.zarr/3/99/2/0/0: 96.73 GiB / 96.73 GiB ━━━━━━━━━
idr-testing...
(base) [wmoore@test120-omeroreadwrite ~]$ ls /idr0091/zarr/ | wc
342 342 10722
E.g. looks good: https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0091/zarr/20160526_pos0_GL01.ome.zarr
On idr-testing, let's try to update symlink to fix dtype issues...
Test with Image: 20151204_switch6h_pos0_GL01.pattern, ID: 10648046...
Existing failure:
$ python check_pixels.py --max-planes=sizeC Image:10648046
Start: 2024-01-04 22:12:48.846978
Checking Image:10648046
max_planes: sizeC
max_images: 0
0/1 Check Image:10648046 20151204_switch6h_pos0_GL01.pattern
ERROR:omero.gateway:Failed to getPlane() or getTile() from rawPixelsStore
Traceback (most recent call last):
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/gateway/__init__.py", line 7542, in getTiles
convertedPlane = unpack(convertType, rawPlane)
struct.error: unpack requires a buffer of 174528 bytes
That Image has symlink like this:
(venv3) (base) [wmoore@test120-omeroreadwrite scripts]$ ls -alh !$
ls -alh /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-41-08.138_mkngff/
total 8.0K
drwxr-sr-x. 2 omero-server omero-server 126 Nov 1 15:40 .
drwxrwsr-x. 22 omero-server omero-server 4.0K Oct 11 09:49 ..
lrwxrwxrwx. 1 omero-server omero-server 109 Oct 11 09:41 971f2809-c748-4259-8044-81ba6c774fdd.zarr -> /bia-integrator-data/S-BIAD852/971f2809-c748-4259-8044-81ba6c774fdd/971f2809-c748-4259-8044-81ba6c774fdd.zarr
-rw-r--r--. 1 omero-server omero-server 25 Nov 1 15:40 971f2809-c748-4259-8044-81ba6c774fdd.zarr.bfoptions
As omero-server...
rm /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-41-08.138_mkngff/971f2809-c748-4259-8044-81ba6c774fdd.zarr
$ ln -s /idr0091/zarr/20151204_switch6h_pos0_GL01.ome.zarr /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-41-08.138_mkngff/971f2809-c748-4259-8044-81ba6c774fdd.zarr
Symlink looks good:
$ ls -alh /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-41-08.138_mkngff/
total 8.0K
drwxr-sr-x. 2 omero-server omero-server 126 Jan 4 22:32 .
drwxrwsr-x. 22 omero-server omero-server 4.0K Oct 11 09:49 ..
lrwxrwxrwx. 1 omero-server omero-server 50 Jan 4 22:32 971f2809-c748-4259-8044-81ba6c774fdd.zarr -> /idr0091/zarr/20151204_switch6h_pos0_GL01.ome.zarr
-rw-r--r--. 1 omero-server omero-server 25 Nov 1 15:40 971f2809-c748-4259-8044-81ba6c774fdd.zarr.bfoptions
Fixed!
$ python check_pixels.py --max-planes=sizeC Image:10648046
Start: 2024-01-04 22:37:35.901700
Checking Image:10648046
max_planes: sizeC
max_images: 0
0/1 Check Image:10648046 20151204_switch6h_pos0_GL01.pattern
End: 2024-01-04 22:38:05.999497
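The rm followed by ln -s above can be combined into one step that also refuses to relink when the new target is missing. A minimal sketch, assuming an ln that supports -n (GNU and BSD both do); relink is a hypothetical name.

```shell
# Hypothetical helper: point the symlink $2 at the new target $1,
# but only if the target exists.
relink() {
  if [ ! -e "$1" ]; then
    echo "Target not found: $1" >&2
    return 1
  fi
  # -f replaces the existing link; -n stops ln from following it into the old target
  ln -sfn "$1" "$2"
}

# Usage (not run here; link path shortened to a placeholder):
#   relink /idr0091/zarr/20151204_switch6h_pos0_GL01.ome.zarr "$fileset_dir/971f2809-c748-4259-8044-81ba6c774fdd.zarr"
```

Checking the target first avoids leaving a dangling symlink if the goofys mount is not available.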
We can actually use https://github.com/IDR/idr-utils/pull/54 script to do this, if we provide mapping.csv
Test with a single Image on idr-testing...
ID: 10648047, Name: 20151204_switch6h_pos0_GL02.pattern
mapping.csv (existing symlink -> new target)
f12bdada-57eb-4fab-90ef-9655e4106497.zarr,20151204_switch6h_pos0_GL02.ome.zarr
As omero-server...
$ echo f12bdada-57eb-4fab-90ef-9655e4106497.zarr,20151204_switch6h_pos0_GL02.ome.zarr > idr0091_symlinks.csv
login as public user, then..
$ python /uod/idr/metadata/idr-utils/scripts/managed_repo_symlinks.py Image:10648047 /idr0091/zarr/ --repo /data/OMERO/ManagedRepository --fileset-mappings idr0091_symlinks.csv --report
fileset_dirs {'f12bdada-57eb-4fab-90ef-9655e4106497.zarr': '20151204_switch6h_pos0_GL02.ome.zarr'}
Fileset: 6314412 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-46-33.031_mkngff/
Render Image 10648047
fs_contents ['f12bdada-57eb-4fab-90ef-9655e4106497.zarr', 'f12bdada-57eb-4fab-90ef-9655e4106497.zarr.bfoptions']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-20/2020-10/02/14-46-33.031_mkngff/f12bdada-57eb-4fab-90ef-9655e4106497.zarr to /idr0091/zarr/20151204_switch6h_pos0_GL02.ome.zarr
Symlink target not found: /idr0091/zarr/f12bdada-57eb-4fab-90ef-9655e4106497.zarr.bfoptions
Success!
$ python scripts/check_pixels.py Image:10648047 --max-planes=sizeC
Start: 2024-01-05 10:02:34.840789
Checking Image:10648047
max_planes: sizeC
max_images: 0
0/1 Check Image:10648047 20151204_switch6h_pos0_GL02.pattern
End: 2024-01-05 10:02:51.334694
On idr-testing, make idr0091_temp.csv, which is idr0091.csv but modified to remove idr0091/ and S-BIAD on each row:
20161212_Pos0_GL14.ome.zarr,0008e8fc-721f-4465-8ff2-bebcce8bca8a,4053448
20161212_Pos1_GL19.ome.zarr,0044dd95-07e1-4937-938b-dde53ebbb719,4053473
20161007_Pos0_GL01.ome.zarr,00602c54-e3bd-406c-83fd-a802b58182b0,4053371
...
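The trimming step could be scripted rather than done by hand. A minimal sketch, assuming each row of idr0091.csv looks like `idr0091/<name>.ome.zarr,S-BIAD852/<uuid>,<fileset id>` (the two sample rows below are reconstructed from the trimmed rows above, so the exact prefixes are an assumption):

```shell
# Sketch only: sample rows are reconstructed from the trimmed rows shown above,
# assuming the original prefixes were "idr0091/" and "S-BIAD852/".
workdir=$(mktemp -d)
cat > "$workdir/idr0091.csv" <<'EOF'
idr0091/20161212_Pos0_GL14.ome.zarr,S-BIAD852/0008e8fc-721f-4465-8ff2-bebcce8bca8a,4053448
idr0091/20161212_Pos1_GL19.ome.zarr,S-BIAD852/0044dd95-07e1-4937-938b-dde53ebbb719,4053473
EOF

# Strip the leading "idr0091/" and the "S-BIAD852/" prefix of the second column.
sed -e 's#^idr0091/##' -e 's#,S-BIAD852/#,#' \
    "$workdir/idr0091.csv" > "$workdir/idr0091_temp.csv"

cat "$workdir/idr0091_temp.csv"
```

Using `#` as the sed delimiter avoids having to escape the slashes in the prefixes.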
From that, we can make the symlinks mapping file as above:
for r in $(cat idr0091_temp.csv); do
name=$(echo $r | cut -d',' -f1)
uuid=$(echo $r | cut -d',' -f2)
echo "$uuid.zarr,$name" >> idr0091_symlinks.csv
done
Now we run managed_repo_symlinks for each Image...
for r in $(cat idr0091_imageids.csv); do
python /uod/idr/metadata/idr-utils/scripts/managed_repo_symlinks.py Image:$r /idr0091/zarr/ --repo /data/OMERO/ManagedRepository --fileset-mappings idr0091_symlinks.csv --report
done
...
Fileset: 6314331 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-21/2020-10/02/17-11-44.019_mkngff/
Render Image 10648074
fs_contents ['b80ac5e8-ff4d-4235-aaab-4adfeec0db48.zarr', 'b80ac5e8-ff4d-4235-aaab-4adfeec0db48.zarr.bfoptions']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-21/2020-10/02/17-11-44.019_mkngff/b80ac5e8-ff4d-4235-aaab-4adfeec0db48.zarr to /idr0091/zarr/20151204_switch6h_pos5_GL12.ome.zarr
Symlink target not found: /idr0091/zarr/b80ac5e8-ff4d-4235-aaab-4adfeec0db48.zarr.bfoptions
...
EDIT... took about 15 mins to do 342 images...
...
Fileset: 6314271 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-5/2020-10/03/18-48-59.765_mkngff/
Render Image 10648771
fs_contents ['882f80fa-f40f-455b-b923-09dce086675b.zarr', '882f80fa-f40f-455b-b923-09dce086675b.zarr.bfoptions']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-5/2020-10/03/18-48-59.765_mkngff/882f80fa-f40f-455b-b923-09dce086675b.zarr to /idr0091/zarr/20161207_Pos1_GL26.ome.zarr
Symlink target not found: /idr0091/zarr/882f80fa-f40f-455b-b923-09dce086675b.zarr.bfoptions
python /uod/idr/metadata/idr-utils/scripts/check_pixels.py Project:1351 --max-planes=sizeC > /tmp/check_pixels_20240105_idr0091.log
All good 👍
(base) [wmoore@test120-omeroreadwrite ~]$ grep pattern /tmp/check_pixels_20240105_idr0091.log | wc
342 1368 20719
(base) [wmoore@test120-omeroreadwrite ~]$ grep Error /tmp/check_pixels_20240105_idr0091.log | wc
0 0 0
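The two grep checks above could be folded into a single pass/fail test. A rough sketch, run here against a tiny stand-in log (the log content mimics the check_pixels.py output shown earlier; `expected` would be 342 for the full Project):

```shell
# Sketch only: a stand-in log with two "Check Image" lines in the format
# printed by check_pixels.py above. Not the real /tmp log.
log=$(mktemp)
cat > "$log" <<'EOF'
Start: 2024-01-05 10:02:34.840789
0/2 Check Image:10648046 20151204_switch6h_pos0_GL01.pattern
1/2 Check Image:10648047 20151204_switch6h_pos0_GL02.pattern
End: 2024-01-05 10:02:51.334694
EOF

expected=2   # would be 342 for the whole of idr0091

# grep -c prints the number of matching lines; it exits non-zero when the
# count is 0, hence the "|| true" so an error-free log is not treated as failure.
checked=$(grep -c 'pattern' "$log" || true)
errors=$(grep -c 'Error' "$log" || true)

if [ "$checked" -eq "$expected" ] && [ "$errors" -eq 0 ]; then
    result="OK"
else
    result="FAILED"
fi
echo "$result: $checked images checked, $errors errors"
```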
On idr-ftp, the zips created on 18th Dec (above) have been uploaded (not sure of exact date), following deletion of the old idr0091 folder on 16th Jan:
from history...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0091/idr0091 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-e...
Images updated on https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD852.html
New idr0090.csv file at https://github.com/IDR/mkngff_upgrade_scripts/commit/0522d43a91c9ad506d3526014f46b8015bbd3a1a and https://github.com/IDR/mkngff_upgrade_scripts/commit/c92c21788a49f37fc7212e84c1d08e398b4a3dff based on csv provided by Kola.
Running mkngff on idr-next (since this has the NGFF filesets that we wish to replace), using --fs_suffix=None so we don't add an extra _mkngff to Fileset paths.
(venv3) [wmoore@prod120-omeroreadwrite ~]$ git clone https://github.com/IDR/mkngff_upgrade_scripts.git
(venv3) [wmoore@prod120-omeroreadwrite ~]$ cd mkngff_upgrade_scripts/ngff_filesets/
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
omero mkngff sql $fsid --fs_suffix=None --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
EDIT: something went wrong as all the .sql files are empty!
Fixed the idr0091.csv (missing S-BIAD852/ from each row). Running again...
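A quick check for empty output files after the loop would catch this kind of failure early. A minimal sketch, using a throwaway directory in place of the real `$IDRID/` output folder (the two file names and their contents are made up for illustration):

```shell
# Sketch only: "sqldir" stands in for the $IDRID/ output directory, holding
# one empty and one non-empty .sql file with made-up names.
sqldir=$(mktemp -d)
: > "$sqldir/6314392.sql"
echo "BEGIN;" > "$sqldir/6314232.sql"

# "-size 0" matches only zero-byte files; count how many there are.
empty=$(find "$sqldir" -name '*.sql' -size 0 | wc -l | tr -d ' ')
echo "$empty empty .sql file(s) in $sqldir"
```

Any non-zero count suggests the CSV rows did not match what mkngff expected, as happened here.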
Pushed at https://github.com/IDR/mkngff_upgrade_scripts/commit/03b02e72b31e4173803f645c6a226cb97fe13cda
Won't test these yet as idr-testing is being used for microservices testing.
On new pilot https://github.com/IDR/idr-metadata/issues/675#issuecomment-2050137532
Ran all the mkngff SQL scripts... ending for idr0091 with...
...
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2024-02/28/17-06-10.963
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2024-02/28/17-06-10.963_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2024-02/28/17-06-10.963_mkngff/fdfdbb32-c1c2-4eec-8bbd-ffc3b729958b.zarr -> /bia-integrator-data/S-BIAD852/fdfdbb32-c1c2-4eec-8bbd-ffc3b729958b/fdfdbb32-c1c2-4eec-8bbd-ffc3b729958b.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2024-02/28/17-06-10.963
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2024-02/28/17-06-10.963_mkngff/fdfdbb32-c1c2-4eec-8bbd-ffc3b729958b.zarr.bfoptions
UPDATE 1
BEGIN
mkngff_fileset
----------------
6319888
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2024-02/28/16-45-47.233
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2024-02/28/16-45-47.233_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2024-02/28/16-45-47.233_mkngff/fe65c558-7099-48c4-8222-a5dc54da884a.zarr -> /bia-integrator-data/S-BIAD852/fe65c558-7099-48c4-8222-a5dc54da884a/fe65c558-7099-48c4-8222-a5dc54da884a.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2024-02/28/16-45-47.233
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-11/2024-02/28/16-45-47.233_mkngff/fe65c558-7099-48c4-8222-a5dc54da884a.zarr.bfoptions
UPDATE 1
BEGIN
mkngff_fileset
----------------
6319889
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2024-02/28/17-13-27.461
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2024-02/28/17-13-27.461_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2024-02/28/17-13-27.461_mkngff/fe795db1-82c3-42b0-bbf8-5c4230bebdc9.zarr -> /bia-integrator-data/S-BIAD852/fe795db1-82c3-42b0-bbf8-5c4230bebdc9/fe795db1-82c3-42b0-bbf8-5c4230bebdc9.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2024-02/28/17-13-27.461
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2024-02/28/17-13-27.461_mkngff/fe795db1-82c3-42b0-bbf8-5c4230bebdc9.zarr.bfoptions
Last row in idr0091.csv at https://github.com/IDR/mkngff_upgrade_scripts/blob/1b64ab85fab537faafd62d6e19c01cf5ab32d11f/ngff_filesets/idr0091.csv is idr0091/20161212_Pos1_GL04.ome.zarr,S-BIAD852/fe795db1-82c3-42b0-bbf8-5c4230bebdc9,6314392
This image is http://localhost:1080/webclient/?show=image-10648367 and the Fileset ID is 4053461.
So, the idr0091.csv above is out of date, and was missed from the update at https://github.com/IDR/mkngff_upgrade_scripts/commit/03b02e72b31e4173803f645c6a226cb97fe13cda
Try to clean up (delete) the 342 Filesets we created above - last one ID 6319889.
First one ID = 6319548?
idr=> select id from Image where fileset=6319548;
15150680
(1 row)
http://localhost:1080/webclient/?show=image-15150680 in webclient on pilot-idrngff is a tiff image but has wrong Fileset with 44e015db3952.zarr which corresponds to the first row of idr0090.csv.
For all Filesets 6319548 -> 6319889 we want to:
For Last Image/Fileset...
idr=> select child from FilesetAnnotationLink where parent=6319889;
child
----------
38302449
idr=> select longvalue from Annotation where id=38302449;
longvalue
-----------
6314392
This corresponds to the Fileset IDs updated in https://github.com/IDR/mkngff_upgrade_scripts/commit/25c5372c52a250f8565cabd0f904f02f8d56e741
So, NEW Fileset IDs are 6319548 -> 6319889
OLD Fileset IDs are in idr0091.csv before that commit.
First row...
6314330 (from old idr0091.csv), New Fileset ID: 6319548 (to be deleted), Image: 15150680
update image set fileset = 6314330 where fileset = 6319548;
for i in {6319548..6319889}; do echo $i >> idr0091_ids.csv; done
idr0091_ids.csv (removed first line 6319548,6314330 - already done update above).
NEW Fileset ID, OLD Fileset ID
6319549,6314371
6319550,6314286
6319551,6314139
...
6319887,6314352
6319888,6314232
6319889,6314392
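The NEW,OLD pairs above look like they follow the row order of the old idr0091.csv (first row 6314330 pairs with 6319548, last row 6314392 pairs with 6319889). Assuming that order holds for every row, the two-column file could be built roughly like this. The sample CSV uses the four old Fileset IDs listed above with made-up names and UUIDs, and `idr0091_old.csv` is a hypothetical file name:

```shell
# Sketch only: names/UUIDs are placeholders; the third column holds the old
# Fileset IDs in the order implied by the pairs listed above.
workdir=$(mktemp -d)
cat > "$workdir/idr0091_old.csv" <<'EOF'
idr0091/a.ome.zarr,S-BIAD852/uuid-a,6314330
idr0091/b.ome.zarr,S-BIAD852/uuid-b,6314371
idr0091/c.ome.zarr,S-BIAD852/uuid-c,6314286
idr0091/d.ome.zarr,S-BIAD852/uuid-d,6314139
EOF

first_new=6319548
rows=$(wc -l < "$workdir/idr0091_old.csv" | tr -d ' ')
last_new=$((first_new + rows - 1))

# One new ID per row, in order, next to the old ID from column 3.
seq "$first_new" "$last_new" > "$workdir/new_ids.txt"
cut -d',' -f3 "$workdir/idr0091_old.csv" | tr -d '\r' > "$workdir/old_ids.txt"
paste -d',' "$workdir/new_ids.txt" "$workdir/old_ids.txt" > "$workdir/idr0091_ids.csv"

cat "$workdir/idr0091_ids.csv"
```

For the full file the range 6319548 to 6319889 covers 342 IDs, matching the 342 Filesets created. It is worth spot-checking a few pairs against the FilesetAnnotationLink query above before relying on the ordering.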
Then
for r in $(cat idr0091_ids.csv); do
newid=$(echo $r | cut -d',' -f1)
oldid=$(echo $r | cut -d',' -f2)
psql -U omero -d idr -h $DBHOST -c "update image set fileset = $oldid where fileset = $newid"
done
for r in $(cat idr0091_ids.csv); do
newid=$(echo $r | cut -d',' -f1)
echo $newid && omero delete Fileset:$newid
done
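As an alternative to running one psql call per row, the updates could be written to a single SQL file first, so they can be reviewed and then applied in one transaction. A minimal sketch, using two of the pairs listed above (`revert_filesets.sql` is a hypothetical file name):

```shell
# Sketch only: a two-row stand-in for idr0091_ids.csv (NEW,OLD Fileset IDs).
workdir=$(mktemp -d)
cat > "$workdir/idr0091_ids.csv" <<'EOF'
6319549,6314371
6319550,6314286
EOF

# Emit one UPDATE per row, wrapped in BEGIN/COMMIT so it is all-or-nothing.
{
    echo "BEGIN;"
    while IFS=, read -r newid oldid; do
        echo "update image set fileset = $oldid where fileset = $newid;"
    done < "$workdir/idr0091_ids.csv"
    echo "COMMIT;"
} > "$workdir/revert_filesets.sql"

cat "$workdir/revert_filesets.sql"
# After review, this could be applied with something like:
#   psql -U omero -d idr -h $DBHOST -f revert_filesets.sql
```

Reading the rows with `while IFS=, read -r` avoids the word-splitting that `for r in $(cat ...)` relies on. The `omero delete Fileset:$newid` step would still run afterwards, once the images point back at the old Filesets.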