Open will-moore opened 1 year ago
Reimport still in progress - cancelled once because of long wait on FILESET_UPLOAD_PREP. The new import in progress since 8 March, also FILESET_UPLOAD_PREP (with parallel-upload=10)
As discussed today, it is probably worth trying to import without chunks, then adding the chunks back by symlinking to the full plate from the ManagedRepository.
This workflow has allowed me to import big plates from idr0125. In that case, I created a "metadata only" plate (no chunks) by downloading from s3 using a sync command that ignored chunks.
In a single-image case, I recently achieved the same thing by making a copy of the NGFF Image, then deleting chunks by deleting files by name, e.g. all files named "0": https://github.com/IDR/idr-metadata/issues/652#issuecomment-1491814772
If your chunk files are named "0", "1" or "2", you will have to delete each name in turn, although there is probably a way to do it in one command.
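One possible way to delete several chunk names in a single command is to group the `-name` tests in `find`. This is only a sketch: the function name `delete_chunks` is made up for illustration, and it assumes the chunks are regular files named exactly 0, 1 or 2, as in the cases mentioned above. Run it on a copy of the plate only.

```shell
#!/usr/bin/env bash
# Delete chunk files named 0, 1 or 2 below a COPY of a plate.
# Assumption: chunks are regular files with exactly these names;
# extend the list of -name tests if other chunk names exist.
delete_chunks() {
    local plate_copy="$1"
    # -type f keeps directories untouched, even those named 0, 1 or 2.
    # Metadata files (.zattrs, .zgroup, .zarray) never match the names.
    find "$plate_copy" -type f \( -name '0' -o -name '1' -o -name '2' \) -delete
}

# Example (path taken from later in this thread):
# delete_chunks /data/ngff/idr0013/LT0008_31.ome.zarr-copy
```

Grouping the tests with `\( ... \)` matters: without the parentheses, `-delete` would only bind to the last `-name` test.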
Then, import the metadata only Plate. E.g. for idr0125 - 384-well plate, 9 fields per Well - took ~2 hours.
Then, try to view images in the Plate - they should appear as black.
Then you can delete the metadata-only plate in Managed Repo and replace it with symlink to the full plate.
In the case of idr0125 I was able to do this by running https://github.com/IDR/idr0125-way-cellpainting/blob/main/scripts/symlinks.bash as the omero-server user:
sudo -u omero-server -s
symlinks.bash
But on pilot-idrtesting I needed to use a different user to do the delete and symlinking: https://github.com/IDR/idr-metadata/issues/652#issuecomment-1491814772
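The linked symlinks.bash is not reproduced here. As a rough sketch of the general idea only (delete the metadata-only plate, then put a symlink to the full plate in its place), something like the following could work. The function name and both arguments are hypothetical, and it has to be run as a user with write access to the ManagedRepository.

```shell
#!/usr/bin/env bash
# Replace a metadata-only plate directory with a symlink to the full plate.
# Both paths are hypothetical arguments, not paths from this thread.
replace_with_symlink() {
    local metadata_only="$1"   # directory that was imported without chunks
    local full_plate="$2"      # complete plate, chunks included
    # Refuse to continue unless both are real directories,
    # so that a typo cannot remove the wrong path.
    [ -d "$metadata_only" ] && [ ! -L "$metadata_only" ] || return 1
    [ -d "$full_plate" ] || return 1
    rm -rf -- "$metadata_only"
    ln -s -- "$full_plate" "$metadata_only"
}
```

The checks before `rm -rf` are a deliberate choice: the function does nothing if the target is already a symlink, so it can be re-run safely.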
Thanks @will-moore
ls -lah LT0008_31.ome.zarr/
total 36K
drwxrwxr-x. 19 dlindner dlindner 191 Feb 16 12:26 .
drwxrwxr-x. 3 dlindner dlindner 94 Feb 16 12:02 ..
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 A
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 B
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 C
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 D
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 E
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 F
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 G
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 H
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 I
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 J
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 K
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 L
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 M
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 N
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 O
drwxrwxr-x. 2 dlindner dlindner 60 Feb 16 12:02 OME
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 P
-rw-rw-r--. 1 dlindner dlindner 31K Feb 16 12:26 .zattrs
-rw-rw-r--. 1 dlindner dlindner 23 Feb 16 12:02 .zgroup
So would you recommend deleting all the A-P files?
No, those A-P are directories that contain important files etc. You only want to delete the chunks, which are files named 0, 1 etc.
You can list them with e.g.
find -type f -name '0'
count them:
find -type f -name '0' | wc
And only delete the chunks from a copy of the Plate - Don't delete the originals.
Delete chunks with e.g:
sudo find -type f -name '0' -delete
After following the workflow suggested by @will-moore, I get a "no imports found" response. I deleted the chunks with:
sudo find -type f -name '0' -delete
sudo find -type f -name '1' -delete
Then tried
omero import --parallel-upload=10 --transfer=ln_s --skip=all --depth 10 --name "idr0013-nochunks" /data/ngff/idr0013/LT0008_31.ome.zarr-copy/OME/METADATA.ome.xml --file /tmp/idr0013-nochun.log --errs /tmp/idr0013-nochun.err
omero import --parallel-upload=10 --transfer=ln_s --skip=all --depth 10 --name "idr0013-nochunks" /data/ngff/idr0013/LT0008_31.ome.zarr-copy
Both attempts above end in "no imports found".
@pwalczysko it might be that the plate name has to end with .zarr extension?
Also, I presume that --depth 10 and --depth=10 are the same?
it might be that the plate name has to end with .zarr extension?
Indeed, thank you @will-moore, this did the trick. The data are now imported as http://localhost:1080/webclient/?show=plate-253 (idr0013-nochunks). Also, I have replaced the file in the ManagedRepo as instructed with the symlink to the original chunks and the images in the plate http://localhost:1080/webclient/?show=plate-253 are displaying correctly in iviewer, the timelapse is playing okay too.
Looks great! I adjusted rendering settings and "Saved to all" so the thumbnails are clearer - they all regenerated fine 👍
Trying to estimate how much space is needed for conversion. Raw data is 8-bit (1 byte per pixel), single Z & C timelapse.
ScreenA: 1344 x 1024 x 93 x 384 x 510 plates = 25 TB
ScreenB: 25 plates (slightly sparse) ~ 1.2 TB
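The estimate above can be reproduced with shell arithmetic. The variable names are only for illustration; the numbers are the ones quoted above. The ScreenB figure is an upper bound, since those plates are slightly sparse.

```shell
#!/usr/bin/env bash
# Raw size estimate: 1 byte per pixel, single Z and C.
width=1344; height=1024; timepoints=93; wells=384
bytes_per_plate=$(( width * height * timepoints * wells ))
screen_a_bytes=$(( bytes_per_plate * 510 ))   # 510 plates
screen_b_bytes=$(( bytes_per_plate * 25 ))    # 25 plates, upper bound
gb=1000000000
echo "per plate: $(( bytes_per_plate / gb )) GB"   # 49 GB
echo "ScreenA:   $(( screen_a_bytes / gb )) GB"    # 25065 GB, about 25 TB
echo "ScreenB:   $(( screen_b_bytes / gb )) GB"    # 1228 GB, about 1.2 TB
```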
On pilot-zarr2-dev:
Converting one plate takes ~30 min, zipping ~50 min (without compression 7 min!). Converted plate size 36 GB, zipped 28 GB.
7zip (p7zip): 5 min (also 28 GB), (without compression 4 min)
There are 538 plates in total.
Created a batch directory for every 10 plates under /data/ngff/idr0013. Trying to do 10 conversions and 10 zip/upload/delete steps at a time, due to the disk space limitation.
For conversion:
cd /data/ngff/idr0013/batch_XX
for i in `cat ../batch_XX.txt`; do ~/bioformats2raw/bin/bioformats2raw --memo-directory ../../memo /uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i ${i%.*}.ome.zarr; done
# Note: The input file batch_XX.txt is one directory up in /data/ngff/idr0013 !
For zipping: Each batch directory contains a zip.sh which zips and deletes the original if successful
cd /data/ngff/idr0013/batch_XX
for i in `ls | grep zarr`; do ./zip.sh $i; done
For upload:
mv *.zip idr0013/  # each batch dir already has an empty idr0013 subdir
ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0013 bsaspera_w@hx-fasp-1.ebi.ac.uk:<SECRET_DIR>
Then add to files.tsv and delete:
ls idr0013 >> ../idr0013_files.tsv
rm idr0013/*.zip
Failing plate:
(base) [dlindner@pilot-zarr2-dev batch_3]$ ~/bioformats2raw/bin/bioformats2raw --memo-directory ../../memo /uod/idr/metadata/idr0013-neumann-mitocheck/screens/LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.screen LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp3633973597553018286/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@63a65a25): java.lang.NullPointerException
at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2192)
Caused by: java.lang.NullPointerException
at ome.xml.meta.OMEXMLMetadataImpl.getWellSampleImageRef(OMEXMLMetadataImpl.java:5205)
at com.glencoesoftware.bioformats2raw.Converter.hasValidPlate(Converter.java:2055)
at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:604)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:516)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:107)
at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
... 9 more
I guess there will be more. I'll start and append to this list here to keep track of them:
Wrapped it all into one script:
#!/bin/bash
# Usage: ./run.sh screens.txt log.txt
# Disable all output (stdout first, then stderr to the same place)
exec >/dev/null 2>&1
for i in $(cat "$1");
do
    date >> "$2"
    echo "Converting $i" >> "$2"
    zarr_file="${i%.*}.ome.zarr"
    ~/bioformats2raw/bin/bioformats2raw --memo-directory /data/ngff/memo "/uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i" "$zarr_file"
    if [ $? -eq 0 ]
    then
        echo "Zipping ${zarr_file}" >> "$2"
        7za -mmt8 a "${zarr_file}.zip" "${zarr_file}"
        if [ $? -eq 0 ]
        then
            # Zip succeeded: remove the unzipped plate to free disk space
            rm -rf "${zarr_file}"
            mv "${zarr_file}.zip" idr0013/
            echo "Uploading ${zarr_file}.zip" >> "$2"
            ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0013 bsaspera_w@hx-fasp-1.ebi.ac.uk:/<SECRET_DIR>
            if [ $? -eq 0 ]
            then
                # Upload succeeded: record the file and delete the local zip
                echo "${zarr_file}.zip" >> files.tsv
                rm "idr0013/${zarr_file}.zip"
            else
                echo "ERR Upload failed." >> "$2"
            fi
        else
            echo "ERR Zipping failed." >> "$2"
        fi
    else
        echo "ERR Converting failed." >> "$2"
    fi
done
It's running now in three sessions (screens) in /data/ngff/idr0013_new/run_1 / 2 /3 (there is a run_4 as well, but that might be a bit too much).
This is currently doing 3 conversions in a bit more than an hour. So should all be done in ~8 days.
Finished. Only LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.screen failed conversion (see above).
Really finished now, exported the LT0012_29 plate with omero cli zarr. (LT0012_29.ome.zarr.zip)
Looking into submission error with file names in idr0013_files.tsv.
Looks like the problem is that each row doesn't include the directory with idr0013/...
But I also noticed a zip called LT0012_29.ome.zarr.zip which looks wrong (different from the others).
Now I see above that this was generated via omero-cli-zarr so that it matches the Plate name in IDR, whereas all the others have much longer names.
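If the missing directory is the only problem with the file list, the rows could be prefixed with a small filter. This is a sketch under assumptions that are not confirmed in this thread: that the list has one file name per row and no header line. The function name `prefix_rows` is made up for illustration.

```shell
#!/usr/bin/env bash
# Prefix every row of a file list with a directory name,
# unless the row already starts with that directory.
# Assumption: one file name per row, no header line.
prefix_rows() {
    local dir="$1" tsv="$2"
    awk -v d="$dir" '{
        if (index($0, d "/") == 1) print        # already prefixed
        else print d "/" $0
    }' "$tsv"
}

# Example:
# prefix_rows idr0013 idr0013_files.tsv > idr0013_files_fixed.tsv
```

Writing to a new file rather than editing in place keeps the original list available for comparison.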
To try and make this consistent with the others, I downloaded it (via web page), renamed it and uploaded via Aspera...
$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/Downloads/LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.ome.zarr.zip bsaspera_w@hx-fasp-1.ebi.ac.uk:/5f/136e8d-e575-4755-9ac2-aa7fc10cae67-a26596/idr0013/
Checked on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0013 that the file size of the renamed file matched the old file, then deleted LT0012_29.ome.zarr.zip.
Upload new idr0013_files.tsv
All 538 Plates now available at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD865.html
idr0013.csv at https://github.com/IDR/idr-utils/pull/56/commits/cac35aa0d1731afb5db0ab6b60e10bdf03c591fd
$ for r in $(cat $IDRID.csv); do
> biapath=$(echo $r | cut -d',' -f2)
> uuid=$(echo $biapath | cut -d'/' -f2)
> fsid=$(echo $r | cut -d',' -f3)
> omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/09/05-00-41.632 for fileset 18761
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr -> /bia-integrator-data/S-BIAD865/011c38fb-c3d0-4d1d-82d8-9147a5060d88/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr
...
Even a day later, the first sql fileset hadn't completed!
After installing a potential fix https://github.com/IDR/omero-mkngff/pull/11#issuecomment-1727189948 re-ran again...
After 50 minutes, we have done 17 filesets - 3 minutes per Fileset!
mkngff loop failed with goofys mount: see https://github.com/IDR/idr-metadata/issues/671#issuecomment-1727328137
Needed server restart (so existing sql files are invalid).
Deleted them and restarted, using mkngff latest commit which ignores existing symlinks: https://github.com/IDR/omero-mkngff/pull/11/commits/0e4dca393d6821c1b78d4fd0bac35e7d99abe078
Moved the first 3 .sql files to test dir to run sql while mkngff is still running...
cd idr0013_test
for r in $(ls ./); do
psql -U omero -d idr -h $DBHOST -f "$r"
done
UPDATE 380
BEGIN
mkngff_fileset
----------------
6311999
(1 row)
COMMIT
UPDATE 380
BEGIN
mkngff_fileset
----------------
6312000
(1 row)
COMMIT
UPDATE 380
BEGIN
mkngff_fileset
----------------
6312001
(1 row)
COMMIT
Viewing image from first plate: http://localhost:1080/webclient/?show=image-1613300
Failed with ResourceError. Checked Blitz logs..
2023-09-20 11:23:14,170 DEBUG [ loci.formats.Memoizer] (l.Server-6) start[1695208695185] time[298985] tag[loci.formats.Memoizer.setId]
2023-09-20 11:23:14,171 ERROR [ ome.io.bioformats.BfPixelBuffer] (l.Server-6) Failed to instantiate BfPixelsWrapper with /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/OME/METADATA.ome.xml
2023-09-20 11:23:14,172 ERROR [ ome.io.nio.PixelsService] (l.Server-6) Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/OME/METADATA.ome.xml
java.lang.RuntimeException: java.io.IOException: Path '/bia-integrator-data/S-BIAD865/011c38fb-c3d0-4d1d-82d8-9147a5060d88/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/M/23' is not a valid path or not a directory.
at ome.io.bioformats.BfPixelBuffer.reader(BfPixelBuffer.java:79)
at ome.io.bioformats.BfPixelBuffer.setSeries(BfPixelBuffer.java:124)
at ome.io.nio.PixelsService.createBfPixelBuffer(PixelsService.java:898)
/M/23 is a missing Well for this plate, so we shouldn't be trying to read from that dir.
Viewing a different Plate from idr0004 with missing Wells gives same error:
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Path '/bia-integrator-data/S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/A/1' is not a valid path or not a directory.
at com.bc.zarr.ZarrUtils.ensureDirectory(ZarrUtils.java:158)
at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:95)
at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:88)
To see if a non-sparse Plate would work, ran the sql for fileset 18460:
$ psql -U omero -d idr -h $DBHOST -f 18460.sql
UPDATE 384
BEGIN
mkngff_fileset
----------------
6312002
(1 row)
COMMIT
http://localhost:1080/webclient/?show=well-802140
... but this failed due to goofys: https://github.com/IDR/idr-metadata/issues/671#issuecomment-1727715356
Goofys failed again, when re-running mkngff sql ...
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero_mkngff/__init__.py", line 185, in sql
if not symlink_path.exists():
File "/usr/lib64/python3.6/pathlib.py", line 1336, in exists
self.stat()
File "/usr/lib64/python3.6/pathlib.py", line 1158, in stat
return self._accessor.stat(self)
File "/usr/lib64/python3.6/pathlib.py", line 387, in wrapped
return strfunc(str(pathobj), *args)
OSError: [Errno 107] Transport endpoint is not connected: '/bia-integrator-data/S-BIAD865/ffe4bcd6-a5dd-4c7f-ace2-751f67921207/ffe4bcd6-a5dd-4c7f-ace2-751f67921207.zarr'
A big problem with goofys failing (twice above) is that we need to restart the server to re-mount, and this means that previously generated sql files become invalid due to a different $SECRET being generated.
Need to move to a workflow of creating and executing the sql immediately...
for row in csv:
omero mkngff sql > fileset.sql
psql -f fileset.sql
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
done
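A slightly more defensive variant of the generate-then-execute loop above might look like this. It is only a sketch: the function name `mkngff_and_apply` is made up, it assumes IDRID, SECRET and DBHOST are set as in the loops above, and it assumes the three CSV columns are name, BIA path and fileset ID. The `omero mkngff sql` and `psql` invocations are copied from the loop above.

```shell
#!/usr/bin/env bash
# Generate the sql for each CSV row and run it straight away.
# Stops at the first failure (e.g. a dropped mount), so the run can be
# inspected instead of leaving a series of empty sql files behind.
mkngff_and_apply() {
    local csv="$1" name biapath fsid uuid sqlfile
    mkdir -p "$IDRID"
    # Read the CSV on file descriptor 3 so that commands inside the loop
    # cannot swallow the remaining rows from stdin.
    while IFS=, read -r -u 3 name biapath fsid || [ -n "$name" ]; do
        fsid=${fsid%$'\r'}        # drop a trailing carriage return, if any
        [ -n "$fsid" ] || continue
        uuid=${biapath#*/}        # part after "S-BIADxxx/"
        sqlfile="$IDRID/$fsid.sql"
        omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository \
            --secret="$SECRET" "$fsid" \
            "/bia-integrator-data/$biapath/$uuid.zarr" > "$sqlfile" || return 1
        psql -U omero -d idr -h "$DBHOST" -f "$sqlfile" || return 1
    done 3< "$csv"
}

# Example:
# mkngff_and_apply "$IDRID.csv"
```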
http://localhost:1080/webclient/?show=well-802140 eventually viewable...
$ grep -A 2 "22.251_mkngff/04c70c80" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "saved memo"
2023-09-20 15:27:16,224 DEBUG [ loci.formats.Memoizer] (l.Server-9) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2016-04/30/15-54-22.251_mkngff/04c70c80-bc2e-4210-a21f-d2f02108b829.zarr/OME/.METADATA.ome.xml.bfmemo (529578 bytes)
2023-09-20 15:27:16,224 DEBUG [ loci.formats.Memoizer] (l.Server-9) start[1695222972274] time[663949] tag[loci.formats.Memoizer.setId]
2023-09-20 15:27:16,224 INFO [ ome.io.nio.PixelsService] (l.Server-9) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2016-04/30/15-54-22.251_mkngff/04c70c80-bc2e-4210-a21f-d2f02108b829.zarr/OME/METADATA.ome.xml Series: 0
663949 ms is 11 minutes
mkngff sql failed again with goofys mount.
Got about 40 complete - most others are 0 bytes.
$ ls -alh idr0013 | grep "r 4"
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 14:41 18376
.sqlr--r--. 1 omero-server omero-server 484K Sep 20 14:29 18379
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 14:19 18392
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 14:38 18421
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 14:51 18456
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 14:16 18460
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 15:34 18476
.sqlr--r--. 1 omero-server omero-server 463K Sep 20 15:37 18478
.sqlr--r--. 1 omero-server omero-server 482K Sep 20 14:25 18532
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 15:59 18533
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 15:49 18538
.sqlr--r--. 1 omero-server omero-server 484K Sep 20 15:11 18543
.sqlr--r--. 1 omero-server omero-server 484K Sep 20 15:21 18545
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:32 18561
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:07 18562
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:35 18567
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:31 18598
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:13 18654
.sqlr--r--. 1 omero-server omero-server 478K Sep 20 15:46 18660
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:43 18667
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:22 18704
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:54 18705
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:25 18717
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 13:58 18727
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:44 18729
.sqlr--r--. 1 omero-server omero-server 486K Sep 20 15:18 18735
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:06 18741
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:57 18749
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 13:55 18761
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:53 18813
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:56 18822
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:10 18838
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:28 18840
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:47 18841
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:00 18852
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:15 18911
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:04 18914
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 14:01 18933
.sqlr--r--. 1 omero-server omero-server 481K Sep 20 15:40 18935
.sqlr--r--. 1 omero-server omero-server 433K Sep 20 15:03 22203
Kinda painful to pick up where we left off with mkngff sql, since we don't have a good way to skip all the filesets that have been successfully processed.
Updated omero-mkngff with https://github.com/IDR/omero-mkngff/pull/11/commits/a2d0aeeb5195e7374c7cb48e5d989d813a05f982
So now we output nothing if we have previously successfully generated sql output (as known by the existence of the symlink_dir in managed repo, which is now created after sql output).
Now we just need to update the command to append to the sql file instead of writing to it, to avoid overwriting the existing files.
We also want to use the old SECRET from those existing sql files, so that the new ones are the same and we can do a global replace when needed.
export SECRET=b76bb9c5-92b7-42c7-809e-97c808b4598a
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/09/05-00-41.632 for fileset 18761
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff - skipping sql output
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/08/16-44-06.910 for fileset 18727
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/16-44-06.910
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-05/08/16-44-06.910_mkngff - skipping sql output
...
# last fileset where symlink found - NB: this probably didn't output sql before!
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-04/30/22-03-36.052 for fileset 18469
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-04/30/22-03-36.052
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-04/30/22-03-36.052_mkngff - skipping sql output
# first fileset to generate sql in this round...
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/01/05-35-47.122 for fileset 18479
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/01/05-35-47.122
Needed another server restart to re-mount goofys...
Re-ran again as above...
First fileset of this round 18800
...
Needed another server restart to re-mount goofys...
Re-ran again as above...
First fileset of this round 18386
...
Since running the mkngff for this and idr0016 at the same time on idr-testing is causing goofys issues, going to pause on this one now until idr0016 is done....
Picking up where we left off... Work out where to start....
for r in $(cat $IDRID.csv); do
fsid=$(echo $r | cut -d',' -f3)
ls -alh "$IDRID/$fsid.sql"
done
Kept these 4 rows (no sql exported) deleted the other completed rows from idr0013.csv on idr-testing.. 18761?.sql 18727?.sql 18933?.sql 18469?.sql 18458?.sql
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/09/05-00-41.632 for fileset: 18761
Repeated several times, each time processing 20 - 40 Filesets...
Restarted again... seems to be 39 or 40 each time.
(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do biapath=$(echo $r | cut -d',' -f2); uuid=$(echo $biapath | cut -d'/' -f2); fsid=$(echo $r | cut -d',' -f3); omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"; done
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-09/20/20-36-59.899 for fileset: 22207
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/11/09-33-21.804 for fileset: 18867
...
Restarted again after another 39...
(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do biapath=$(echo $r | cut -d',' -f2); uuid=$(echo $biapath | cut -d'/' -f2); fsid=$(echo $r | cut -d',' -f3); omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"; done
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/07/19-34-24.204 for fileset: 18687
...
Need to fix naming of sql. Using fsid=$(echo $r | cut -d',' -f3) includes a line-break character if the csv has been downloaded with wget https://raw.githubusercontent.com/IDR/idr-utils/cac35aa0d1731afb5db0ab6b60e10bdf03c591fd/scripts/ngff_filesets/idr0013.csv
We can use | tr -d '[:space:]' to strip this off.
for r in $(cat $IDRID.csv); do
fsid=$(echo $r | cut -d',' -f3)
newid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
mv "$IDRID/$fsid.sql" "$IDRID/$newid.sql"
done
mv: cannot stat ‘idr0013/18351\r.sql’: No such file or directory
mv: cannot stat ‘idr0013/18353.sql’: No such file or directory
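The effect of the stray carriage return, and of stripping it, can be checked in isolation. The helper name `fileset_id` and the sample row are made up for illustration; the pipeline inside is the one used above.

```shell
#!/usr/bin/env bash
# Extract the fileset id (third CSV column) without any trailing
# carriage return or newline.
fileset_id() {
    printf '%s' "$1" | cut -d',' -f3 | tr -d '[:space:]'
}

# A row as it looks when the csv has Windows line endings (hypothetical values)
row=$'idr0013/example.ome.zarr,S-BIAD865/some-uuid,18351\r'
raw=$(echo "$row" | cut -d',' -f3)
echo "without tr: ${#raw} characters"      # 6, the \r is still attached
clean=$(fileset_id "$row")
echo "with tr:    ${#clean} characters"    # 5
```

This also explains the two mv errors above: rows ending in \r refer to a file name containing \r, while a final row without a line ending already has the clean name, so source and target are identical.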
Check for .zarray files...
for r in $(cat $IDRID.csv); do
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
echo "$IDRID/$fsid.sql $(grep -c 'zarray' $IDRID/$fsid.sql)"
done
idr0013/18761.sql 1520
idr0013/18727.sql 1520
idr0013/18933.sql 1520
idr0013/18914.sql 0
idr0013/18562.sql 0
idr0013/18838.sql 0
idr0013/18654.sql 0
idr0013/18460.sql 0
idr0013/18392.sql 0
idr0013/18704.sql 0
idr0013/18532.sql 0
idr0013/18379.sql 0
idr0013/18561.sql 0
idr0013/18567.sql 0
idr0013/18421.sql 0
idr0013/18376.sql 0
idr0013/18729.sql 0
idr0013/18841.sql 0
idr0013/18456.sql 0
idr0013/18705.sql 0
idr0013/18749.sql 0
idr0013/18852.sql 0
idr0013/22203.sql 0
idr0013/18741.sql 0
idr0013/18823.sql 0
idr0013/18543.sql 0
idr0013/18911.sql 0
idr0013/18735.sql 0
idr0013/18545.sql 0
idr0013/18717.sql 0
idr0013/18840.sql 0
idr0013/18598.sql 0
idr0013/18476.sql 0
idr0013/18478.sql 0
idr0013/18935.sql 0
idr0013/18667.sql 0
idr0013/18660.sql 0
idr0013/18538.sql 0
idr0013/18813.sql 0
idr0013/18822.sql 0
idr0013/18533.sql 0
idr0013/18469.sql 1536
idr0013/18479.sql 0
idr0013/22216.sql 0
idr0013/18906.sql 0
idr0013/22223.sql 0
idr0013/18797.sql 0
idr0013/18352.sql 0
idr0013/18355.sql 0
idr0013/22206.sql 0
idr0013/18578.sql 0
idr0013/18707.sql 0
idr0013/18766.sql 0
idr0013/18855.sql 0
idr0013/18802.sql 0
idr0013/18462.sql 0
idr0013/18601.sql 0
idr0013/18775.sql 0
idr0013/18381.sql 0
idr0013/18800.sql 0
idr0013/18763.sql 0
idr0013/18767.sql 0
idr0013/18915.sql 0
idr0013/18520.sql 0
idr0013/18725.sql 0
idr0013/18777.sql 0
idr0013/18869.sql 0
idr0013/18411.sql 0
idr0013/18512.sql 0
idr0013/18383.sql 0
idr0013/18737.sql 0
idr0013/18839.sql 0
idr0013/18701.sql 0
idr0013/18662.sql 0
idr0013/18833.sql 0
idr0013/18836.sql 0
idr0013/18784.sql 0
idr0013/18472.sql 0
idr0013/18923.sql 0
idr0013/18594.sql 0
idr0013/18529.sql 0
idr0013/18361.sql 0
idr0013/18528.sql 0
idr0013/18747.sql 0
idr0013/18464.sql 0
idr0013/18848.sql 0
idr0013/18765.sql 0
idr0013/18826.sql 0
idr0013/18799.sql 0
idr0013/18661.sql 0
idr0013/18470.sql 0
idr0013/18948.sql 0
idr0013/18864.sql 0
idr0013/18732.sql 0
idr0013/18790.sql 0
idr0013/18953.sql 0
idr0013/18386.sql 0
idr0013/18716.sql 0
idr0013/18787.sql 0
idr0013/18461.sql 0
idr0013/18384.sql 0
idr0013/22227.sql 0
idr0013/18947.sql 0
idr0013/18566.sql 0
idr0013/22222.sql 0
idr0013/18774.sql 0
idr0013/18924.sql 0
idr0013/18391.sql 0
idr0013/18401.sql 0
idr0013/18858.sql 0
idr0013/22204.sql 0
idr0013/18580.sql 0
idr0013/18862.sql 0
idr0013/18490.sql 0
idr0013/18936.sql 0
idr0013/18870.sql 0
idr0013/22211.sql 0
idr0013/18828.sql 0
idr0013/22209.sql 0
idr0013/18754.sql 0
idr0013/18465.sql 0
idr0013/18523.sql 0
idr0013/18670.sql 0
idr0013/18579.sql 0
idr0013/18473.sql 0
idr0013/18958.sql 0
idr0013/18577.sql 0
idr0013/18957.sql 0
idr0013/18463.sql 0
idr0013/18589.sql 0
idr0013/18748.sql 0
idr0013/18359.sql 0
idr0013/18354.sql 0
idr0013/18752.sql 0
idr0013/18454.sql 0
idr0013/18824.sql 0
idr0013/18909.sql 0
idr0013/18542.sql 0
idr0013/18403.sql 0
idr0013/18931.sql 0
idr0013/18695.sql 0
idr0013/18489.sql 0
idr0013/18853.sql 0
idr0013/18718.sql 0
idr0013/18358.sql 0
idr0013/18902.sql 0
idr0013/18771.sql 0
idr0013/18604.sql 0
idr0013/18788.sql 0
idr0013/18491.sql 0
idr0013/18700.sql 0
idr0013/18943.sql 0
idr0013/18683.sql 0
idr0013/18846.sql 0
idr0013/22210.sql 0
idr0013/18803.sql 0
idr0013/18918.sql 0
idr0013/18455.sql 0
idr0013/18521.sql 0
idr0013/18844.sql 0
idr0013/18926.sql 0
idr0013/18863.sql 0
idr0013/18843.sql 0
idr0013/18730.sql 0
idr0013/18920.sql 0
idr0013/18585.sql 0
idr0013/18366.sql 0
idr0013/18458.sql 1536
idr0013/18760.sql 1520
idr0013/18804.sql 1520
idr0013/18574.sql 1520
Edited idr0013.csv to contain just the 163 rows with 0 above. Re-ran...
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/12/13-09-25.587 for fileset: 18914
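The manual edit of idr0013.csv described above (keeping only the rows whose sql has no zarray entries) could also be scripted. A minimal sketch, assuming the fileset ID is the last CSV column and the sql files are named after it; the function name `rows_to_redo` and the output file name in the example are made up.

```shell
#!/usr/bin/env bash
# Print the CSV rows whose sql file is missing, empty,
# or contains no 'zarray' entries.
rows_to_redo() {
    local csv="$1" sqldir="$2" row fsid
    while IFS= read -r row || [ -n "$row" ]; do
        row=${row%$'\r'}          # drop a trailing carriage return, if any
        fsid=${row##*,}           # last column is the fileset id
        [ -n "$fsid" ] || continue
        if ! grep -q 'zarray' "$sqldir/$fsid.sql" 2>/dev/null; then
            printf '%s\n' "$row"
        fi
    done < "$csv"
}

# Example:
# rows_to_redo idr0013.csv idr0013 > idr0013_todo.csv
```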
Since idr0138-pilot seems to have a much more stable goofys mount, move remaining generation there....
Still to do "idr0013.csv"... on idr0138-pilot... as wmoore user...
idr0013/LT0099_16.ome.zarr,S-BIAD865/2fddf4f4-bbad-490e-9d1a-64f10a911f5f,18716
idr0013/LT0121_09.ome.zarr,S-BIAD865/30078617-8947-451e-b4fc-b5459f8d787d,18787
idr0013/LT0025_54.ome.zarr,S-BIAD865/3068778b-ca4a-409f-8a91-a436aaefd539,18461
idr0013/LT0011_30.ome.zarr,S-BIAD865/3092cb82-f48f-4918-9a2d-a159ff420623,18384
idr0013/LTValidMitosisSon384Plate01_02.ome.zarr,S-BIAD865/3215294d-e302-43e8-a96f-0a0dd44f10a6,22227
idr0013/LT0601_01.ome.zarr,S-BIAD865/32f78fc1-3cb0-4ef5-96ff-a7521a1c5d28,18947
idr0013/LT0066_02.ome.zarr,S-BIAD865/333b0032-273f-470d-be49-b944b4191327,18566
idr0013/LTValidMitosisSon384Plate02_04.ome.zarr,S-BIAD865/33bd6e90-8597-445f-a6a1-6f03216902c1,22222
idr0013/LT0116_47.ome.zarr,S-BIAD865/340f3f55-2286-4fa2-8c01-e049bbd86d5d,18774
idr0013/LT0153_06.ome.zarr,S-BIAD865/34eea383-ae3d-4c39-ad85-127571a58957,18924
idr0013/LT0014_01.ome.zarr,S-BIAD865/350edb2c-befa-4ebd-b130-4f5d88fd18b8,18391
idr0013/LT0016_18.ome.zarr,S-BIAD865/364084b9-d7af-4600-b6e3-0621bd50c563,18401
idr0013/LT0142_01.ome.zarr,S-BIAD865/364309c6-4bd0-469d-ad6b-981cb86ac9c0,18858
idr0013/LTValidMitosisSon384Plate07_01.ome.zarr,S-BIAD865/369313de-98e2-44d7-9362-d9c710ade6dd,22204
idr0013/LT0070_41.ome.zarr,S-BIAD865/36a2e3d5-72e4-4652-af7f-929161e2322d,18580
idr0013/LT0143_02.ome.zarr,S-BIAD865/3726dab9-e7a3-4df2-a594-aaeaa9f94d95,18862
idr0013/LT0033_42.ome.zarr,S-BIAD865/376cff9a-a923-4f19-957e-1c4c644b39c5,18490
idr0013/LT0157_07.ome.zarr,S-BIAD865/381f57c9-d2cc-4e33-a0da-cfff6357d9ae,18936
idr0013/LT0145_02.ome.zarr,S-BIAD865/38689649-4f4c-4983-9840-25e2d5f058a5,18870
idr0013/LTValidMitosisSon384Plate05_02.ome.zarr,S-BIAD865/386a44d6-1132-4d0d-abf7-180764320c63,22211
idr0013/LT0133_19.ome.zarr,S-BIAD865/38e77549-b1d0-4559-918b-85da280e9949,18828
idr0013/LTValidMitosisSon384Plate05_04.ome.zarr,S-BIAD865/3932254b-22f9-487d-80b1-b9c2daa7bf46,22209
idr0013/LT0110_09.ome.zarr,S-BIAD865/39885715-f764-46a1-b045-dd423db83c63,18754
idr0013/LT0026_21.ome.zarr,S-BIAD865/3a0f0b01-39aa-4745-aebd-1719c1796206,18465
idr0013/LT0042_28.ome.zarr,S-BIAD865/3a54eeb7-9e0a-438b-8993-926a9ad10689,18523
idr0013/LT0085_07.ome.zarr,S-BIAD865/3b4f9774-4a00-489d-89a3-0d2aeca87835,18670
idr0013/LT0069_52.ome.zarr,S-BIAD865/3c36a642-5b4f-4c11-8e6d-baa0f4178c9b,18579
idr0013/LT0029_01.ome.zarr,S-BIAD865/3ca89c81-1eea-49ac-b7da-fee5f5f945af,18473
idr0013/LT0603_05.ome.zarr,S-BIAD865/3caeca4e-c69c-4a1a-a98e-bb0f83ee6a0c,18958
idr0013/LT0069_51.ome.zarr,S-BIAD865/3cc6b15c-13b0-417b-a249-57932368b51e,18577
idr0013/LT0603_06.ome.zarr,S-BIAD865/3d461dd5-bec1-43fc-8fc3-2406f1d2bb72,18957
idr0013/LT0025_56.ome.zarr,S-BIAD865/3d4a9c7f-944a-40f0-b872-3da8ff3557ff,18463
idr0013/LT0073_02.ome.zarr,S-BIAD865/3d5ac001-823d-4c7a-83a4-e29c826f81e0,18589
idr0013/LT0108_47.ome.zarr,S-BIAD865/3e550b11-5e87-4587-8b8a-f7653fadab9b,18748
idr0013/LT0003_40.ome.zarr,S-BIAD865/3e7ad301-3cad-413a-9e79-571a691712bf,18359
idr0013/LT0002_02.ome.zarr,S-BIAD865/3e7aeaeb-4de8-42b9-bed3-2c4af89a0bf7,18354
idr0013/LT0110_01.ome.zarr,S-BIAD865/3edb1d3a-91da-48a9-b6a4-592328ea5f1c,18752
idr0013/LT0023_01.ome.zarr,S-BIAD865/40aadbcb-77df-4663-a5f8-29177971b58b,18454
idr0013/LT0132_04.ome.zarr,S-BIAD865/40e83a42-6bc0-4f3b-80f9-80a865ac5424,18824
idr0013/LT0148_37.ome.zarr,S-BIAD865/4251a3eb-043c-4abe-9326-3e2afb9f6e97,18909
idr0013/LT0049_02.ome.zarr,S-BIAD865/427c1e16-5bee-425f-ae65-163a4db18e54,18542
idr0013/LT0016_28.ome.zarr,S-BIAD865/42a137f5-6f48-4873-9d66-fac6367a802b,18403
idr0013/LT0156_07.ome.zarr,S-BIAD865/4399d284-8a5c-47f3-9169-007d2f0cad27,18931
idr0013/LT0093_16.ome.zarr,S-BIAD865/4421634f-208a-4d43-88c8-80b5c8caa056,18695
idr0013/LT0033_11.ome.zarr,S-BIAD865/448ecf99-dba9-4e72-8edb-f8e03453c292,18489
idr0013/LT0140_06.ome.zarr,S-BIAD865/44bff916-3cc4-4f8b-a185-fabcf82b5e01,18853
idr0013/LT0100_09.ome.zarr,S-BIAD865/44c62d0b-c9e8-42e7-97e8-37592f26ba75,18718
idr0013/LT0003_15.ome.zarr,S-BIAD865/44e8361b-2bbd-4f01-ba02-f3333a34a5c4,18358
idr0013/LT0146_06.ome.zarr,S-BIAD865/44f07347-2f3d-4d65-ad1c-6c376577862a,18902
idr0013/LT0116_43.ome.zarr,S-BIAD865/44f3c26d-65e2-43d0-9d2e-75ac4673f210,18771
idr0013/LT0077_01.ome.zarr,S-BIAD865/45eb9b6b-f72f-42cc-b0c8-19923f2c6d92,18604
idr0013/LT0121_37.ome.zarr,S-BIAD865/46b7571c-679a-4aba-ba66-c1e608eb803d,18788
idr0013/LT0034_01.ome.zarr,S-BIAD865/4706dd97-c751-447d-b8ef-6dc9ea68dea7,18491
idr0013/LT0094_44.ome.zarr,S-BIAD865/497cb9a3-4e13-4498-aae5-c4b291515352,18700
idr0013/LT0170_01.ome.zarr,S-BIAD865/4a33abd2-9f15-4ddb-9cc6-faf7cffb4960,18943
idr0013/LT0089_02.ome.zarr,S-BIAD865/4a3ace35-8cb0-459a-8609-c78f99cb79a5,18683
idr0013/LT0138_03.ome.zarr,S-BIAD865/4a96176c-6d36-4ce5-a9d6-ed5cca52cbeb,18846
idr0013/LTValidMitosisSon384Plate05_03.ome.zarr,S-BIAD865/4acc4a36-2066-43ee-9a7f-756733f1e379,22210
idr0013/LT0125_41.ome.zarr,S-BIAD865/4b271b6d-1dd3-4079-9e40-4153e13f56ae,18803
idr0013/LT0151_08.ome.zarr,S-BIAD865/4b390ccd-714f-4452-aae7-5db76302337b,18918
idr0013/LT0023_04.ome.zarr,S-BIAD865/4c512657-5553-41b4-a77c-6df1f562ff05,18455
idr0013/LT0042_10.ome.zarr,S-BIAD865/4c5e7b2b-f19d-4bdd-ae88-1d1bb0c3c869,18521
idr0013/LT0138_01.ome.zarr,S-BIAD865/4f0ab5bc-90f9-474d-8b0d-0f2303f94593,18844
idr0013/LT0154_02.ome.zarr,S-BIAD865/4f84d491-654e-4c1b-b39e-16258cbb7056,18926
idr0013/LT0143_05.ome.zarr,S-BIAD865/4fd3c599-6c09-4a63-bfe8-cc345ea99002,18863
idr0013/LT0137_44.ome.zarr,S-BIAD865/50991552-7af6-40b0-813a-e03bc6590cd1,18843
idr0013/LT0104_04.ome.zarr,S-BIAD865/50be2b3c-b163-4363-9bdd-5be0651f2b03,18730
idr0013/LT0152_04.ome.zarr,S-BIAD865/50f78452-8396-401b-9aeb-d9982ddbca0b,18920
idr0013/LT0072_02.ome.zarr,S-BIAD865/513a062e-2307-40c5-8f6b-57761e9b502f,18585
idr0013/LT0006_10.ome.zarr,S-BIAD865/5147e4d3-bec9-4166-b63a-dbe5f5008f52,18366
The following Images/Filesets were found to be incomplete when regenerating memo files on idr-testing...
LT0066_23
2876 files in Fileset - All Well M/1 files missing, and M/.zgroup - No M directory files in sql: https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD865/dab29e5a-d36f-430a-a9ff-7a1d6e4ce299/dab29e5a-d36f-430a-a9ff-7a1d6e4ce299.zarr
LT0080_37
2866 files in Fileset - Missing .zarr/.zattrs and all /A
LT0103_13
2770 files in Fileset - Missing .zarr/.zattrs and all /A
On pilot-zarr1-dev, screen
$ screen -r idr0015_ngff
$ cd /data/idr0013
$ conda activate bioformats2raw2
$ for i in LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3 LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4 LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4; do
~/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory /../memo /uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i.screen $i.ome.zarr; done
Can't seem to read the data...
$ sudo ls /uod/idr/filesets/idr0013-neumann-mitocheck/
ls: cannot open directory /uod/idr/filesets/idr0013-neumann-mitocheck/: Permission denied
EDIT: seems to work when I'm not in that old screen.
Created screen -S idr0013_bf2raw
and ran again... 10:35...
Checking that files missing from previous plates are present in newly-generated ones...
This was missing the M/1 Well before, but seems to have the same number of files as other Wells now...
(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/M/1 -type f | wc
478 478 36242
(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/M/2 -type f | wc
478 478 36242
(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/1 -type f | wc
478 478 36242
Similar checks with the other plates for .zattrs etc. and /A all look good...
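The per-Well spot checks above could be run across a whole plate in one go. A minimal sketch, assuming the `<plate>.ome.zarr/<row>/<column>/` layout seen in the `find` commands; `well_file_counts` and `odd_wells` are hypothetical helper names, not part of any tool used here:

```shell
# Hypothetical helpers (not from bioformats2raw or omero-mkngff).
# Print "<count> <row>/<column>" for every Well directory in a plate.
well_file_counts() {
    plate="$1"
    for well in "$plate"/*/*/; do
        [ -d "$well" ] || continue
        # Skip the OME metadata directory, which is not a Well row.
        case "$well" in "$plate"/OME/*) continue ;; esac
        count=$(find "$well" -type f | wc -l | tr -d '[:space:]')
        rel=${well#"$plate"/}
        echo "$count ${rel%/}"
    done
}

# Print only the Wells whose file count differs from the most common count.
odd_wells() {
    well_file_counts "$1" | awk '
        { n[$1]++; count[$2] = $1 }
        END {
            for (c in n) if (n[c] > best) { best = n[c]; mode = c }
            for (w in count) if (count[w] != mode) print count[w], w
        }'
}

# Usage (path is an example):
#   odd_wells LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
```

An empty result from `odd_wells` only means every Well has the same number of files. It does not check plate-level files such as `.zattrs`, and it would miss a Well directory that is absent entirely.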
Renamed to shorten names...
(base) [wmoore@pilot-zarr1-dev idr0013]$ ls -lh
total 0
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 14:37 LT0066_23.ome.zarr
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 12:18 LT0080_37.ome.zarr
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 13:32 LT0103_13.ome.zarr
$ for i in $(ls); do zip -r $i.zip $i; done
...
EDIT: oops - realised that the previous idr0013 plates have full names, not shortened as above. Renamed back to full names and zipped them.
$ md5sum ./*
2dc74001d737bf48841ea4a186391574 LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.zip
bc35cca08c935c765df6a3d1b1198732 LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr.zip
5ad963825e2e3c5ccc5c2a5060819e7f LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr.zip
Delete these 3 from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0013
Upload...
$ cd .aspera/cli/bin
$ ./ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/idr0013/idr0013 bsaspera_w@hx-fasp-1.ebi.ac.uk:/5f/13xxxxx
LT0066_23--ex2005_08_03--sp2005_06_07--tt17-- 100% 24GB 128Mb/s 12:05
LT0080_37--ex2005_07_20--sp2005_07_04--tt17-- 100% 25GB 247Mb/s 26:43
LT0103_13--ex2006_11_22--sp2005_08_16--tt19-- 100% 24GB 377Mb/s 48:59
Checked https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD865.html again. Resubmitted plates above not updated yet...
LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3 https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD865/dab29e5a-d36f-430a-a9ff-7a1d6e4ce299/dab29e5a-d36f-430a-a9ff-7a1d6e4ce299.zarr
LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4 https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD865/df947dfe-ed8f-4dda-a20a-fb9f3a717b47/df947dfe-ed8f-4dda-a20a-fb9f3a717b47.zarr
LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4 https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD865/8387705b-16bf-4b14-8884-426b0c16dfff/8387705b-16bf-4b14-8884-426b0c16dfff.zarr
Let's host those 3 plates on our s3 for testing mkngff etc.
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0013
make_bucket: idr0013
(base) [wmoore@pilot-zarr1-dev idr0013]$ /home/wmoore/mc cp -r idr0013/ uk1s3/idr0013
...tt19--c4.ome.zarr/P/9/0/3/92/0/0/0/0: 102.77 GiB / 102.77 GiB ━━━━━━━━━━━━━━━ 18.86 MiB/s 1h32m59s
Looking good:
On idr0125-pilot...
ssh -A -o 'ProxyCommand ssh idr-pilot.openmicroscopy.org -W %h:%p' idr0125-omeroreadwrite -L 1080:localhost:80
sudo mkdir /idr0013 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0013 /idr0013
ls /idr0013
LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr
As omero-server user...
idr0013.csv
LT0066_23,LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr,18568
LT0080_37,LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr,18655
LT0103_13,LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr,18728
screen -r mkngff
for r in $(cat $IDRID.csv); do
zarrpath=$(echo $r | cut -d',' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
omero mkngff sql $fsid --clientpath="https://uk1s3.embassy.ebi.ac.uk/idr0013/$zarrpath" "/idr0013/$zarrpath" > "$IDRID/$fsid.sql"
done
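The loop above splits each row with `cut`, and `for r in $(cat ...)` would break on any row containing whitespace. A minimal sketch of the same iteration using `read`, which prints the commands for review instead of running them; the function name and its four parameters are hypothetical, and the three columns are assumed to be as in idr0013.csv above:

```shell
# Hypothetical dry-run helper: prints one "omero mkngff sql" command per CSV row.
# Columns assumed: name, zarr directory, Fileset ID.
print_mkngff_sql_commands() {
    csv="$1"; base_url="$2"; mount="$3"; outdir="$4"
    # "|| [ -n ... ]" also handles a last line without a trailing newline.
    while IFS=, read -r _name zarrpath fsid || [ -n "$_name" ]; do
        fsid=$(printf '%s' "$fsid" | tr -d '[:space:]')
        [ -n "$fsid" ] || continue
        printf 'omero mkngff sql %s --clientpath="%s/%s" "%s/%s" > "%s/%s.sql"\n' \
            "$fsid" "$base_url" "$zarrpath" "$mount" "$zarrpath" "$outdir" "$fsid"
    done < "$csv"
}

# Usage (values taken from the loop above):
#   print_mkngff_sql_commands idr0013.csv https://uk1s3.embassy.ebi.ac.uk/idr0013 /idr0013 idr0013
```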
Check sql output - all have .zarr/.zattrs
...
(venv3) (base) bash-4.2$ for i in 18568.sql 18655.sql 18728.sql; do echo $i; cat $i | grep ".zarr/.zattrs" | wc; cat $i | grep ".zattrs" | wc; done
18568.sql
1 4 258
762 3048 205148
18655.sql
1 4 258
762 3048 205148
18728.sql
1 4 258
738 2952 198688
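The check above can be turned into a pass/fail test per file. A minimal sketch, assuming (as in the output above) that a complete sql file has exactly one line mentioning the plate-level `.zarr/.zattrs`; `has_root_zattrs` is a hypothetical name:

```shell
# Hypothetical helper: succeeds only if the file has exactly one line
# containing the literal string ".zarr/.zattrs".
# -F matches the dots literally; entries such as ".zarr/A/.zattrs" do not match.
has_root_zattrs() {
    n=$(grep -cF '.zarr/.zattrs' "$1" || true)
    [ "${n:-0}" -eq 1 ]
}

# Usage:
#   for i in 18568.sql 18655.sql 18728.sql; do
#       has_root_zattrs "$i" || echo "missing plate .zattrs: $i"
#   done
```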
$ less 18568.sql...
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/OME' where image in (select id from Image where fileset = 18568);
begin;
select mkngff_fileset(
18568,
'SECRETUUID',
'cdf35825-def1-4580-8d0b-9c349b8f78d6',
'demo_2/2016-05/03/23-33-31.705_mkngff/',
array[
['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/', '.zattrs', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/.zattrs'],
['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/', '.zgroup', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/.zgroup'],
['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/', '.zgroup', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/.zgroup'],
...
Updated SECRET to 9630ba1e-ed3a-42e3-9296-xxxxxxxx then ran
for r in $(cat $IDRID.csv); do
zarrpath=$(echo $r | cut -d',' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
omero mkngff symlink /data/OMERO/ManagedRepository "/idr0013/$zarrpath" --bfoptions
done
UPDATE 380
BEGIN
mkngff_fileset
----------------
5289227
(1 row)
COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
symlink_repo
fileset_id
symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr'
UPDATE 380
BEGIN
mkngff_fileset
----------------
5289228
(1 row)
COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
symlink_repo
fileset_id
symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr'
UPDATE 368
BEGIN
mkngff_fileset
----------------
5289229
(1 row)
COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
symlink_repo
fileset_id
symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr'
Oops - the symlink command above was missing the Fileset ID argument. Re-ran symlinks with $fsid...
$ for r in $(cat $IDRID.csv); do
> zarrpath=$(echo $r | cut -d',' -f2)
> fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
> echo $zarrpath
> echo $fsid
> omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/idr0013/$zarrpath" --bfoptions
> done
LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
18568
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr -> /idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.bfoptions
LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr
18655
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr -> /idr0013/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr.bfoptions
LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr
18728
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr -> /idr0013/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr.bfoptions
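The first symlink attempt failed because the Fileset ID argument was left out. A minimal sketch of a guard that catches this before anything runs; `is_fileset_id` and `print_symlink_command` are hypothetical names, and the printed command only mirrors the argument order from the usage message above:

```shell
# Hypothetical helper: true only for a non-empty string of digits.
is_fileset_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical dry-run helper: prints the symlink command if the arguments
# look sane, otherwise reports the problem and returns non-zero.
print_symlink_command() {
    repo="$1"; fsid="$2"; target="$3"
    if ! is_fileset_id "$fsid" || [ -z "$target" ]; then
        echo "expected REPO FILESET_ID TARGET (got fileset_id='$fsid')" >&2
        return 1
    fi
    printf 'omero mkngff symlink %s %s "%s" --bfoptions\n' "$repo" "$fsid" "$target"
}

# Usage:
#   print_symlink_command /data/OMERO/ManagedRepository "$fsid" "/idr0013/$zarrpath"
```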
Fileset info looks good...
(base) [wmoore@pilot-idr0125-omeroreadwrite ~]$ ls -alh /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
total 12K
drwxr-xr-x. 2 omero-server omero-server 144 Jan 3 11:26 .
drwxr-xr-x. 63 omero-server omero-server 4.0K Jan 3 11:26 ..
lrwxrwxrwx. 1 omero-server omero-server 65 Jan 3 11:26 LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr -> /idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
-rw-r--r--. 1 omero-server omero-server 49 Jan 3 11:26 LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.bfoptions
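The manual `ls` check above could be scripted for all three plates. A minimal sketch using `readlink`; `check_symlink` is a hypothetical name:

```shell
# Hypothetical helper: succeeds only if LINK is a symlink pointing at EXPECTED.
check_symlink() {
    link="$1"; expected="$2"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    actual=$(readlink "$link")
    if [ "$actual" != "$expected" ]; then
        echo "$link -> $actual (expected $expected)" >&2
        return 1
    fi
}

# Usage (paths from the listing above):
#   z=LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
#   check_symlink "/data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/$z" "/idr0013/$z"
```

This compares the link text only. It does not confirm that the target is mounted or readable.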
Checking http://localhost:1080/webclient/?show=image-1556033 - view image.... Looks good. Other plates: http://localhost:1080/webclient/?show=image-1573071... and LT0103_13
Let's run check_pixels...
for i in 3669 3669 3828; do
python check_pixels.py Plate:$i --max-planes=sizeC --max-images=10 >> /tmp/check_pix_20240301_idr0013.log;
done
$ grep Error /tmp/check_pix_20240301_idr0013.log | wc
0 0 0
We have re-submitted data now available on EBI s3...
Test on idr-testing, using Fileset IDs from idr-testing!
Install https://github.com/IDR/omero-mkngff/pull/14 to create new Filesets without the extra _mkngff suffix, and use --fs_suffix=None below...
pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@fs_suffix'
idr0013.csv
idr0013/LT0080_37.ome.zarr.zip,S-BIAD865/aea4aa32-60c2-4a38-8a91-9f303381e562,6312927
idr0013/LT0066_23.ome.zarr.zip,S-BIAD865/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a,6313098
idr0013/LT0103_13.ome.zarr.zip,S-BIAD865/eae9bb4c-9504-4f88-9931-dbf234f86023,6313107
export IDRID=idr0013
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
omero mkngff sql $fsid --fs_suffix=None --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/07/02-36-52.924_mkngff for fileset: 6312927
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/03/23-33-31.705_mkngff for fileset: 6313098
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/08/17-02-05.805_mkngff for fileset: 6313107
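In the loop above, the zarr location is derived from the second CSV column as `<study>/<uuid>/<uuid>.zarr`. A minimal sketch of that derivation using shell parameter expansion in place of `cut`; `bia_zarr_path` is a hypothetical name:

```shell
# Hypothetical helper: turn "S-BIAD865/<uuid>" into "S-BIAD865/<uuid>/<uuid>.zarr".
bia_zarr_path() {
    biapath="$1"
    uuid=${biapath##*/}   # everything after the last "/"
    echo "$biapath/$uuid.zarr"
}

# Usage (prefixes as used in the loop above):
#   p=$(bia_zarr_path S-BIAD865/aea4aa32-60c2-4a38-8a91-9f303381e562)
#   echo "https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$p"
#   echo "/bia-integrator-data/$p"
```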
Then, update SECRET and... (again using --fs_suffix=None)...
for i in $(ls); do sed -i 's/SECRETUUID/f464e059-16b5-4013-b9a2-417e5976371c/g' $i; done
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --fs_suffix=None --bfoptions
done
UPDATE 380
BEGIN
mkngff_fileset
----------------
6314896
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr -> /bia-integrator-data/S-BIAD865/aea4aa32-60c2-4a38-8a91-9f303381e562/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr.bfoptions
UPDATE 380
BEGIN
mkngff_fileset
----------------
6314897
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr -> /bia-integrator-data/S-BIAD865/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr.bfoptions
UPDATE 368
BEGIN
mkngff_fileset
----------------
6314898
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr -> /bia-integrator-data/S-BIAD865/eae9bb4c-9504-4f88-9931-dbf234f86023/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr.bfoptions
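The SECRETUUID substitution used above runs `sed -i` on every file returned by `ls`. A minimal sketch that restricts it to `*.sql` files and fails if any placeholder is left behind; `replace_secret` is a hypothetical name, and the secret is assumed to be a plain UUID (hex digits and hyphens), so it needs no escaping in the `sed` replacement:

```shell
# Hypothetical helper: replace the SECRETUUID placeholder in DIR/*.sql only.
# Assumes SECRET contains no characters special to sed (true for a UUID).
replace_secret() {
    dir="$1"; secret="$2"
    for f in "$dir"/*.sql; do
        [ -f "$f" ] || continue
        # Write to a temp file and move it, which avoids relying on "sed -i".
        sed "s/SECRETUUID/$secret/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
    # Return non-zero if any .sql file still contains the placeholder.
    ! grep -lF 'SECRETUUID' "$dir"/*.sql >/dev/null 2>&1
}

# Usage ($SECRET is a stand-in for the real session UUID):
#   replace_secret "$IDRID" "$SECRET" && echo "all placeholders replaced"
```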
Updated sql scripts to use original Fileset IDs in https://github.com/IDR/mkngff_upgrade_scripts/commit/3f8e1693ebbb5032ec81e0c63168e99c1be633b8
idr0013-neumann-mitocheck