IDR / idr-metadata

Curated metadata for all studies published in the Image Data Resource
https://idr.openmicroscopy.org

idr0064-goglia-erkdynamics S-BIAD992 #682

Open will-moore opened 11 months ago

will-moore commented 11 months ago
$ ssh pilot-zarr1-dev

$ screen -S idr0064_bf2raw
$ cd /data/idr0064

(base) [wmoore@pilot-zarr1-dev idr0064]$ conda activate bioformats2raw2

for i in 1.1 1.2 1.3 2.1 2.2 2.3 3.1 3.2 3.3; do
  echo $i;
  ~/bioformats2raw-0.6.0-24/bin/bioformats2raw "/uod/idr/metadata/idr0064-goglia-erkdynamics/screens/$i.screen" "$i.ome.zarr";
done
will-moore commented 11 months ago

Zip each plate without -m, so the source directories are not deleted...

for i in */; do zip -r "${i%/}.zip" "$i"; done
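
A small sketch of how the archive name in the loop above is built. `archive_name` is a hypothetical helper, not something used in this thread; it only shows that `${i%/}` strips the trailing slash that the `*/` glob leaves on each directory name. (For reference, `zip -m` would move the files into the archive and delete the originals, which is why it is avoided here.)

```shell
# Hypothetical helper: derive the archive name the same way the loop above does.
# ${1%/} removes one trailing slash from the argument, if there is one.
archive_name() {
  printf '%s.zip\n' "${1%/}"
}

archive_name "1.1.ome.zarr/"
```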
will-moore commented 11 months ago

Use our s3 for testing (and mkngff?)

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0064
make_bucket: idr0064

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0064 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0064 --cors-configuration file://cors.json
/home/wmoore/mc cp -r idr0064/ uk1s3/idr0064
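
The policy.json and cors.json files are not shown in this thread. As a rough sketch only, a public-read policy and a permissive CORS configuration in the standard S3 JSON formats might look like the following. `write_bucket_configs` is a hypothetical helper, and the file contents are assumptions rather than the files actually used.

```shell
# Hypothetical helper: write example policy.json and cors.json into a directory.
# The contents below are assumptions, NOT the files used in this thread.
write_bucket_configs() {
  outdir=$1
  bucket=$2
  # Allow anonymous read of every object in the bucket.
  cat > "$outdir/policy.json" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::$bucket/*"]
    }
  ]
}
EOF
  # Allow browser-based viewers on any origin to fetch chunks.
  cat > "$outdir/cors.json" <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "HEAD"],
      "AllowedHeaders": ["*"]
    }
  ]
}
EOF
}
```

Something like `write_bucket_configs . idr0064` would then produce the two files passed to `put-bucket-policy` and `put-bucket-cors` above.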

Looks good at https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0064/1.1.ome.zarr. Checked all other plates too 👍

will-moore commented 11 months ago

mkngff...

idr0064.csv

1.1,1.1,3549251
1.2,1.2,3549252
1.3,1.3,3549253
2.1,2.1,3549254
2.2,2.2,3549255
2.3,2.3,3549256
3.1,3.1,3549257
3.2,3.2,3549258
3.3,3.3,3549259

On idr0125-pilot...

$ sudo mkdir /idr0064 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0064 /idr0064
$ ls /idr0064
1.1.ome.zarr  1.2.ome.zarr  1.3.ome.zarr  2.1.ome.zarr  2.2.ome.zarr  2.3.ome.zarr  3.1.ome.zarr  3.2.ome.zarr  3.3.ome.zarr

This took less than a minute...

cd
vi idr0064.csv   # paste the rows listed above
mkdir idr0064
omero login     # idr.openmicroscopy.org
export IDRID=idr0064
for r in $(cat $IDRID.csv); do
  platename=$(echo $r | cut -d',' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff sql $fsid --clientpath="https://uk1s3.embassy.ebi.ac.uk/idr0064/$platename.ome.zarr" "/idr0064/$platename.ome.zarr" > "$IDRID/$fsid.sql"
done
for r in $(cat $IDRID.csv); do
  platename=$(echo $r | cut -d',' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
  omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/idr0064/$platename.zarr" --bfoptions
done
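
The `for r in $(cat ...)` loops above work here because none of the rows contain spaces. A slightly more defensive sketch, assuming the same three-column CSV layout, reads the columns directly with `IFS=,`. `each_fileset` is a hypothetical helper that only prints the fileset ID and plate name for each row; the `omero mkngff` calls would go where the `printf` is.

```shell
# Hypothetical helper: read "name,platename,fileset_id" rows one at a time.
# IFS=, splits each line on commas; the "|| [ -n ... ]" part keeps a final
# line that has no trailing newline.
each_fileset() {
  csv=$1
  while IFS=, read -r name platename fsid || [ -n "$name" ]; do
    [ -n "$platename" ] || continue                  # skip blank lines
    fsid=$(printf '%s' "$fsid" | tr -d '[:space:]')  # strip stray whitespace / CR
    printf '%s %s\n' "$fsid" "$platename"
  done < "$csv"
}
```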

UPDATE 50
BEGIN
 mkngff_fileset 
----------------
        5289217
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.zarr -> /idr0064/1.1.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.zarr.bfoptions
UPDATE 50
BEGIN
 mkngff_fileset 
----------------
        5289218
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/14-46-43.199
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/14-46-43.199_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/14-46-43.199_mkngff/1.2.zarr -> /idr0064/1.2.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/14-46-43.199
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Se....

Failed to view Image...

Caused by: java.io.FileNotFoundException: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.ome.zarr/OME/METADATA.ome.xml (No such file or directory)

Fix symlinks manually...

rm /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.zarr
ln -s /idr0064/1.1.ome.zarr /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.ome.zarr
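
The original symlinks pointed at /idr0064/$platename.zarr, while the mounted bucket contains $platename.ome.zarr, so their targets did not exist. A quick check along these lines might catch that before trying to view an Image. `is_dangling` and `list_dangling` are hypothetical helpers, not part of omero-mkngff.

```shell
# Return 0 if the path is a symlink whose target does not exist.
is_dangling() {
  [ -L "$1" ] && [ ! -e "$1" ]
}

# Print every dangling symlink found directly inside a directory.
list_dangling() {
  for p in "$1"/*; do
    if is_dangling "$p"; then
      printf '%s\n' "$p"
    fi
  done
}
```

Running `list_dangling` on each `..._mkngff` directory should print nothing once the links are correct.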
will-moore commented 11 months ago
message = Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.ome.zarr/OME/METADATA.ome.xml

}

mv /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.zarr.bfoptions /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2020-04/23/14-44-25.357_mkngff/1.1.ome.zarr.bfoptions
will-moore commented 11 months ago

Fix symlinks and bfoptions...

for r in $(cat $IDRID.csv); do
  platename=$(echo $r | cut -d',' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/idr0064/$platename.ome.zarr" --bfoptions
done

Manually viewed an Image from each plate to trigger memo file generation. All viewed OK.

Check pixels...

python check_pixels.py Screen:2351 --max-planes=sizeC > /tmp/check_pix_20231218_idr0064.log

...
447/450 Check Image:9822148 3.3 [Well G13, Field 1]
448/450 Check Image:9822149 3.3 [Well E12, Field 1]
449/450 Check Image:9822150 3.3 [Well C17, Field 1]
End: 2023-12-18 16:10:04.880256

grep Error /tmp/check_pix_20231218_idr0064.log
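
A sketch for summarising the log in one go, assuming the line format shown above ("N/M Check Image:ID ..."). `summarise_check_log` is a hypothetical helper; it simply counts checked images and lines containing "Error".

```shell
# Hypothetical helper: count checked images and error lines in a check_pixels log.
# grep -c exits non-zero when the count is 0, hence the "|| true".
summarise_check_log() {
  log=$1
  checked=$(grep -c 'Check Image:' "$log" || true)
  errors=$(grep -c 'Error' "$log" || true)
  printf 'checked=%s errors=%s\n' "$checked" "$errors"
}
```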
will-moore commented 11 months ago

Upload to BioStudies... from pilot-zarr1-dev...

$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/idr0064/idr0064 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx-xx-xxxx-xx
...
1.1.ome.zarr.zip                              100% 4836MB  295Mb/s    02:43
1.2.ome.zarr.zip                              100% 5170MB  113Mb/s    05:34
1.3.ome.zarr.zip                              100% 4998MB 91.0Mb/s    12:51
2.1.ome.zarr.zip                              100% 4931MB  172Mb/s    20:32
2.2.ome.zarr.zip                              100% 4921MB  259Mb/s    27:46
2.3.ome.zarr.zip                              100% 5188MB  6.7Mb/s    35:12
3.1.ome.zarr.zip                              100% 5008MB  229Mb/s    42:29
3.2.ome.zarr.zip                              100% 4831MB  282Mb/s    49:21
3.3.ome.zarr.zip                              100% 4827MB 55.6Mb/s    57:21

https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0064

will-moore commented 10 months ago

@francesw I wonder if you could create a new BioStudies submission for idr0064 as you've previously done for other studies? This one got missed from the whole process, so we're just catching up now. The idr0064_files.tsv and all the data are in place. Thanks.

francesw commented 10 months ago

Done (currently processing on BioStudies).

francesw commented 10 months ago

S-BIAD992

will-moore commented 10 months ago

Test update of NGFF filesets on idr0125-pilot since we now have data hosted on BIA s3...

pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@fs_suffix'

Need to manually update Fileset IDs...

idr0064/1.1.ome.zarr,S-BIAD992/ec0b496e-2d48-44ed-be4d-0339f8927eef,5289217
idr0064/1.2.ome.zarr,S-BIAD992/352f2cac-020d-494d-8e48-37a8f4e4f3f2,5289218
idr0064/1.3.ome.zarr,S-BIAD992/34948485-1bde-4da6-a4ca-c21e223092fc,5289219
idr0064/2.1.ome.zarr,S-BIAD992/641125ba-978c-488f-be0f-75b9bb09b6f6,5289220
idr0064/2.2.ome.zarr,S-BIAD992/4518e5a4-7804-4918-b42a-678ebd89ba0d,5289221
idr0064/2.3.ome.zarr,S-BIAD992/e40d09f6-152a-47c1-8986-a5f924d608f8,5289222
idr0064/3.1.ome.zarr,S-BIAD992/23d753d0-de0b-4f72-8b99-1fe39605cfc0,5289223
idr0064/3.2.ome.zarr,S-BIAD992/bf6dfdb8-d84c-47f6-a36a-11d1dddfe689,5289224
idr0064/3.3.ome.zarr,S-BIAD992/16d9a6cb-32d3-49a3-8654-9c869baa2394,5289225

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff sql $fsid --fs_suffix=None --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
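
To make the path construction in the loop above explicit: the second CSV column is `S-BIAD992/<uuid>`, and the zarr lives at `<biapath>/<uuid>.zarr` under the bia-integrator-data bucket. A sketch, where `bia_zarr_path` is a hypothetical helper:

```shell
# Hypothetical helper: build the mounted zarr path from "S-BIADxxx/<uuid>".
bia_zarr_path() {
  biapath=$1               # e.g. S-BIAD992/<uuid>
  uuid=${biapath##*/}      # everything after the last slash
  printf '/bia-integrator-data/%s/%s.zarr\n' "$biapath" "$uuid"
}

bia_zarr_path "S-BIAD992/16d9a6cb-32d3-49a3-8654-9c869baa2394"
```

Using `${biapath##*/}` gives the same result as the `cut -d'/' -f2` call in the loop for these two-part paths.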

for i in *; do sed -i 's/SECRETUUID/9630ba1e-ed3a-42e3-9296-59ccf23a7039/g' "$i"; done
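
Before running the scripts through psql it may be worth confirming that no SECRETUUID placeholder is left behind. A sketch, where `files_with_placeholder` is a hypothetical helper:

```shell
# Hypothetical helper: print the names of files that still contain the placeholder.
# grep -l lists matching files; "|| true" keeps the exit status 0 when none match.
files_with_placeholder() {
  placeholder=$1
  shift
  grep -l -- "$placeholder" "$@" || true
}
```

For example, `files_with_placeholder SECRETUUID "$IDRID"/*.sql` should print nothing after the sed step.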

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
  omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --fs_suffix=None --bfoptions
done
...
UPDATE 50
BEGIN
 mkngff_fileset 
----------------
        5289251
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/15-00-09.126_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/15-00-09.126_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/15-00-09.126_mkngff/16d9a6cb-32d3-49a3-8654-9c869baa2394.zarr -> /bia-integrator-data/S-BIAD992/16d9a6cb-32d3-49a3-8654-9c869baa2394/16d9a6cb-32d3-49a3-8654-9c869baa2394.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/15-00-09.126_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-12/2020-04/23/15-00-09.126_mkngff/16d9a6cb-32d3-49a3-8654-9c869baa2394.zarr.bfoptions
will-moore commented 5 months ago

Oops, I hadn't updated https://github.com/IDR/mkngff_upgrade_scripts/blob/main/ngff_filesets/idr0064.csv etc. yet. Doing it now, using idr-testing...

Manually update Fileset IDs here, based on current state of idr-testing:

idr0064.csv

idr0064/1.1.ome.zarr,S-BIAD992/ec0b496e-2d48-44ed-be4d-0339f8927eef,6321088
idr0064/1.2.ome.zarr,S-BIAD992/352f2cac-020d-494d-8e48-37a8f4e4f3f2,6321089
idr0064/1.3.ome.zarr,S-BIAD992/34948485-1bde-4da6-a4ca-c21e223092fc,6321090
idr0064/2.1.ome.zarr,S-BIAD992/641125ba-978c-488f-be0f-75b9bb09b6f6,6321091
idr0064/2.2.ome.zarr,S-BIAD992/4518e5a4-7804-4918-b42a-678ebd89ba0d,6321092
idr0064/2.3.ome.zarr,S-BIAD992/e40d09f6-152a-47c1-8986-a5f924d608f8,6321093
idr0064/3.1.ome.zarr,S-BIAD992/23d753d0-de0b-4f72-8b99-1fe39605cfc0,6321094
idr0064/3.2.ome.zarr,S-BIAD992/bf6dfdb8-d84c-47f6-a36a-11d1dddfe689,6321095
idr0064/3.3.ome.zarr,S-BIAD992/16d9a6cb-32d3-49a3-8654-9c869baa2394,6321096

NB: after some copy/paste errors from the code above, I created symlinks with --fs_suffix=None but hadn't used the same option when generating the SQL scripts, which resulted in invalid Filesets/links.

I then needed to use --fs_suffix=test for both the sql and symlink commands to avoid duplicate OriginalFile entries!

This looks good!

I will delete the SQL scripts since they are invalid. See https://github.com/IDR/mkngff_upgrade_scripts/commit/a15b483495e3dd525629b03cd598978b6c891df5. We simply need to generate the SQL scripts at the time of use (as I did just now), since this doesn't take long.
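
One way to avoid the mismatch described above might be to build both commands from a single suffix variable. This is only a dry-run sketch: `mkngff_commands` is a hypothetical helper that prints the two commands instead of running them, `--clientpath` is omitted for brevity, and the `--fs_suffix` option is the one from the omero-mkngff branch installed earlier in this thread.

```shell
# Hypothetical helper: print (not run) the sql and symlink commands for one
# fileset, so that both always carry the same --fs_suffix value.
mkngff_commands() {
  fsid=$1
  zarr_path=$2
  suffix=$3
  printf 'omero mkngff sql %s --fs_suffix=%s "%s" > "%s.sql"\n' \
    "$fsid" "$suffix" "$zarr_path" "$fsid"
  printf 'omero mkngff symlink /data/OMERO/ManagedRepository %s "%s" --fs_suffix=%s --bfoptions\n' \
    "$fsid" "$zarr_path" "$suffix"
}
```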