IDR / idr-metadata

Curated metadata for all studies published in the Image Data Resource
https://idr.openmicroscopy.org

broken svs on Bio-Formats update #691

Open will-moore opened 6 months ago

will-moore commented 6 months ago

When testing https://github.com/ome/omero-web/pull/536 on idr-testing, I noticed this issue, which I think Seb mentioned was due to a Bio-Formats update (see below).

Deployed on idr-testing from the branch at #537, this returns a different zoomLevelScaling than before!? On https://idr.openmicroscopy.org/webclient/img_detail/9840218/ that image has:

    "tiles": true,
    "tile_size": {
        "width": 240,
        "height": 240
    },
    "levels": 3,
    "zoomLevelScaling": {
        "0": 1.0,
        "1": 0.2499840367792606,
        "2": 0.029308473277568484
    },

But on idr-testing, a different zoomLevelScaling is causing problems:

[Screenshot 2024-02-28 at 13 40 52]
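To make the comparison between deployments concrete, here is a minimal Python sketch. It assumes zoomLevelScaling is the size of each level relative to level 0, so 1/scale is the downsample factor. The `production` values are the ones quoted above; `testing_example` is a HYPOTHETICAL stand-in, since the real idr-testing values are not quoted in this issue.

```python
import math


def downsample_factors(zoom_level_scaling):
    """Map each level to 1/scale (assumed: how much smaller it is than level 0)."""
    return {int(level): 1.0 / scale for level, scale in zoom_level_scaling.items()}


def differing_levels(expected, actual, rel_tol=1e-6):
    """Return levels whose scale differs, or that exist in only one response."""
    changed = []
    for level in set(expected) | set(actual):
        a, b = expected.get(level), actual.get(level)
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            changed.append(level)
    return sorted(changed, key=int)


# Values reported for image 9840218 on production IDR (from the JSON above).
production = {"0": 1.0, "1": 0.2499840367792606, "2": 0.029308473277568484}
# HYPOTHETICAL values standing in for idr-testing (not taken from this issue).
testing_example = {"0": 1.0, "1": 0.2499840367792606}

print(downsample_factors(production))
print(differing_levels(production, testing_example))
```

With the production values, level 1 is roughly 4x smaller than level 0, while level 2 is roughly 34x smaller.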

From slack idr, 28th Feb Seb: "SVS in particular is a format which went through several changes in recent versions"

Will: Ah - so that bug may not be due to my PR?

Seb:
14:30 where that bug == the resolution levels are different between deployments or that bug == the resolution levels on idr-testing are incorrect?
14:32 actually the bug is clearly in production IDR (and has been addressed in recent Bio-Formats versions)
14:33 https://idr.openmicroscopy.org/webclient/img_detail/9840219/?dataset=10450 is absolutely not a label image, it's a resolution level of the whole slide image
14:34 so this dataset will partly break as part of the OMERO.server upgrade and we might need to consider reimporting these files unfortunately

will-moore commented 6 months ago

As discussed in IDR meeting:

will-moore commented 6 months ago

Find 'SVS' format and count images...

idr=> select * from format where value='SVS';
 id  | permissions | value | external_id 
-----+-------------+-------+-------------
 360 |         -52 | SVS   |            
(1 row)

idr=> select count(*) from image where format=360 and series=1;
 count 
-------
   556
(1 row)

Find Datasets...

idr=> select DISTINCT parent from DatasetImageLink where child in (select id from image where format=360 and series=1);
 parent 
--------
  10348
  10349
  10352
  10354
  10356
  10358
  10388
  10413
  10414
  10415
  10425
  10426
  10427
  10428
  10429
  10431
  10436
  10450
  10461
  10474
  10475
  10476
  10478
  10479
  10480
  10481
  10484
  10485
  10492
  10517
  10530
  10550
  10557
  10558
  10559
  10560
  15274
  15280
  16651
  16652
  16653
  16654
(42 rows)
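For anyone wanting to try the two queries above without database access, here is a small sketch using Python's built-in sqlite3 with toy tables. Only the columns used by the queries are modelled, and all IDs in the toy data are made up. The format ID (360) is the one found above.

```python
import sqlite3


def svs_datasets(conn, format_id):
    """Return (image_count, sorted dataset ids) for images of one format with series=1."""
    count = conn.execute(
        "SELECT count(*) FROM image WHERE format = ? AND series = 1", (format_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT DISTINCT parent FROM datasetimagelink WHERE child IN "
        "(SELECT id FROM image WHERE format = ? AND series = 1) ORDER BY parent",
        (format_id,),
    ).fetchall()
    return count, [row[0] for row in rows]


# Toy in-memory tables (NOT the real OMERO schema, just the columns queried).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image (id INTEGER PRIMARY KEY, format INTEGER, series INTEGER);
    CREATE TABLE datasetimagelink (parent INTEGER, child INTEGER);
    INSERT INTO image VALUES (1, 360, 0), (2, 360, 1), (3, 360, 2),
                             (4, 360, 0), (5, 360, 1), (6, 99, 1);
    INSERT INTO datasetimagelink VALUES (10, 1), (10, 2), (10, 3),
                                        (11, 4), (11, 5), (12, 6);
""")
print(svs_datasets(conn, 360))
```

Filtering on series=1 counts one image per file that has at least two series, which is why the count refers to files rather than to every image in each fileset.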
will-moore commented 6 months ago

Found 397 Broken images:

idr0070

Dataset: 10348, 17 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839914|image-9839917|image-9839920|image-9839923|image-9839926|image-9839929|image-9839932|image-9839935|image-9839938|image-9839941|image-9839947|image-9839950|image-9839953|image-9839956|image-9839959|image-9839962|image-9839965

Dataset: 10349, 21 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839970|image-9839973|image-9839976|image-9839979|image-9839982|image-9839985|image-9839988|image-9839991|image-9839994|image-9839997|image-9840000|image-9840005|image-9840017|image-9840020|image-9840023|image-9840026|image-9840029|image-9840032|image-9840035|image-9840038|image-9840041

Dataset: 10352, 18 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9840066|image-9840069|image-9840072|image-9840075|image-9840078|image-9840081|image-9840084|image-9840087|image-9840090|image-9840093|image-9840096|image-9840099|image-9840102|image-9840105|image-9840108|image-9840111|image-9840114|image-9840117

Dataset: 10354, 12 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839873|image-9839876|image-9839879|image-9839882|image-9839885|image-9839888|image-9839891|image-9839894|image-9839897|image-9839900|image-9839903|image-9839906

Dataset: 10356, 15 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839773|image-9839776|image-9839779|image-9839782|image-9839785|image-9839788|image-9839791|image-9839794|image-9839797|image-9839800|image-9839803|image-9839806|image-9839809|image-9839812|image-9839815

Dataset: 10358, 14 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839825|image-9839828|image-9839831|image-9839836|image-9839839|image-9839842|image-9839845|image-9839848|image-9839851|image-9839854|image-9839857|image-9839860|image-9839863|image-9839866

Dataset: 10388, 3 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9838978|image-9838981|image-9838984

Dataset: 10413, 2 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839274|image-9839277

Dataset: 10414, 3 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839283|image-9839286|image-9839289

Dataset: 10415, 1 Image https://idr-testing.openmicroscopy.org/webclient/?show=image-9839294

Dataset: 10425, 1 Image https://idr-testing.openmicroscopy.org/webclient/?show=image-9839412

Dataset: 10426, 2 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839420|image-9839423

Dataset: 10427, 2 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839426|image-9839429

Dataset: 10428, 18 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839432|image-9839435|image-9839438|image-9839441|image-9839444|image-9839447|image-9839450|image-9839453|image-9839456|image-9839459|image-9839462|image-9839465|image-9839468|image-9839471|image-9839474|image-9839477|image-9839480|image-9839483

Dataset: 10429, 10 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839486|image-9839489|image-9839492|image-9839495|image-9839498|image-9839501|image-9839504|image-9839507|image-9839510|image-9839513

Dataset: 10431, 24 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839567|image-9839570|image-9839573|image-9839576|image-9839579|image-9839582|image-9839585|image-9839588|image-9839591|image-9839594|image-9839597|image-9839600|image-9839603|image-9839606|image-9839609|image-9839612|image-9839615|image-9839618|image-9839621|image-9839624|image-9839628|image-9839633|image-9839637|image-9839640

Dataset: 10436, 8 Images https://idr-testing.openmicroscopy.org/webclient/?show=image-9839671|image-9839674|image-9839677|image-9839680|image-9839683|image-9839686|image-9839689|image-9839692

https://idr-testing.openmicroscopy.org/webclient/?show=image-9840219

https://idr-testing.openmicroscopy.org/webclient/?show=image-9840355|image-9840358|image-9840361

https://idr-testing.openmicroscopy.org/webclient/?show=image-9840478|image-9840481|image-9840484|image-9840487|image-9840490|image-9840493|image-9840496|image-9840499|image-9840502|image-9840505|image-9840508|image-9840511|image-9840514|image-9840517|image-9840523|image-9840526|image-9840536|image-9840539|image-9840542|image-9840545

https://idr-testing.openmicroscopy.org/webclient/?show=image-9840550|image-9840553|image-9840556|image-9840559|image-9840562|image-9840565|image-9840568|image-9840571

https://idr-testing.openmicroscopy.org/webclient/?show=image-9840579|image-9840582|image-9840585|image-9840588

10478 https://idr-testing.openmicroscopy.org/webclient/?show=image-9840625|image-9840628|image-9840631|image-9840634|image-9840637|image-9840640|image-9840643|image-9840646|image-9840649|image-9840652|image-9840655|image-9840658|image-9840661|image-9840664|image-9840667|image-9840670|image-9840673|image-9840676|image-9840679|image-9840683|image-9840685|image-9840688|image-9840691|image-9840694|image-9840697|image-9840700|image-9840703|image-9840706|image-9840709|image-9840712|image-9840715|image-9840718|image-9840721|image-9840724|image-9840727|image-9840730|image-9840733|image-9840736

10479 https://idr-testing.openmicroscopy.org/webclient/?show=image-9840740|image-9840743|image-9840746|image-9840749|image-9840752|image-9840755|image-9840758|image-9840761|image-9840764|image-9840767|image-9840770|image-9840773|image-9840776|image-9840779|image-9840782|image-9840785|image-9840788|image-9840791|image-9840794|image-9840797|image-9840800|image-9840803|image-9840806|image-9840809|image-9840812|image-9840815|image-9840818|image-9840821|image-9840824|image-9840827|image-9840830|image-9840833

10480 https://idr-testing.openmicroscopy.org/webclient/?show=image-9840845|image-9840848|image-9840851|image-9840854|image-9840857|image-9840860|image-9840863|image-9840866|image-9840869|image-9840872|image-9840875|image-9840878|image-9840881|image-9840884|image-9840887|image-9840890|image-9840893|image-9840896|image-9840899|image-9840902|image-9840905

10481 https://idr-testing.openmicroscopy.org/webclient/?show=image-9840909|image-9840912|image-9840915|image-9840918|image-9840921|image-9840924|image-9840927|image-9840930|image-9840933|image-9840936|image-9840939|image-9840942|image-9840945|image-9840948|image-9840951|image-9840954|image-9840957|image-9840960|image-9840963|image-9840966|image-9840969|image-9840972|image-9840975|image-9840978|image-9840981|image-9840984|image-9840987|image-9840990|image-9840993|image-9840996|image-9840999|image-9841002|image-9841005|image-9841008|image-9841011|image-9841014|image-9841017

10484 https://idr-testing.openmicroscopy.org/webclient/?show=image-9841070|image-9841073|image-9841076|image-9841079|image-9841082|image-9841085|image-9841088|image-9841091|image-9841094|image-9841169|image-9841172|image-9841175

10485 https://idr-testing.openmicroscopy.org/webclient/?show=image-9841097|image-9841100|image-9841103|image-9841106|image-9841109|image-9841112|image-9841115|image-9841118|image-9841121|image-9841124|image-9841127|image-9841130|image-9841133|image-9841136|image-9841139|image-9841142|image-9841145|image-9841148|image-9841151|image-9841154|image-9841157|image-9841163|image-9841166

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841222|image-9841225|image-9841228

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841448|image-9841451|image-9841454

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841528

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841712|image-9841715|image-9841718|image-9841721|image-9841724|image-9841727

10557 - None!

10558 https://idr-testing.openmicroscopy.org/webclient/?show=image-9841778|image-9841781|image-9841784|image-9841787

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841791|image-9841794|image-9841797|image-9841800|image-9841803|image-9841806|image-9841809|image-9841812

https://idr-testing.openmicroscopy.org/webclient/?show=image-9841815|image-9841818

idr0114

15274, 15280 - No svs!

idr0135

16651, 16652, 16653, 16654 - All labels OK
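The webclient links above all follow the same `?show=image-ID|image-ID` pattern, so they can be generated from a list of Image IDs. A minimal sketch (the function name `show_url` is just for illustration):

```python
def show_url(base_url, image_ids):
    """Build a webclient URL that selects several images at once."""
    selection = "|".join(f"image-{image_id}" for image_id in image_ids)
    return f"{base_url.rstrip('/')}/webclient/?show={selection}"


# The three images listed for Dataset 10388 above.
print(show_url("https://idr-testing.openmicroscopy.org", [9838978, 9838981, 9838984]))
```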

will-moore commented 6 months ago

All the svs above are from idr0070. The file https://github.com/IDR/idr0070-kerwin-hdbr/blob/master/experimentA/idr0070-experimentA-filePaths.tsv lists 400 svs images, so let's just re-import all of them.

Steps:

will-moore commented 6 months ago

To focus on idr0070, max Image ID is 9841818, so we can count etc...

$ psql -U omero -d idr -h $DBHOST -F "," -c "select count(id) from image where format=360 and series=1 and (id < 9841819)"
 count 
-------
   400
(1 row)

400 svs Image IDs in idr0070


9838978,9838981,9838984,9839274,9839277,9839283,9839286,9839289,9839293,9839412,9839420,9839423,9839426,9839429,9839432,9839435,9839438,9839441,9839444,9839447,9839450,9839453,9839456,9839459,9839462,9839465,9839468,9839471,9839474,9839477,9839480,9839483,9839486,9839489,9839492,9839495,9839498,9839501,9839504,9839507,9839510,9839513,9839567,9839570,9839573,9839576,9839579,9839582,9839585,9839588,9839591,9839594,9839597,9839600,9839603,9839606,9839609,9839612,9839615,9839618,9839621,9839624,9839628,9839633,9839637,9839640,9839671,9839674,9839677,9839680,9839683,9839686,9839689,9839692,9839773,9839776,9839779,9839782,9839785,9839788,9839791,9839794,9839797,9839800,9839803,9839806,9839809,9839812,9839815,9839825,9839828,9839831,9839836,9839839,9839842,9839845,9839848,9839851,9839854,9839857,9839860,9839863,9839866,9839873,9839876,9839879,9839882,9839885,9839888,9839891,9839894,9839897,9839900,9839903,9839906,9839914,9839917,9839920,9839923,9839926,9839929,9839932,9839935,9839938,9839941,9839947,9839950,9839953,9839956,9839959,9839962,9839965,9839970,9839973,9839976,9839979,9839982,9839985,9839988,9839991,9839994,9839997,9840000,9840005,9840017,9840020,9840023,9840026,9840029,9840032,9840035,9840038,9840041,9840066,9840069,9840072,9840075,9840078,9840081,9840084,9840087,9840090,9840093,9840096,9840099,9840102,9840105,9840108,9840111,9840114,9840117,9840219,9840355,9840358,9840361,9840478,9840481,9840484,9840487,9840490,9840493,9840496,9840499,9840502,9840505,9840508,9840511,9840514,9840517,9840523,9840526,9840536,9840539,9840542,9840545,9840550,9840553,9840556,9840559,9840562,9840565,9840568,9840571,9840574,9840579,9840582,9840585,9840588,9840625,9840628,9840631,9840634,9840637,9840640,9840643,9840646,9840649,9840652,9840655,9840658,9840661,9840664,9840667,9840670,9840673,9840676,9840679,9840682,9840685,9840688,9840691,9840694,9840697,9840700,9840703,9840706,9840709,9840712,9840715,9840718,9840721,9840724,9840727,9840730,9840733,9840736,9840740,9840743,9840746,9840749,
9840752,9840755,9840758,9840761,9840764,9840767,9840770,9840773,9840776,9840779,9840782,9840785,9840788,9840791,9840794,9840797,9840800,9840803,9840806,9840809,9840812,9840815,9840818,9840821,9840824,9840827,9840830,9840833,9840845,9840848,9840851,9840854,9840857,9840860,9840863,9840866,9840869,9840872,9840875,9840878,9840881,9840884,9840887,9840890,9840893,9840896,9840899,9840902,9840905,9840909,9840912,9840915,9840918,9840921,9840924,9840927,9840930,9840933,9840936,9840939,9840942,9840945,9840948,9840951,9840954,9840957,9840960,9840963,9840966,9840969,9840972,9840975,9840978,9840981,9840984,9840987,9840990,9840993,9840996,9840999,9841002,9841005,9841008,9841011,9841014,9841017,9841070,9841073,9841076,9841079,9841082,9841085,9841088,9841091,9841094,9841097,9841100,9841103,9841106,9841109,9841112,9841115,9841118,9841121,9841124,9841127,9841130,9841133,9841136,9841139,9841142,9841145,9841148,9841151,9841154,9841157,9841163,9841166,9841169,9841172,9841175,9841222,9841225,9841228,9841448,9841451,9841454,9841528,9841712,9841715,9841718,9841721,9841724,9841727,9841772,9841775,9841778,9841781,9841784,9841787,9841791,9841794,9841797,9841800,9841803,9841806,9841809,9841812,9841815,9841818
will-moore commented 6 months ago

A test delete of even a single Image fails (hangs) because of annotations. First remove all annotations...

$  omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg experimentA/idr0070-experimentA-bulkmap-config.yml Project:1104

$ omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg experimentA/idr0070-experimentA-bulkmap-config.yml Project:1104

$ python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Halted
INFO:root:Found 0 orphaned Organism maps
INFO:root:Found 74 orphaned Antibody maps
INFO:root:Deleting 74 maps
INFO:root:Found 2 orphaned Gene maps
INFO:root:Deleting 2 maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 0 orphaned Phenotype maps
INFO:root:Found 0 orphaned siRNA maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 0 orphaned Notebook maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps
INFO:omero.util.Resources:Halted

Can't delete using a single Image:ID:

$ omero delete --dry-run --report Image:9838978
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Image:9838978 failed: 'graph-fail'
failed: within Fileset[3550638] may not delete Image[9838978] while Image[9838979] remains
Steps: 4
Elapsed time: 0.844 secs.
Flags: [FAILURE, CANCELLED]

Need to use Fileset ID:

$ omero delete --dry-run --report Fileset:3550638
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Fileset:3550638 Dry run performed
ok
Steps: 4
Elapsed time: 5.382 secs.
Flags: []
Deleted objects
  Instrument:93241
  Objective:90855
  ObjectiveSettings:91280-91282
  CommentAnnotation:25123922
  FilesetAnnotationLink:3550438
  DatasetImageLink:4540977-4540979
  Channel:26299350-26299358
  Image:9838977-9838979
  LogicalChannel:10609300-10609308
  OriginalFile:29671051,29671052
  Pixels:9838977-9838979
  ChannelBinding:25422660-25422668
  QuantumDef:9523332-9523334
  RenderingDef:9523332-9523334
  Thumbnail:11390582-11390584
  Fileset:3550638
  FilesetEntry:22140458
  FilesetJobLink:14306936-14306940
  IndexingJob:14317104
  JobOriginalFileLink:3568052
  MetadataImportJob:14317101
  PixelDataJob:14317102
  ThumbnailGenerationJob:14317103
  UploadJob:14317100
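The dry-run report above lists objects as `Type:ID`, `Type:ID-ID` ranges or comma-separated IDs. A rough sketch for tallying objects per type from such a report, assuming that format holds for every entry (the regular expression is my own guess at the layout, not part of OMERO):

```python
import re
from collections import Counter

# One report entry, e.g. "Image:9838977-9838979" or "OriginalFile:29671051,29671052".
ENTRY = re.compile(r"(\w+):(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)")


def count_deleted(report_lines):
    """Count deleted objects per type, expanding ID ranges."""
    counts = Counter()
    for line in report_lines:
        match = ENTRY.fullmatch(line.strip())
        if not match:
            continue  # skip session banners, "Steps: 4", "Flags: []", etc.
        obj_type, spec = match.groups()
        for part in spec.split(","):
            first, _, last = part.partition("-")
            counts[obj_type] += (int(last) - int(first) + 1) if last else 1
    return counts


# A few lines copied from the report above.
sample = [
    "omero.cmd.Delete2 Fileset:3550638 Dry run performed",
    "Steps: 4",
    "  Channel:26299350-26299358",
    "  Image:9838977-9838979",
    "  OriginalFile:29671051,29671052",
    "  Fileset:3550638",
]
print(count_deleted(sample))
```

For this Fileset the tally gives 3 Images, which matches the graph-fail message: Image 9838978 cannot be deleted on its own while the other Images of the same Fileset remain.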

Max Fileset ID for idr0070 is 3552155

psql -U omero -d idr -h $DBHOST -c "select count(fileset) from image where format=360 and series=1 and (fileset < 3552156)"
 count 
-------
   400
(1 row)

Get Fileset IDs....

psql -U omero -d idr -h $DBHOST -c "select fileset from image where format=360 and series=1 and (fileset < 3552156)" >> idr0070_svs_filesetIds.txt

Need to remove whitespace from each line:

$ for r in $(cat idr0070_svs_filesetIds.txt); do echo $r | tr -d '[:space:]' >> idr0070_svs_filesetIds.csv && echo "" >> idr0070_svs_filesetIds.csv; done
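The psql output redirected above still contains the column header, the separator line and the `(N rows)` footer, which is why non-ID rows have to be removed later. A minimal Python sketch that keeps only the numeric IDs (the second Fileset ID in the sample is just another ID mentioned in this issue, used for illustration):

```python
def extract_ids(psql_output):
    """Keep only lines that are plain integers; drop header, separator and footer."""
    ids = []
    for line in psql_output.splitlines():
        token = line.strip()
        if token.isdigit():
            ids.append(int(token))
    return ids


# Illustrative output in psql's default aligned format.
sample = """ fileset
---------
 3550638
 3551354
(2 rows)
"""
print(extract_ids(sample))
```

Alternatively, psql's `-A -t` options (unaligned, tuples-only output) should avoid the header and footer in the first place.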

Do the delete....

for r in $(cat idr0070_svs_filesetIds.csv); do omero delete --report Fileset:$r; done
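A sketch of the same loop in Python, which also records which Filesets failed to delete. It uses only the `omero delete` flags shown above (`--report`, `--dry-run`). That a failed delete returns a non-zero exit code is an ASSUMPTION here, and `runner` is a hypothetical hook so the logic can be tried without a server.

```python
import subprocess


def delete_command(fileset_id, dry_run=True):
    """Build the argv for deleting one Fileset (dry run by default)."""
    cmd = ["omero", "delete", "--report"]
    if dry_run:
        cmd.append("--dry-run")
    cmd.append(f"Fileset:{fileset_id}")
    return cmd


def delete_filesets(fileset_ids, dry_run=True, runner=subprocess.run):
    """Run the delete for each Fileset and return the IDs that failed."""
    failed = []
    for fileset_id in fileset_ids:
        result = runner(delete_command(fileset_id, dry_run))
        # ASSUMPTION: the CLI signals failure with a non-zero return code.
        if result.returncode != 0:
            failed.append(fileset_id)
    return failed


print(delete_command(3550638))
```

Calling `delete_filesets(ids, dry_run=False)` would perform the real deletes, so it is worth checking the dry run first.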
will-moore commented 6 months ago

Now re-import....

[wmoore@test120-omeroreadwrite ~]$ cd /uod/idr/metadata/idr0070-kerwin-hdbr/experimentA/

sudo cat idr0070-experimentA-filePaths.tsv | grep svs >> /tmp/idr0070-experimentA-filePaths_svs.tsv
sudo mv /tmp/idr0070-experimentA-filePaths_svs.tsv ./
# update bulk.yml to point at the ...svs.tsv
sudo vi idr0070-experimentA-bulk.yml

wc idr0070-experimentA-filePaths_svs.tsv
  402  1400 95150 idr0070-experimentA-filePaths_svs.tsv

As omero-server...

omero import --bulk experimentA/idr0070-experimentA-bulk.yml --file /tmp/idr0070svs.log  --errs /tmp/idr0070svs.err

On completion... 2 svs files were not imported because the originals weren't deleted...

$ grep ClientPathExclusion /tmp/idr0070svs.err
2024-03-27 16:44:56,417 6461       [2-thread-1] INFO   .importer.exclusions.ClientPathExclusion - ClientPath match for filename: uod/idr/filesets/idr0070-kerwin-hdbr/20200414-Batch3-ftp/HDBR_SYP_IHC/1605,2,Rt hemi,10pcw,24_SYP.svs
2024-03-27 16:46:04,488 6406       [2-thread-1] INFO   .importer.exclusions.ClientPathExclusion - ClientPath match for filename: uod/idr/filesets/idr0070-kerwin-hdbr/20200422-Batch5/HDBR_MKI67_IHC/1605,2,Rt hemi,10pcw,24_KI67.svs

All files were imported as either 2 images (or 1 image); none produced 3 images as before:

[wmoore@test120-omeroreadwrite ~]$ grep "1 file uploaded" /tmp/idr0070svs.err | wc
    400    5200   31598
[wmoore@test120-omeroreadwrite ~]$ grep "2 images imported" /tmp/idr0070svs.err | wc
    391    5083   30889
[wmoore@test120-omeroreadwrite ~]$ grep "1 image imported" /tmp/idr0070svs.err | wc
      2      26     156
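The three greps above can be combined into one tally of images per imported file. A minimal sketch, assuming each summary line contains the phrase "N image(s) imported" as the grep patterns suggest. The sample lines are ILLUSTRATIVE, not copied from the real log.

```python
import re
from collections import Counter

IMPORTED = re.compile(r"(\d+) images? imported")


def images_per_file(log_lines):
    """Count how many import summaries reported 1, 2, 3... images."""
    counts = Counter()
    for line in log_lines:
        match = IMPORTED.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# ILLUSTRATIVE summary lines (the exact wording in the real log may differ).
sample = [
    "1 file uploaded, 1 fileset created, 2 images imported, 0 errors in 0:00:10.000",
    "1 file uploaded, 1 fileset created, 2 images imported, 0 errors in 0:00:11.000",
    "1 file uploaded, 1 fileset created, 1 image imported, 0 errors in 0:00:09.000",
    "some unrelated INFO line",
]
print(images_per_file(sample))
```

Any key of 3 in the result would indicate a file that still imports with the extra image.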
will-moore commented 5 months ago

Re-annotate....

Remove label rows from csv on idr-testing...

[wmoore@test120-omeroreadwrite experimentA]$ sudo cat idr0070-experimentA-annotation.csv | grep -v label > /tmp/idr0070-experimentA-annotation2.csv
[wmoore@test120-omeroreadwrite experimentA]$ sudo rm idr0070-experimentA-annotation.csv
[wmoore@test120-omeroreadwrite experimentA]$ sudo mv /tmp/idr0070-experimentA-annotation2.csv idr0070-experimentA-annotation.csv
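Note that `grep -v label` drops any row containing "label" anywhere in the line. A more targeted sketch in Python, which only drops rows whose image name contains `[label image]`. The column position (second column) and the header names in the sample are ASSUMPTIONS based on the rows shown in this issue.

```python
import csv
import io


def drop_label_rows(csv_text, name_column=1, marker="[label image]"):
    """Return csv_text without rows whose image-name column contains the marker."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if len(row) > name_column and marker in row[name_column]:
            continue
        writer.writerow(row)
    return out.getvalue()


# Header and the last row are HYPOTHETICAL; the two image names come from this issue.
sample = "\n".join([
    "Dataset Name,Image Name,Comment",
    'GAP43-10PCW,"HDBR_GAP43_IHC_10PCW/1634,6,Sl 2,10pcw,37_GAP43.svs",',
    'GAP43-12PCW,"HDBR_GAP43_IHC_12PCW/11761,12,Sl 3,12pcw,140_GAP43.svs [label image]",',
    'EXAMPLE,"example.svs",this comment mentions a label',
]) + "\n"
print(drop_label_rows(sample))
```

The csv module is used because the image names themselves contain commas, so splitting lines on "," would not be safe.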

Reannotate as normal... check_annotations....

/opt/omero/server/venv3/bin/python /uod/idr/metadata/idr-utils/scripts/annotate/check_annotations.py Project:1104 idr0070-experimentA-annotation.csv --output /tmp/errors.csv

GAP43-10PCW,"HDBR_GAP43_IHC_10PCW/1634,6,Sl 2,10pcw,37_GAP43.svs",,,,,,,,,,,,,,,,,,,,Missing annotation
GAP43-12PCW,"HDBR_GAP43_IHC_12PCW/11761,12,Sl 3,12pcw,140_GAP43.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
GAP43-CS18,"HDBR_GAP43_IHC_CS18/1262,2,Embryo_Placenta,CS18,165_GAP43.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
MKI67-10PCW,"HDBR_MKI67_IHC/1634,6,Sl 2,10pcw,37_KI67.svs",,,,,,,,,,,,,,,,,,,,Missing annotation
NKX2-2-CS19,"HDBR_NKX2-2_IHC_CS19/352,1,Head,CS19,284_NKX2-2.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
PAX6-12PCW,"HDBR_PAX6_IHC_hires_12PCW/11610,9,brain,12pcw,114_PAX6.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
PAX6-CS19,"HDBR_PAX6_IHC_hires_CS19/352,1,Head,CS19,284_PAX6.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
WNT8B-12PCW,"HDBR_WNT8B_IHC_12PCW/11610,9,brain,12pcw,110_2014-05-20 14_35_30_wnt8b.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
WNT8B-12PCW,"HDBR_WNT8B_IHC_12PCW/11610,9,brain,12pcw,160_2014-05-20 14_17_11_wnt8b.svs [label image]",,,,,,,,,,,,,,,,,,,,Missing annotation
will-moore commented 5 months ago

Missing annotation errors from above are fixed in https://github.com/IDR/idr0070-kerwin-hdbr/pull/2/commits/4b08e7f99036f34cdd02ddb1f8bcf7ba6c885c3e as follows:

Checked all 7 of the [label image] images above and they are all working, so these are genuine label images and should be added back to the annotations.csv. The other 2 Missing annotation errors come from 2 files that now import as a single image (no [...] suffix in the name), so these were renamed in the annotations.csv.

will-moore commented 5 months ago

Fixed - test:

(venv3) bash-4.2$ /opt/omero/server/venv3/bin/python /uod/idr/metadata/idr-utils/scripts/annotate/check_annotations.py Project:1104 idr0070-experimentA-annotation.csv --output /tmp/idr0070_errors.csv
All images are unique and have annotations.

Deleted previous OMERO.table:

(venv3) bash-4.2$ omero delete --report Annotation:25135328
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Annotation:25135328 ok
Steps: 6
Elapsed time: 0.248 secs.
Flags: []
Deleted objects
  FileAnnotation:25135328
  ProjectAnnotationLink:1667
  OriginalFile:29676048

Create OMERO.table

/opt/omero/server/OMERO.server/bin/omero metadata populate --report --batch 1000 --file idr0070-experimentA-annotation.csv Project:1104

Create Map annotations

/opt/omero/server/OMERO.server/bin/omero metadata populate --context bulkmap --batch 100 --cfg idr0070-experimentA-bulkmap-config.yml Project:1104
will-moore commented 5 months ago

The 3 Filesets in https://idr-testing.openmicroscopy.org/webclient/?show=dataset-10461 are still broken. These weren't deleted & reimported for some reason...?

Fileset IDs are: 3551354, 3551355, 3551356

These are all found in idr0070_svs_filesetIds.csv above. So delete was attempted above. Try again:

(venv3) bash-4.2$ omero delete Fileset:3551354 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Fileset:3551354 failed: 'graph-fail'
failed: cannot read ome.model.fs.Fileset[3551354]
Steps: 6
Elapsed time: 0.032 secs.
Flags: [FAILURE, CANCELLED]

Also, none of these Filesets or Image IDs can be found in the DB via psql. Using the Dataset ID, can see it contains different Images:

idr=> select * from DatasetImageLink where parent=10461;
   id    | permissions | version |  child   | creation_id | external_id | group_id | owner_id | update_id | parent 
---------+-------------+---------+----------+-------------+-------------+----------+----------+-----------+--------
 7310578 |         -56 |       0 | 15153328 |   389349671 |             |        3 |        2 | 389349671 |  10461
 7310579 |         -56 |       0 | 15153329 |   389349671 |             |        3 |        2 | 389349671 |  10461
 7310580 |         -56 |       0 | 15153330 |   389349700 |             |        3 |        2 | 389349700 |  10461
 7310581 |         -56 |       0 | 15153331 |   389349700 |             |        3 |        2 | 389349700 |  10461
 7310582 |         -56 |       0 | 15153332 |   389349731 |             |        3 |        2 | 389349731 |  10461
 7310583 |         -56 |       0 | 15153333 |   389349731 |             |        3 |        2 | 389349731 |  10461
(6 rows)

It turns out that the images, thumbnails, right panel etc. shown in the webclient above were all cached in the browser or nginx; the Filesets themselves have all been deleted and replaced successfully!

will-moore commented 5 months ago

cc @francesw or @dominikl

I wonder if you could review the actions above that I have performed on idr-testing (starting at https://github.com/IDR/idr-metadata/issues/691#issuecomment-2023174883 above) and plan to apply to idr-next for the NGFF release?

Summary:

You can see an example of the changes by looking at e.g. https://idr.openmicroscopy.org/webclient/?show=dataset-10461 (and the equivalent Dataset on idr-testing)

will-moore commented 5 months ago

Final test on pilot-idrngff...

$ cd /uod/idr/metadata/idr0070-kerwin-hdbr/
$  omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg experimentA/idr0070-experimentA-bulkmap-config.yml Project:1104

ConnectionTimeout! omero login -> internal server error

Restarted server and tried command above again... Success... Then...

$ omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg experimentA/idr0070-experimentA-bulkmap-config.yml Project:1104

$ python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
...
INFO:root:Found 1 orphaned Organism maps
INFO:root:Deleting 1 maps
INFO:root:Found 74 orphaned Antibody maps
INFO:root:Deleting 74 maps
INFO:root:Found 2 orphaned Gene maps
INFO:root:Deleting 2 maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 0 orphaned Phenotype maps
INFO:root:Found 0 orphaned siRNA maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 0 orphaned Notebook maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps

psql -U omero -d idr -h $DBHOST -c "select fileset from image where format=360 and series=1 and (fileset < 3552156)" >> idr0070_svs_filesetIds.txt

$ for r in $(cat idr0070_svs_filesetIds.txt); do echo $r | tr -d '[:space:]' >> idr0070_svs_filesetIds.csv && echo "" >> idr0070_svs_filesetIds.csv; done

# remove first and last (non-ID) rows
vi idr0070_svs_filesetIds.csv

(venv3) bash-5.1$ wc !$
wc idr0070_svs_filesetIds.csv
 400  400 3200 idr0070_svs_filesetIds.csv

omero login

# start 11:09...
for r in $(cat idr0070_svs_filesetIds.csv); do echo $r && omero delete Fileset:$r; done

#... approx 20 mins

As wmoore, update filePaths.tsv and annotations.csv...

[wmoore@test120-omeroreadwrite ~]$ cd /uod/idr/metadata/idr0070-kerwin-hdbr/experimentA/

sudo cat idr0070-experimentA-filePaths.tsv | grep svs >> /tmp/idr0070-experimentA-filePaths_svs.tsv
sudo mv /tmp/idr0070-experimentA-filePaths_svs.tsv ./
# update bulk.yml to point at the ...svs.tsv
sudo vi idr0070-experimentA-bulk.yml

wc idr0070-experimentA-filePaths_svs.tsv
  402  1400 95150 idr0070-experimentA-filePaths_svs.tsv

$ sudo -Es git remote add will https://github.com/will-moore/idr0070-kerwin-hdbr
$ sudo -Es git fetch will
$ sudo -Es git checkout will/label_images_removal

As omero-server, re-import...

11:32....

omero import --bulk experimentA/idr0070-experimentA-bulk.yml --file /tmp/idr0070svs.log  --errs /tmp/idr0070svs.err
dominikl commented 5 months ago

👍 Looks good to me! (Edit: on idr-testing)

will-moore commented 5 months ago

Import on pilot-idrngff above is failing with...

2024-04-25 10:33:34,114 4435       [      main] INFO          ome.formats.importer.ImportConfig - OMERO.blitz Version: 5.7.1
2024-04-25 10:33:34,288 4609       [      main] INFO          ome.formats.importer.ImportConfig - Bioformats version: 7.0.0 revision: 3f8b3326cb578d59bd948fb84c838ff77e9f1b08 date: 1 August 2023
2024-04-25 10:33:34,693 5014       [      main] INFO   formats.importer.cli.CommandLineImporter - Setting checksum algorithm to File-Size-64
2024-04-25 10:33:34,694 5015       [      main] INFO   formats.importer.cli.CommandLineImporter - Skipping minimum/maximum computation
2024-04-25 10:33:34,694 5015       [      main] INFO   formats.importer.cli.CommandLineImporter - Setting transfer to ln_s
2024-04-25 10:33:34,697 5018       [      main] INFO   formats.importer.cli.CommandLineImporter - Adding exclusion: clientpath
2024-04-25 10:33:35,087 5408       [      main] INFO   formats.importer.cli.CommandLineImporter - Setting parallel upload: 8
2024-04-25 10:33:35,088 5409       [      main] INFO   formats.importer.cli.CommandLineImporter - Log levels -- Bio-Formats: ERROR OMERO.importer: INFO
2024-04-25 10:33:42,061 12382      [      main] INFO      ome.formats.importer.ImportCandidates - Depth: 4 Metadata Level: MINIMUM
2024-04-25 10:33:43,438 13759      [      main] ERROR     ome.formats.importer.cli.ErrorHandler - FILE_EXCEPTION: /uod/idr/filesets/idr0070-kerwin-hdbr/20200414-Batch3-ftp/HDBR_SYP_IHC/11610,9,brain,12pcw,97_SYP.svs
java.lang.Exception: java.lang.NoSuchMethodError: 'java.lang.Object loci.formats.CoreMetadataList.remove(int, int)'
        at ome.formats.importer.ImportCandidates.singleFile(ImportCandidates.java:469)
        at ome.formats.importer.ImportCandidates.handleFile(ImportCandidates.java:576)
        at ome.formats.importer.ImportCandidates.execute(ImportCandidates.java:384)
        at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:222)
        at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:174)
        at ome.formats.importer.cli.CommandLineImporter.<init>(CommandLineImporter.java:148)
        at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:1021)
Caused by: java.lang.NoSuchMethodError: 'java.lang.Object loci.formats.CoreMetadataList.remove(int, int)'
        at loci.formats.in.SVSReader.initStandardMetadata(SVSReader.java:652)
        at loci.formats.in.BaseTiffReader.initMetadata(BaseTiffReader.java:98)
        at loci.formats.in.BaseTiffReader.initFile(BaseTiffReader.java:610)
        at loci.formats.FormatReader.setId(FormatReader.java:1466)
        at loci.formats.ImageReader.setId(ImageReader.java:863)
        at ome.formats.importer.OMEROWrapper$4.setId(OMEROWrapper.java:167)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.ChannelFiller.setId(ChannelFiller.java:234)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:293)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.Memoizer.setId(Memoizer.java:698)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at ome.formats.importer.ImportCandidates.singleFile(ImportCandidates.java:427)
        ... 6 common frames omitted
2024-04-25 10:33:43,440 13761      [      main] INFO      ome.formats.importer.ImportCandidates - 1 file(s) parsed into 0 group(s) with 1 call(s) to setId in 1336ms. (1380ms total) [0 unknowns]
2024-04-25 10:33:47,596 17917      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Attempting initial SSL connection to localhost:4064
2024-04-25 10:33:55,775 26096      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Insecure connection requested, falling back
2024-04-25 10:34:00,118 30439      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Pinging session every 300s.
2024-04-25 10:34:00,135 30456      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Server: 5.6.9
2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Client: 5.7.1
2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Java Version: 11.0.22
2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Name: Linux
2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Arch: amd64
2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Version: 5.14.0-362.24.1.el9_3.0.1.x86_64
No imports due to errors!
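A `NoSuchMethodError` generally means code compiled against one version of a class is running against a different version at runtime. The log above reports a 5.7.1 client against a 5.6.9 server, so a quick version check may help when triaging. This sketch only extracts the versions from log lines like the ones above; whether the client/server difference was the actual cause is not established in this thread.

```python
import re

# Patterns based on the log lines quoted above.
PATTERNS = {
    "server": re.compile(r"- Server: (\S+)"),
    "client": re.compile(r"- Client: (\S+)"),
    "bioformats": re.compile(r"Bioformats version: (\S+)"),
}


def log_versions(log_lines):
    """Return the first server, client and Bio-Formats versions found in the log."""
    found = {}
    for line in log_lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and name not in found:
                found[name] = match.group(1)
    return found


# Lines copied from the import log above.
sample = [
    "2024-04-25 10:33:34,288 4609       [      main] INFO          ome.formats.importer.ImportConfig - Bioformats version: 7.0.0 revision: 3f8b3326cb578d59bd948fb84c838ff77e9f1b08 date: 1 August 2023",
    "2024-04-25 10:34:00,135 30456      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Server: 5.6.9",
    "2024-04-25 10:34:00,136 30457      [      main] INFO       ome.formats.OMEROMetadataStoreClient - Client: 5.7.1",
]
versions = log_versions(sample)
print(versions, "mismatch" if versions.get("server") != versions.get("client") else "match")
```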
will-moore commented 5 months ago

Try the import again... after updating the jars

omero import --bulk experimentA/idr0070-experimentA-bulk.yml --file /tmp/idr0070svs_2.log  --errs /tmp/idr0070svs_2.err

Seems to be working!!

will-moore commented 5 months ago

Import complete... Check out the updated annotations.csv from https://github.com/IDR/idr0070-kerwin-hdbr/pull/2

(base) [wmoore@pilot-idrngff-omeroreadwrite idr0070-kerwin-hdbr]$ sudo -Es git checkout will/label_images_removal
M   experimentA/idr0070-experimentA-bulk.yml
HEAD is now at 4b08e7f Fix annotations.csv for [label image] images

As omero-server...


(venv3) bash-5.1$ /opt/omero/server/venv3/bin/python /uod/idr/metadata/idr-utils/scripts/annotate/check_annotations.py Project:1104 idr0070-experimentA-annotation.csv --output /tmp/idr0070_errors.csv
All images are unique and have annotations.

$ omero delete --report Annotation:25135328
Previous session expired for demo on localhost:4064
Server: [localhost:4064]
Username: [demo]
Password:
Created session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Annotation:25135328 ok
Steps: 6
Elapsed time: 0.492 secs.
Flags: []
Deleted objects
  FileAnnotation:25135328
  ProjectAnnotationLink:1667
  OriginalFile:29676048

/opt/omero/server/OMERO.server/bin/omero metadata populate --report --batch 1000 --file idr0070-experimentA-annotation.csv Project:1104

/opt/omero/server/OMERO.server/bin/omero metadata populate --context bulkmap --batch 100 --cfg idr0070-experimentA-bulkmap-config.yml Project:1104