will-moore opened this issue 1 year ago
Fails with:
2023-02-27 11:54:05,356 [main] WARN loci.formats.FormatHandler - Ignoring extra series for well #95
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.esotericsoftware.kryo.util.UnsafeUtil (file:/home/dlindner/bioformats2raw-0.7.0-SNAPSHOT/lib/kryo-2.24.0.jar) to constructor java.nio.DirectByteBuffer(long,int,java.lang.Object)
WARNING: Please consider reporting this to the maintainers of com.esotericsoftware.kryo.util.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2023-02-27 11:54:35,469 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 91
java.lang.IllegalArgumentException: Invalid series: 91 index=91
at loci.formats.FormatReader.seriesToCoreIndex(FormatReader.java:1267)
at loci.formats.FormatReader.setSeries(FormatReader.java:928)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at com.glencoesoftware.bioformats2raw.Converter.lambda$write$1(Converte
The error rings a bell; I recall there were issues with some data from this historical submission. Does this happen for all plates? If not, do you have a path to a representative sample?
Going to use idr-ftp.openmicroscopy.org for conversion...
$ ssh -A idr-ftp.openmicroscopy.org
$ cd /data
$ sudo mkdir ngff && sudo chown wmoore ngff && cd ngff
$ conda create -n bioformats2raw python=3.9
$ conda activate bioformats2raw
$ conda install -c ome bioformats2raw
$ wget https://github.com/IDR/bioformats2raw/releases/download/v0.6.0-24/bioformats2raw-0.6.0-24.zip
$ unzip bioformats2raw-0.6.0-24.zip
$ sudo -Es git clone git@github.com:IDR/idr-metadata.git
$ mkdir idr0004
$ screen -S idr0004_ngff
$ conda activate bioformats2raw
(bioformats2raw) [wmoore@idrftp-ftp ngff]$ ./bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ./memo idr-metadata/idr0004-thorpe-rad52/screens/P101.screen idr0004/P101.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp5470699920951085994/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@16150369): java.io.FileNotFoundException: /uod/idr/filesets/idr0004-thorpe-rad52/Rad52_old/Rad52/P101/a2 (No such file or directory)
at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2192)
Caused by: java.io.FileNotFoundException: /uod/idr/filesets/idr0004-thorpe-rad52/Rad52_old/Rad52/P101/a2 (No such file or directory)
at java.base/java.io.RandomAccessFile.open0(Native Method)
at java.base/java.io.RandomAccessFile.open(RandomAccessFile.java:345)
at java.base/java.io.RandomAccessFile.<init>(RandomAccessFile.java:259)
Of course - idr-ftp doesn't have /uod/idr/filesets/ mounted!
Size estimate of data... uint16 (2 bytes per pixel), z:10, c:2, single field, 96 wells (some plates have missing wells), 47 plates:
2 × 672 × 510 × 10 × 2 × 96 × 47 / 1,000,000,000 ≈ 61.8 GB
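As a sanity check, the same arithmetic spelled out in Python (the 672 × 510 plane size is the raw frame size that also shows up in the check_pixels investigation later in this thread):

```python
# Back-of-envelope size check: uint16 planes of 672 x 510 pixels,
# z=10, c=2, one field per well, 96 wells, 47 plates.
bytes_per_pixel = 2
plane_pixels = 672 * 510
total_bytes = bytes_per_pixel * plane_pixels * 10 * 2 * 96 * 47
print(total_bytes)                      # 61854105600
print(total_bytes / 1_000_000_000)      # ~61.9 GB (rounded down to 61.8 above)
```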
cc: @dgault in case it rings a bell for him as well.
@melissalinkert and I did some dredging of old synapses re: rad52 during the formats meeting:
omero-cli-zarr?
I just used bioformats2raw, I didn't use any specific IDR reader version. Yes, probably easiest to use cli-zarr export. Let me try...
Unfortunately doesn't work either:
Exporting to P101.ome.zarr (0.4)
Traceback (most recent call last):
File "/home/dlindner/miniconda3/envs/myenv/bin/omero", line 11, in <module>
sys.exit(main())
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/main.py", line 125, in main
rv = omero.cli.argv()
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1784, in argv
cli.invoke(args[1:])
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1222, in invoke
stop = self.onecmd(line, previous_args)
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1299, in onecmd
self.execute(line, previous_args)
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero/cli.py", line 1381, in execute
args.func(args)
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/cli.py", line 125, in _wrapper
return func(self, *args, **kwargs)
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/cli.py", line 345, in export
plate_to_zarr(plate, args)
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/omero_zarr/raw_pixels.py", line 311, in plate_to_zarr
write_plate_metadata(
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/ome_zarr/writer.py", line 378, in write_plate_metadata
"wells": _validate_plate_wells(wells, rows, columns, fmt=fmt),
File "/home/dlindner/miniconda3/envs/myenv/lib/python3.9/site-packages/ome_zarr/writer.py", line 157, in _validate_plate_wells
raise ValueError("Empty wells list")
ValueError: Empty wells list
Tried some other plates again with bioformats2raw, all failed with errors like
...
2023-08-07 13:49:46,548 [main] WARN loci.formats.FormatHandler - Ignoring extra series for well #94
2023-08-07 13:49:46,586 [main] WARN loci.formats.FormatHandler - Ignoring extra series for well #95
2023-08-07 13:50:11,914 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 82
java.lang.IllegalArgumentException: Invalid series: 82 index=82
at loci.formats.FormatReader.seriesToCoreIndex(FormatReader.java:1267)
at loci.formats.FormatReader.setSeries(FormatReader.java:928)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at loci.formats.ReaderWrapper.setSeries(ReaderWrapper.java:375)
at com.glencoesoftware.bioformats2raw.Converter.lambda$write$1(Converter.
...
ValueError: Empty wells list
Ah, at least that might be a straightforward one to fix in Python land.
The issue is that the first Well is empty, so when we write metadata after the first Well, we have no Wells to write (I also found that the progress/ETA printout fails).
I just pushed a fix to https://github.com/ome/omero-cli-zarr/pull/147/commits/1d726264f44e2b6cb833bcc23603e2b7e56121b5 which is the branch we're using (you'll want to use the --name_by name option for this export too).
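A minimal sketch of the kind of guard involved (the function and data shapes here are hypothetical illustrations, not the actual omero-cli-zarr code):

```python
def populated_well_paths(plate):
    """Return well paths that actually contain images.

    `plate` is assumed to be a mapping of well path -> list of images;
    this is an illustrative shape, not the omero-cli-zarr data model.
    """
    return [path for path, images in plate.items() if images]

def export_plate(plate, write_metadata):
    wells = populated_well_paths(plate)
    if not wells:
        # Without a guard like this, ome_zarr's writer raises
        # ValueError("Empty wells list") further down the stack.
        raise ValueError("Plate has no populated wells; nothing to export")
    write_metadata(wells)
```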
Awesome, thanks, @will-moore. How far away from a release on those changes do you think we are?
@joshmoore There are some issues at https://github.com/ome/omero-cli-zarr/pull/147 like whether to support removal of .pattern from the plate name, and how to handle names with spaces.
Probably I'll remove the .pattern renaming since that can be handled via renaming on the cli after export, as I've done at https://github.com/IDR/idr-metadata/issues/638
Don't know how to handle whitespace in names. Tried a couple of times to create zarrs with whitespace in names, but don't remember what I tried now.
@joshmoore Added more testing to https://github.com/ome/omero-cli-zarr/pull/147 about writing zarrs with various names. It turns out that whitespace isn't the issue, but [] being recognised as regex is the problem.
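A quick illustration of why brackets trip up a regex-based match (a toy example with a made-up name; the actual matching code lives in the PR):

```python
import re

name = "P132 [scan 1].ome.zarr"  # hypothetical plate name containing brackets

# Used directly as a pattern, "[scan 1]" is parsed as a character class,
# so the name no longer matches itself:
assert re.match(name, name) is None

# Escaping the name first makes the match literal again:
assert re.match(re.escape(name), name) is not None
```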
👍 Thanks Will, seems to work now. I'll carry on converting idr0004 then and upload to biostudies.
Converted and uploaded to Biostudies.
We seem to be missing a zip, as I only see 46 .zip files on the page at https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0004 but we need 47: https://idr.openmicroscopy.org/webclient/?show=screen-202
Used this JS code on the submissions page above to load names from IDR and compare:
// Fetch the plate names for the screen from the IDR API
let url = "https://idr.openmicroscopy.org/webclient/api/plates/?id=202"
let idr_plates = await fetch(url).then(rsp => rsp.json());
let idr_names = idr_plates.plates.map(p => p.name);
// Scrape the file names from the BioStudies submissions grid
let names = [];
[].forEach.call(document.querySelectorAll("div [role='row'] .ag-cell[col-id='name']"), function(div) {
  names.push(div.innerHTML.trim().replace(".ome.zarr.zip", ""));
});
// Log any IDR plate name not found on the page
idr_names.forEach(n => {if (names.indexOf(n) == -1) {console.log(n)}; });
It doesn't find any idr_names that are missing from this page, even though idr_names.length is 47 and names.length is 46.
It turns out that there are 2 plates named P132!
https://idr.openmicroscopy.org/webclient/?show=plate-1774
https://idr.openmicroscopy.org/webclient/?show=plate-1966
They both appear to have the same Wells, but in different positions! This is going to need some thought, as our current workflow relies on differently-named Filesets.
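That also explains why the set-difference check above came up empty: a duplicate name is invisible to a set comparison but shows up in a count (toy data, not the real plate lists):

```python
from collections import Counter

idr_names = ["P101", "P132", "P132", "P133"]   # 4 plates in IDR, one name duplicated
page_names = ["P101", "P132", "P133"]          # 3 zips on the BioStudies page

# No names are "missing"...
assert set(idr_names) - set(page_names) == set()

# ...but the lengths differ, and a Counter pinpoints the duplicate:
dupes = [n for n, c in Counter(idr_names).items() if c > 1]
assert dupes == ["P132"]
```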
cc @sbesson
It looks like this observation on the duplication of plate P132 was already made but not actioned.
They both appear to have the same Wells, but in different positions!
I think more specifically, the problem is that well E7 is missing in https://idr.openmicroscopy.org/webclient/?show=plate-1774 and all wells from E8 onwards are shifted by one compared to https://idr.openmicroscopy.org/webclient/?show=plate-1966.
Using the metadata as the source of truth, there are 9 wells marked with "images not available": C4, C8, C10, E9, F10, F11, G5, H3, H12. This matches https://idr.openmicroscopy.org/webclient/?show=plate-1966 which seems to have the correct plate representation.
It might be worth looking into a cleanup as a preamble to the NGFF replacement, as this should simplify matters. A proposed workflow would be to:
Leaving you and @francesw to assess the validity and the prioritization of the proposal above.
Testing cleanup as suggested on idr0138-pilot
...
Rows in the OMERO.table that need to be removed: https://idr.openmicroscopy.org/webclient/omero_table/14209157/?query=Plate-1774
Tried running this in a screen -S idr0004_cleanup session, but it hasn't completed > 12 hrs later:
$ omero delete Plate:1774 --report --dry-run
Might need an alternative strategy for deleting a Plate?
Have you unlinked the map annotations linked to the images of these plates first, especially the multi-linked annotations (organism, genes...) before running the deletion?
No, I expected to be able to delete the plate and leave the annotations in place (automatically deleting the annotation links). That was what I was trying to establish with the --dry-run delete. But I guess that's just too much graph traversal and we need to manually delete them first, so I'll give that a go...
omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg idr0004-screenA-bulkmap-config.yml Plate:1774
omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg idr0004-screenA-bulkmap-config.yml Plate:1774
This is much faster now, but doesn't include Images!?
$ omero delete Plate:1774 --report --dry-run
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:1774 Dry run performed
ok
Steps: 4
Elapsed time: 0.805 secs.
Flags: []
Deleted objects
Plate:1774
ScreenPlateLink:1798
Well:471259-471354
WellSample:678116-678201
Trying to delete one Image of that Plate gives a clue...
$ omero delete Image:694123 --report --dry-run
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Image:694123 failed: 'graph-fail'
failed: within Fileset[12933] may not delete Image[694123] while Image[694124] remains
Steps: 4
Elapsed time: 5.232 secs.
Flags: [FAILURE, CANCELLED]
Test delete of Fileset... Ran this in a screen, but almost 24 hours later it still hasn't completed...
$ omero delete Fileset:12933 --report --dry-run
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
To check that the Plate P132.ome.zarr.zip uploaded to BioStudies corresponds to https://idr.openmicroscopy.org/webclient/?show=plate-1966, I downloaded it from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0004 then copied it to ome-zarr-dev1.openmicroscopy.org
$ rsync -rvP --progress P132.ome.zarr.zip ome-zarr-dev1.openmicroscopy.org:/lifesci/groups/jrs/wmoore/
Unzipped and copied to /uod/idr/objectstore/minio/idr/v0.4/idr0004/P132.ome.zarr/, looks correct in vizarr:
As discussed at IDR meeting, it's possible that deleting the Fileset hangs because something in the graph (e.g. orphan images) is still annotated. Try to exclude those objects from the graph...
$ omero delete Fileset:12933 --report --dry-run --exclude Annotation
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Fileset:12933 Dry run performed
ok
Steps: 4
Elapsed time: 94.283 secs.
Flags: []
Deleted objects
Detector:8982
DetectorSettings:9740
Instrument:8882
FilesetAnnotationLink:12733
ImageAnnotationLink:23168513,23168665,23176204,23176205
Channel:1956459-1956638
Image:694080-694169
LogicalChannel:25916-25918
OriginalFile:3426899-3426986
Pixels:694080-694169
ChannelBinding:1316427-1316598,1885510,1885511,1924759,1924760
QuantumDef:478614-478699,760237,798855
RenderingDef:478614-478699,760237,798855
Thumbnail:480230-480319,3417010,3417012,3417014,3417016,3417018,3417020,3417022,3417024,3417026,3417028,3417030,3417032,3417034-3417046,3417048,3417050-3417052,3417054,3417056,3417058,3417060-3417063,3417065-3417068,3417070-3417078,3417080,3417082-3417084,3417086,3417088,3417090,3417092,3417094,3417096,3417098,3417100,3417102,3417104,3417106,3417108-3417119,3417121-3417124,3417127,3417128,3417133,3417135-3417141
Fileset:12933
FilesetEntry:3375456-3375542
FilesetJobLink:62911-62915
IndexingJob:64454
JobOriginalFileLink:15860
MetadataImportJob:64451
PixelDataJob:64452
ThumbnailGenerationJob:64453
UploadJob:64450
StatsInfo:1315241-1315412
...
This runs quickly and shows 4 ImageAnnotationLinks, which may explain the issues above. The last 3 images in the Image:694080-694169 range (694167-694169) appear to be orphans - these links don't show images: https://idr.openmicroscopy.org/webclient/?show=image-694169 https://idr.openmicroscopy.org/webclient/?show=image-694168 https://idr.openmicroscopy.org/webclient/?show=image-694167 But they don't have annotations...
$ psql -U omero -d idr -h $DBHOST -c "select * from ImageAnnotationLink where id in (23168513,23168665,23176204,23176205)"
id | permissions | version | child | creation_id | external_id | group_id | owner_id | update_id | parent
----------+-------------+---------+---------+-------------+-------------+----------+----------+-----------+--------
23168513 | -56 | | 6625787 | 58371837 | | 3 | 2 | 58371837 | 694089
23168665 | -56 | | 6625016 | 58371838 | | 3 | 2 | 58371838 | 694089
23176204 | -56 | | 6668725 | 58371854 | | 3 | 2 | 58371854 | 694089
23176205 | -56 | | 6668726 | 58371854 | | 3 | 2 | 58371854 | 694089
Another orphan image has the annotations: https://idr.openmicroscopy.org/webclient/?show=image-694089 This is part of the Fileset and will be deleted by the Fileset delete above.
All 4 of those annotations are Map Annotations. https://idr.openmicroscopy.org/webclient/api/annotations/?type=map&image=694089
$ psql -U omero -d idr -h $DBHOST -c "select id,ns from Annotation where id in (6625787,6625016,6668725,6668726)"
id | ns
---------+--------------------------------------------
6625016 | openmicroscopy.org/mapr/gene
6625787 | openmicroscopy.org/mapr/organism
6668725 | openmicroscopy.org/mapr/gene/supplementary
6668726 | openmicroscopy.org/omero/bulk_annotations
(4 rows)
So I think the next steps are:
omero delete Plate:1774 --report
omero delete Fileset:12933 --report --exclude Annotation
python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
and check if those orphaned MapAnnotations have been deleted.
Testing cleanup on idr0138-pilot...
It looks like the $ omero delete Plate:1774 I tried at one point last week has finally completed (I assumed it had failed), since that Plate is gone.
But the Fileset remains:
$ omero obj get Fileset:12933
id=12933
templatePrefix=demo_2/2015-10/01/08-11-23.303/
version=
And same for Images.
This doesn't find many annotations...
$ python clean_orphaned_maps.py
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Halted
INFO:omero.util.Resources:Starting
INFO:root:Found 0 orphaned Organism maps
INFO:root:Found 0 orphaned Antibody maps
INFO:root:Found 0 orphaned Gene maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 0 orphaned Phenotype maps
INFO:root:Found 0 orphaned siRNA maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 110 orphaned Notebook maps
INFO:root:Deleting 110 maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps
INFO:omero.util.Resources:Halted
Delete Fileset... (output same as with --dry-run above):
omero delete Fileset:12933 --report --exclude Annotation
Confirmed that Image now deleted:
$ omero obj get Image:694080
No object found: Image:694080
Orphaned image with annotations deleted too. This returns nothing... http://localhost:1040/webclient/api/annotations/?type=map&image=694089
python clean_orphaned_maps.py
Found no more orphaned annotations, but these 4 still exist...
$ psql -U omero -d idr -h $DBHOST -c "select id,ns from Annotation where id in (6625787,6625016,6668725,6668726)"
id | ns
---------+--------------------------------------------
6625016 | openmicroscopy.org/mapr/gene
6625787 | openmicroscopy.org/mapr/organism
6668725 | openmicroscopy.org/mapr/gene/supplementary
6668726 | openmicroscopy.org/omero/bulk_annotations
(4 rows)
And the last 2 don't seem to be linked to any Wells or Images (those namespaces aren't handled by clean_orphaned_maps.py), e.g.
$ psql -U omero -d idr -h $DBHOST -c "select * from ImageAnnotationLink where child = 6668725"
id | permissions | version | child | creation_id | external_id | group_id | owner_id | update_id | parent
----+-------------+---------+-------+-------------+-------------+----------+----------+-----------+--------
(0 rows)
Deleted those 2 manually (nothing else got deleted)
$ omero delete Annotation:6668725 --report
$ omero delete Annotation:6668726 --report
@francesw the Plates uploaded to BioStudies can be submitted now (delete and cleanup of Plate doesn't need to hold that up). I'll assign this issue to you, even though I'll continue to look at cleanup steps...
To cleanup OMERO.table, I think we need to completely remove all annotations from the Screen and re-create them... Running on idr0138-pilot...
$ omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg idr0004-screenA-bulkmap-config.yml Screen:202
omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg idr0004-screenA-bulkmap-config.yml Screen:202
python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
omero metadata deletebulkanns Screen:202
Checking annotations...
$ python /uod/idr/metadata/idr-utils/scripts/annotate/check_annotations.py Screen:202 idr0004-screenA-annotation.csv
WARNING: There are additional entries in the csv file which don't match any images:
This corresponds to the fact that we previously allowed OMERO.table rows that didn't match any image. The 10 rows for missing Wells of Plate-1774 can be seen at https://idr.openmicroscopy.org/webclient/omero_table/14209157/?query=(Well%3C0)%26(Plate==1774) There are also similar rows for all the other Plates with missing Wells.
If I ignore that warning and continue with annotation...
omero metadata populate --report --batch 1000 --file idr0004-screenA-annotation.csv Screen:202
omero metadata populate --context bulkmap --batch 100 --cfg idr0004-screenA-bulkmap-config.yml Screen:202
This now finds the 9 empty wells for Plate-1966 (Well<0)&(Plate==1966):
http://localhost:1080/webclient/omero_table/63972678/?query=(Well%3C0)%26(Plate==1966)
C4, C8, C10, E9, F10, F11, G5, H3, H11 - that corresponds to https://github.com/IDR/idr-metadata/issues/637#issuecomment-1689678066
Running cleanup as above on idr-next...
Remove all annotations from the Screen...
screen -S idr0004_cleanup
source /opt/omero/server/venv3/bin/activate
omero login
cd /uod/idr/metadata/idr0004-
omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg idr0004-screenA-bulkmap-config.yml Screen:202
omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg idr0004-screenA-bulkmap-config.yml Screen:202
python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Halted
INFO:root:Found 0 orphaned Organism maps
INFO:root:Found 0 orphaned Antibody maps
INFO:root:Found 5 orphaned Gene maps
INFO:root:Deleting 5 maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 0 orphaned Phenotype maps
INFO:root:Found 0 orphaned siRNA maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 0 orphaned Notebook maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps
INFO:omero.util.Resources:Halted
omero metadata deletebulkanns Screen:202
omero delete Plate:1774 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:1774 ok
Steps: 6
Elapsed time: 0.599 secs.
Flags: []
Deleted objects
Plate:1774
ScreenPlateLink:1798
Well:471259-471354
WellSample:678116-678201
omero delete Fileset:12933 --report --exclude Annotation
# delete orphaned map annotations
$ omero delete Annotation:6668725 --report
Deleted objects
MapAnnotation:6668725
$ omero delete Annotation:6668726 --report
Deleted objects
MapAnnotation:6668726
python /uod/idr/metadata/idr-utils/scripts/annotate/check_annotations.py Screen:202 idr0004-screenA-annotation.csv
WARNING: There are additional entries in the csv file which don't match any images:
All images are unique and have annotations.
omero metadata populate --report --batch 1000 --file idr0004-screenA-annotation.csv Screen:202
omero metadata populate --context bulkmap --batch 100 --cfg idr0004-screenA-bulkmap-config.yml Screen:202
As before, this now finds the 9 empty wells for Plate-1966 (Well<0)&(Plate==1966): http://localhost:12345/webclient/omero_table/48436502/?query=(Well%3C0)%26(Plate==1966)
C4, C8, C10, E9, F10, F11, G5, H3, H11 - that corresponds to https://github.com/IDR/idr-metadata/issues/637#issuecomment-1689678066
Testing mkngff on idr0125-pilot...
idr0004/P170.ome.zarr,S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3,12953
idr0004/P144.ome.zarr,S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5,12945
idr0004/P145.ome.zarr,S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14,12947
idr0004/P111.ome.zarr,S-BIAD867/0bb5992f-e8d8-45b1-9e5d-d0fb8325aabb,12917
idr0004/P120.ome.zarr,S-BIAD867/0d3e6be1-0c0a-42ef-8775-e3557c359b2d,12990
idr0004/P101.ome.zarr,S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1,12909
idr0004/P105.ome.zarr,S-BIAD867/1d37d3c1-08f2-42a9-8c61-97fde7f221dd,12911
idr0004/P142.ome.zarr,S-BIAD867/264695a5-9c23-4204-aa46-dd7b709fe137,12943
idr0004/P149.ome.zarr,S-BIAD867/2b2aab5c-6d70-4e1d-8336-11e7d018759d,12951
idr0004/P138.ome.zarr,S-BIAD867/2bff1cee-81c0-467b-98f5-01db06f4d042,12939
Only let it run for 3 filesets... each took about 3 minutes...
Found prefix demo_2/2015-10/01 // 08-49-38.885 for fileset 12953
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-49-38.885_mkngff/00d88a93-8d21-4a50-b8b5-60f11bcae0d3.zarr -> /bia-integrator-data/S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3/00d88a93-8d21-4a50-b8b5-60f11bcae0d3.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2015-10/01 // 08-34-48.864 for fileset 12945
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-34-48.864_mkngff/02c5d63f-36f5-4862-9682-ec3a2702a1e5.zarr -> /bia-integrator-data/S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5/02c5d63f-36f5-4862-9682-ec3a2702a1e5.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2015-10/01 // 08-37-19.100 for fileset 12947
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/08-37-19.100_mkngff/06e3fba2-825a-441d-a3cb-2084515b1b14.zarr -> /bia-integrator-data/S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14/06e3fba2-825a-441d-a3cb-2084515b1b14.zarr
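The pattern in that log can be sketched as follows (a simplified sketch with my own helper name and throwaway paths; real mkngff also generates the SQL applied below):

```python
import tempfile
from pathlib import Path

def make_ngff_symlink(prefix_dir: Path, uuid: str, bia_root: Path) -> Path:
    """Create <prefix>_mkngff beside the fileset's template prefix and
    symlink the BioStudies zarr into it, mirroring the logged steps."""
    ngff_dir = prefix_dir.parent / (prefix_dir.name + "_mkngff")
    ngff_dir.mkdir(parents=True, exist_ok=True)
    link = ngff_dir / f"{uuid}.zarr"
    link.symlink_to(bia_root / uuid / f"{uuid}.zarr")
    return link

# Demonstration with throwaway paths instead of /data/OMERO/ManagedRepository:
root = Path(tempfile.mkdtemp())
prefix = root / "demo_2" / "2015-10" / "01" / "08-49-38.885"
prefix.mkdir(parents=True)
link = make_ngff_symlink(prefix, "00d88a93-8d21-4a50-b8b5-60f11bcae0d3",
                         root / "bia-integrator-data" / "S-BIAD867")
print(link.is_symlink())  # True on POSIX
```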
bash-4.2$ for r in $(cat $IDRID.csv); do
> fsid=$(echo $r | cut -d',' -f3)
> psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
> done
UPDATE 95
BEGIN
mkngff_fileset
----------------
5287570
(1 row)
COMMIT
UPDATE 98
BEGIN
mkngff_fileset
----------------
5287571
(1 row)
COMMIT
UPDATE 95
BEGIN
mkngff_fileset
----------------
5287572
(1 row)
COMMIT
Waiting on memo regeneration, e.g. http://localhost:1080/webclient/?show=image-698736
Looks good - thumbnails updated..
To test mkngff on all 46 Plates on idr-testing... NB: the plate to be deleted above will be ignored by mkngff, based on the table below...
https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD867.html
idr0004/P170.ome.zarr,S-BIAD867/00d88a93-8d21-4a50-b8b5-60f11bcae0d3,12953
idr0004/P144.ome.zarr,S-BIAD867/02c5d63f-36f5-4862-9682-ec3a2702a1e5,12945
idr0004/P145.ome.zarr,S-BIAD867/06e3fba2-825a-441d-a3cb-2084515b1b14,12947
idr0004/P111.ome.zarr,S-BIAD867/0bb5992f-e8d8-45b1-9e5d-d0fb8325aabb,12917
idr0004/P120.ome.zarr,S-BIAD867/0d3e6be1-0c0a-42ef-8775-e3557c359b2d,12990
idr0004/P101.ome.zarr,S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1,12909
idr0004/P105.ome.zarr,S-BIAD867/1d37d3c1-08f2-42a9-8c61-97fde7f221dd,12911
idr0004/P142.ome.zarr,S-BIAD867/264695a5-9c23-4204-aa46-dd7b709fe137,12943
idr0004/P149.ome.zarr,S-BIAD867/2b2aab5c-6d70-4e1d-8336-11e7d018759d,12951
idr0004/P138.ome.zarr,S-BIAD867/2bff1cee-81c0-467b-98f5-01db06f4d042,12939
idr0004/P128.ome.zarr,S-BIAD867/2f42ce30-0ac3-4056-98d8-9ef1608dc019,13124
idr0004/P115.ome.zarr,S-BIAD867/35cfc0db-7795-497c-aed5-1ae591b2d9f1,12919
idr0004/P132.ome.zarr,S-BIAD867/3df515f9-75c8-4cc4-8c29-480ae9817880,13125
idr0004/P109.ome.zarr,S-BIAD867/430b28a8-a2c9-4d71-8e54-5b45ca051c51,12915
idr0004/P112.ome.zarr,S-BIAD867/4f4a0699-9f42-4272-a929-9e0139ec3857,12918
idr0004/P121.ome.zarr,S-BIAD867/59ef0f42-3306-411c-a168-8977911fe63c,12923
idr0004/P117.ome.zarr,S-BIAD867/66925bed-7857-461d-9c56-f6a9c5b7dd69,12920
idr0004/P133.ome.zarr,S-BIAD867/66f4e3df-441c-4003-be2b-1b9f6984543a,12934
idr0004/P140.ome.zarr,S-BIAD867/685006c9-293e-43f1-819b-2669c5916add,12941
idr0004/P110.ome.zarr,S-BIAD867/686fa20f-c678-4d3c-8367-6572dc5aca4d,12916
idr0004/P129.ome.zarr,S-BIAD867/74005c12-4837-48f5-b6f2-e6eb71a89ac3,12929
idr0004/P134.ome.zarr,S-BIAD867/74e0533d-d06b-4a46-bb52-b155190e3c8d,12935
idr0004/P139.ome.zarr,S-BIAD867/7ac3742c-2c37-45b2-8d04-3ef62daeeb8d,12940
idr0004/P146.ome.zarr,S-BIAD867/7c35875b-1f46-46a4-95db-a6efceba0ae9,12948
idr0004/P148.ome.zarr,S-BIAD867/857896cf-5b33-40e4-8fbb-56b0aa15decd,12950
idr0004/P130.ome.zarr,S-BIAD867/881a57a6-6da1-4306-bd8b-db0da2c8a076,12930
idr0004/P119.ome.zarr,S-BIAD867/8c9e1c63-61cc-4264-81ff-15534e962fcb,12922
idr0004/P150.ome.zarr,S-BIAD867/904d4a80-a3d3-4af9-af47-8a0f6e1776b7,12952
idr0004/P118.ome.zarr,S-BIAD867/a0394bf7-13c9-44c4-812d-c29e3b765bc0,12921
idr0004/P126.ome.zarr,S-BIAD867/a3ff8e5d-f665-4c18-a0eb-543730ca7b12,12927
idr0004/P135.ome.zarr,S-BIAD867/af83f2a1-2d0d-4d16-bcc3-6c83bf4e6b98,12936
idr0004/P136.ome.zarr,S-BIAD867/bc7ea09d-9a23-4d7a-88a4-af48459bd9ee,12937
idr0004/P147.ome.zarr,S-BIAD867/bf30beb6-72f5-4235-8fdc-12c946636951,12949
idr0004/P108.ome.zarr,S-BIAD867/c310fd5e-d6d9-49bb-840e-4cbb7b275b81,12914
idr0004/P131.ome.zarr,S-BIAD867/d408a260-f019-45e9-8e46-d09a874bcbc5,12932
idr0004/P107.ome.zarr,S-BIAD867/d98c3997-7b71-440d-a815-4f9bc70b8b22,12913
idr0004/P143.ome.zarr,S-BIAD867/dcd2adf1-10c1-4960-b01c-1426e1b46f6b,12944
idr0004/P125.ome.zarr,S-BIAD867/dff92046-c53b-4948-95be-cea12e577e9e,12926
idr0004/P171.ome.zarr,S-BIAD867/e3283a6a-d25b-41e1-8ab7-1837b89e3a6e,12954
idr0004/P123.ome.zarr,S-BIAD867/e6a5a8ba-3cdb-425d-b701-c1c0382a3eeb,12924
idr0004/P102.ome.zarr,S-BIAD867/ee396ed4-07f1-4351-ac9e-5956dd92000b,12910
idr0004/P124.ome.zarr,S-BIAD867/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540,12925
idr0004/P141.ome.zarr,S-BIAD867/ef42a819-34ad-4fbb-b7b9-3e2eb0f4fd17,12942
idr0004/P137.ome.zarr,S-BIAD867/f5ce45be-0b8c-4539-ae29-66978555f0ec,12938
idr0004/P106.ome.zarr,S-BIAD867/f791de8d-cd01-4303-8b08-67cbbbb45b64,12912
idr0004/P127.ome.zarr,S-BIAD867/fcc7c91a-f0e6-43f4-93a6-220e9224eda5,12928
Started mkngff at 12:24. "About 3 mins each" × 46 ≈ 2.5 hours...
Viewing images on first Plate P101...
$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "185_mkngff"
2023-09-11 20:22:49,697 DEBUG [ loci.formats.Memoizer] (l.Server-6) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-25-30.185_mkngff/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/..zattrs.bfmemo (171077 bytes)
2023-09-11 20:22:49,697 DEBUG [ loci.formats.Memoizer] (l.Server-6) start[1694463707838] time[61858] tag[loci.formats.Memoizer.setId]
2023-09-11 20:22:49,697 INFO [ ome.io.nio.PixelsService] (l.Server-6) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-25-30.185_mkngff/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/.zattrs Series: 0
SetId took about a minute.
Results of check_pixels...
Checking 50 images from each plate: https://github.com/IDR/idr-utils/pull/55#issuecomment-1824755559
P132 has wells shifted by 1 place after E8: https://github.com/IDR/idr-utils/pull/55#issuecomment-1824728538
P115 and P124 have struct.error: unpack requires a buffer of 614400 bytes on some images.
Plate P132, with mismatches found by check_pixels, was previously duplicated in IDR (and 1 of the duplicates was deleted in the last release). It looks like we have the wrong NGFF plate for the remaining IDR plate. Need to re-export...
The image at C7 on P115 with "missing chunks" at https://github.com/IDR/idr-utils/pull/55#issuecomment-1824531029 looks OK at https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/35cfc0db-7795-497c-aed5-1ae591b2d9f1/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr/C/7/0/
But rendering the image in the webclient is clearly corrupted: http://localhost:1080/webclient/render_image/692975/0/0/?c=1|586:2652$FFFFFF,2|257:1400$00FF00
Given the duplicate plates issue, re-export on idr-ftp /data/idr0004/:
conda activate omero_zarr_export
omero zarr export Plate:1966 --name_by=name
...
Deleted P132.ome.zarr.zip
on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0004 and replaced:
(base) [wmoore@idrftp-ftp ~]$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/idr0004/idr0004/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxx-xxx-xx-x-xxxx
P132.ome.zarr.zip 100% 364MB 377Mb/s 00:07
Completed: 372995K bytes transferred in 7 seconds
(417463K bits/sec), in 1 file, 1 directory.
Checking image above:
$ python check_pixels.py Image:692975 --max-planes=sizeC
0/1 Check Image:692975 P115 [Well C7, Field 1]
ERROR:omero.gateway:Failed to getPlane() or getTile() from rawPixelsStore
Traceback (most recent call last):
File "/Users/wmoore/Desktop/PY/omero-py/target/omero/gateway/__init__.py", line 7542, in getTiles
convertedPlane = unpack(convertType, rawPlane)
struct.error: unpack requires a buffer of 614400 bytes
Error: Image:692975 unpack requires a buffer of 614400 bytes
End: 2023-11-29 15:47:14.657066
If we print the size of the raw buffer right before the error above, we can see that the bytes we get are too large:
convertType = '>%d%s' % (
    (planeY * planeX), pixelTypes[pixelType][0])
print("rawPlane", len(rawPlane), convertType)
if isinstance(rawPlane, bytes):
    convertedPlane = unpack(convertType, rawPlane)
prints
rawPlane 685440 >307200H
We get 685440 bytes when we expect 614400 (2 bytes per pixel, 307200 pixels = 640 x 480). 685440 / 2 is 342720 pixels, i.e. 672 x 510.
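As a sanity check, the arithmetic above can be reproduced in a few lines of Python (the variable names here are illustrative, not from omero-py):

```python
# Reproduce the size mismatch seen for Image:692975.
expected_px = 640 * 480      # planeX * planeY
bytes_per_px = 2             # uint16 pixels ('H' in the struct format string)
expected_bytes = expected_px * bytes_per_px   # 614400, what unpack needs

actual_bytes = 685440        # len(rawPlane) printed above
actual_px = actual_bytes // bytes_per_px      # 342720

# The oversized plane corresponds to a 672 x 510 image, not 640 x 480.
assert expected_bytes == 614400
assert actual_px == 672 * 510
```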
When rendering resolution level 1 (half size) we get 336 x 255 if we ask for a big tile, but we don't see this when requesting level 0 (full size): requesting tile=1,0,0,320,240 crops to those dimensions, so we get the image without spacers.
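A minimal sketch of what that cropping does, assuming (hypothetically) that the spacer pixels sit along the right and bottom edges of the padded level-1 plane:

```python
import numpy as np

# Hypothetical: the raw level-1 plane is 255 x 336 (rows x cols),
# padded out from the expected 240 x 320 because the plane was sized
# from the largest image in the Plate.
raw_level1 = np.zeros((255, 336), dtype=np.uint16)

# Requesting tile=1,0,0,320,240 (level, x, y, w, h) effectively crops
# to the expected dimensions, discarding the spacer pixels.
expected_h, expected_w = 240, 320
cropped = raw_level1[:expected_h, :expected_w]
assert cropped.shape == (240, 320)
```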
The size mismatch is because ZarrReader assumes that all images in the Plate are the same size. So we need to disable this for the 2 affected Filesets by adding zarrreader.quick_read=false to the bfoptions file. As the omero-server user on idr-testing:omeroreadwrite...
Plate P115:
vi /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-46-42.965_mkngff/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr.bfoptions
Plate P124:
vi /data/OMERO/ManagedRepository/demo_2/2015-10/01/07-57-40.271_mkngff/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540.zarr.bfoptions
Now they look like this:
omezarr.list_pixels=false
zarrreader.quick_read=false
Now delete the existing memo files...
(venv3) bash-4.2$ rm /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-57-40.271_mkngff/ee8872c8-e4b1-41fa-aa4f-a9e3e200c540.zarr/..zattrs.bfmemo
(venv3) bash-4.2$ rm /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2015-10/01/07-46-42.965_mkngff/35cfc0db-7795-497c-aed5-1ae591b2d9f1.zarr/..zattrs.bfmemo
Try to view images again...
Image looks good now:
python check_pixels.py Image:692975 --max-planes=sizeC
Start: 2023-12-01 13:50:59.158362
Checking Image:692975
max_planes: sizeC
max_images: 0
0/1 Check Image:692975 P115 [Well C7, Field 1]
End: 2023-12-01 13:51:18.206637
P132 plate has been updated on BioStudies.
https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr now matches https://idr.openmicroscopy.org/webclient/?show=image-797783 (Fileset ID: 13125)
On idr0125-pilot...
$ omero mkngff sql 13125 --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" > "idr0004/13125.sql"
$ cat idr0004/13125.sql | wc
719 2855 212416
Added to https://github.com/IDR/mkngff_upgrade_scripts/commit/db9efb8e2d54d7abffff2dba57bb456d4bf14c2a
The sql above was generated while logged in to idr.openmicroscopy.org, so we have the original Fileset ID etc. But we don't have a test server to test it, as they've all had the idr0004 Plate P132 updated with mkngff already.
So let's generate fresh sql from the updated plate on idr0125-pilot...
as omero-server... logged in to localhost
omero mkngff sql 5287668 --secret=$SECRET --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" > "idr0004/5287668.sql"
$ psql -U omero -d idr -h $DBHOST -f idr0004/5287668.sql
UPDATE 91
BEGIN
mkngff_fileset
----------------
5289226
(1 row)
COMMIT
$ omero mkngff symlink /data/OMERO/ManagedRepository 5287668 "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" --bfoptions
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr -> /bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr.bfoptions
Viewing in webclient looks good - after memo regenerated...
python check_pixels.py Plate:1966 --max-planes=sizeC > /tmp/check_pix_20231219_plate1966.log
...
82/87 Check Image:797812 P132 [Well F8, Field 1]
83/87 Check Image:797813 P132 [Well A8, Field 1]
84/87 Check Image:797814 P132 [Well F12, Field 1]
85/87 Check Image:797815 P132 [Well A7, Field 1]
86/87 Check Image:797816 P132 [Well H10, Field 1]
End: 2023-12-19 16:33:26.819467
(base) bash-4.2$ grep Error !$
grep Error /tmp/check_pix_20231219_plate1966.log
On idr-next, as omero-server...
cd
git clone https://github.com/IDR/mkngff_upgrade_scripts.git
cd mkngff_upgrade_scripts/ngff_filesets/idr0004
sed -i 's/SECRETUUID/e3e1ac30-7b69-473b-98b2-428780578b1c/g' 13125.sql
$ psql -U omero -d idr -h $DBHOST -f 13125.sql
UPDATE 91
BEGIN
mkngff_fileset
----------------
6314437
(1 row)
COMMIT
$ omero mkngff symlink /data/OMERO/ManagedRepository 13125 "/bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr" --bfoptions
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206
Creating dir at /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr -> /bia-integrator-data/S-BIAD867/77517221-a983-4761-8021-c0039a7728e1/77517221-a983-4761-8021-c0039a7728e1.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2015-10/01/18-25-11.206_mkngff/77517221-a983-4761-8021-c0039a7728e1.zarr.bfoptions
python check_pixels.py Plate:1966 --max-planes=sizeC > /tmp/check_pix_20231219_plate1966.log
...
83/87 Check Image:797813 P132 [Well A8, Field 1]
84/87 Check Image:797814 P132 [Well F12, Field 1]
85/87 Check Image:797815 P132 [Well A7, Field 1]
86/87 Check Image:797816 P132 [Well H10, Field 1]
End: 2023-12-19 17:01:26.115466
grep Error /tmp/check_pix_20231219_plate1966.log gives no output, so no errors.
Checking Fileset IDs still valid:
(base) Williams-MacBook-Pro:ngff_filesets wmoore$ pwd
/Users/wmoore/Desktop/IDR/mkngff_upgrade_scripts/ngff_filesets
(base) Williams-MacBook-Pro:ngff_filesets wmoore$ python parse_bia_uuids.py idr0004
46 filesets matched
idr0004-thorpe-rad52