IDR / idr-metadata

Curated metadata for all studies published in the Image Data Resource
https://idr.openmicroscopy.org

idr0026-weigelin-immunotherapy S-BIAD860 #648

Open will-moore opened 1 year ago

will-moore commented 1 year ago

idr0026-weigelin-immunotherapy

dominikl commented 1 year ago

Conversion time: 9min
Import time: 72h

will-moore commented 1 year ago

Trying to estimate how much space is needed for this conversion.

First image is uint16 (2 bytes per pixel): 507 x 507 x 21 x 71 x 4 x 2 bytes = approx 3 GB.

Images vary in size for the study, but about 111 .pattern images (see https://github.com/IDR/idr-utils/pull/56) need converting.

Maybe 300 GB or more needed (maybe up to 500 GB)?
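
The estimate above can be reproduced with shell arithmetic. This is only a back-of-envelope sketch: `image_bytes` is a made-up helper name, and the total assumes every one of the 111 images is as large as the first one, which is not the case for this study.

```shell
#!/usr/bin/env bash
# Rough uncompressed size of one 5D image: X * Y * Z * T * C * bytes per pixel.
# image_bytes is a hypothetical helper, not part of any IDR tooling.
image_bytes() {
  echo $(( $1 * $2 * $3 * $4 * $5 * $6 ))
}

one=$(image_bytes 507 507 21 71 4 2)   # uint16 = 2 bytes per pixel
total=$(( one * 111 ))                 # assumes all 111 images match the first

echo "one image:  $one bytes"
echo "111 images: $total bytes"
```

For the first image this gives 3066080472 bytes (about 3.07 GB), and about 340 GB if all 111 images were that size, before compression and the extra resolution level.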

will-moore commented 1 year ago

Looks like all the pattern files we need to convert are under:

$ ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/ | wc
    111     111    3970

Corresponds to image count from https://github.com/IDR/idr-utils/pull/56

$ screen -S idr0026_bf2raw

$ conda activate bioformats2raw

$ cd /data
$ sudo chown wmoore ./idr0026
$ cd idr0026

$ for i in `ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/`; do echo $i; /home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/$i ${i%.*}.ome.zarr; done
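
A slightly more defensive version of the loop above, as a sketch only. It globs for `*.pattern` instead of parsing `ls`, quotes every path, and skips outputs that already exist. The function names and the `BF2RAW` variable are assumptions for illustration; the `bioformats2raw` path and the `--memo-directory ../memo` option are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the bioformats2raw invocation comes from the thread.
BF2RAW=${BF2RAW:-/home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw}

# Map "/path/to/name.pattern" to "name.ome.zarr" (same ${i%.*} trick as above).
zarr_name_for() {
  local base
  base=$(basename "$1")
  echo "${base%.*}.ome.zarr"
}

# Convert every .pattern file in directory $1 into the current directory.
convert_all() {
  local pattern out
  for pattern in "$1"/*.pattern; do
    [ -e "$pattern" ] || continue      # the glob matched nothing
    out=$(zarr_name_for "$pattern")
    if [ -e "$out" ]; then
      # A partial output from an interrupted run is also skipped: delete it first.
      echo "skipping $out (already exists)"
      continue
    fi
    echo "$pattern"
    "$BF2RAW" --memo-directory ../memo "$pattern" "$out"
  done
}

# Usage (not run here):
# convert_all /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns
```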

will-moore commented 1 year ago

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0026
make_bucket: idr0026
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0026 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0026  --cors-configuration file://cors.json
$ /home/wmoore/mc cp -r idr0026/ uk1s3/idr0026/zarr
...3.140926_14-52-18.03.ome.zarr/OME/METADATA.ome.xml: 282.79 GiB / 282.79 GiB ━━━━━━━━━━━━━

will-moore commented 1 year ago

Checking on s3...

E.g. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/0/

This image has only a single entry under omero.channels, so it appears as single-channel in vizarr, even though the zarr array data has 4 channels and looks OK in the validator.

cc @sbesson

will-moore commented 1 year ago

However, the OME.xml looks OK, e.g. https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/OME/METADATA.ome.xml This has 4 channels, so maybe the .zattrs omero.channel info is not so critical when we import to OMERO. But it's still wrong!

<Pixels BigEndian="true" DimensionOrder="XYZCT" ID="Pixels:0" Interleaved="false" SignificantBits="16" SizeC="4" SizeT="71" SizeX="507" SizeY="507" SizeZ="21" Type="uint16">
<Channel ID="Channel:0:0" Name="FD6_GREEN" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:1" Name="FD5_BLUE" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:2" Name="BD8_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:3" Name="BD7_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>

will-moore commented 1 year ago

On pilot-idr0125...

sudo mkdir /idr0026 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0026 /idr0026

# copy metadata-only images....
screen -S idr0010_aws_sync
aws s3 sync --no-sign-request --exclude '*' --include "*/.z*" --include "*.xml" --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3://idr0026/zarr .
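
To see which objects the `--exclude`/`--include` filters above are meant to select, here is a rough emulation using a shell `case` pattern. It is a sketch under assumptions: `is_metadata_key` is a made-up helper, the keys are invented examples, and the matching only approximates the AWS CLI behaviour (later filters take precedence, and `*` matches across `/`).

```shell
#!/usr/bin/env bash
# Hypothetical helper approximating:
#   --exclude '*' --include "*/.z*" --include "*.xml"
# Returns 0 for keys that would be copied, 1 for keys that would be skipped.
is_metadata_key() {
  case "$1" in
    */.z*|*.xml) return 0 ;;   # .zattrs, .zgroup, .zarray and OME-XML
    *)           return 1 ;;   # everything else, i.e. the chunk data
  esac
}

# Example keys (invented for illustration).
for key in \
  "a.ome.zarr/.zattrs" \
  "a.ome.zarr/0/.zarray" \
  "a.ome.zarr/OME/METADATA.ome.xml" \
  "a.ome.zarr/0/0/0/0/0/0/0"; do
  if is_metadata_key "$key"; then
    echo "copy  $key"
  else
    echo "skip  $key"
  fi
done
```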

# import all images into Dataset

for dir in *; do
  omero import -d 15352 --transfer=ln_s --depth=100 --name=${dir/.ome.zarr/} --skip=all $dir --file /tmp/$dir.log  --errs /tmp/$dir.err;
done

$ python idr-utils/scripts/managed_repo_symlinks.py Dataset:15352 /idr0026/zarr

These look good in OMERO, compared to existing IDR

sbesson commented 1 year ago

@will-moore not 100% sure of what went wrong on your conversion but using the converter library shipping with the current IDR version of Bio-Formats, I get

$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp6732331320596727672/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
..>
[0/0]  99% │███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉│ 3239/3240 (0:02:38 / 0:00:00) 
[0/0] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:02:38 / 0:00:00) 
[0/1] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:00:04 / 0:00:00) 

and the omero key contains metadata about all four channels as expected

$ cat 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [BD2_GREEN] [00]_Time Time0000.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD2_GREEN",
      "window" : {
        "min" : 372.0,
        "max" : 15788.0,
        "start" : 372.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 373.0,
        "max" : 16188.0,
        "start" : 373.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 759.0,
        "max" : 8978.0,
        "start" : 759.0,
        "end" : 8978.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 237.0,
        "max" : 12339.0,
        "start" : 237.0,
        "end" : 12339.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}

Sounds like the best way forward would be to redo the whole conversion?

will-moore commented 1 year ago

@sbesson Looking at https://github.com/IDR/idr-metadata/issues/648#issuecomment-1598644029, it looks like I used the same version: bioformats2raw-0.6.0-24. But I'll delete and try again...

will-moore commented 1 year ago

Testing... (and failing!)...

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ pwd
/data/idr0026/test
$ ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.65.9-6.141023_15-45-09.03.pattern 3.65.9-6.141023_15-45-09.03.ome.zarr

cat 3.65.9-6.141023_15-45-09.03.ome.zarr/0/.zattrs

{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "15-45-09_PMT - PMT [BD7_RED] [03]_Time Time0074.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 332.0,
        "max" : 15788.0,
        "start" : 332.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}
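
A quick way to spot this failure without opening a viewer is to count the entries under `omero.channels`. The sketch below is deliberately crude: it assumes the pretty-printed `.zattrs` layout shown above, where each channel has exactly one `"label"` line, and both function names are hypothetical. A JSON-aware tool would be more robust.

```shell
#!/usr/bin/env bash
# Count channel entries by counting "label" keys in a .zattrs file.
count_omero_channels() {
  grep -c '"label"' "$1"
}

# Report every converted image whose channel count differs from the expected one.
# $1 = directory holding the *.ome.zarr outputs, $2 = expected channel count.
check_channels() {
  local zattrs n
  for zattrs in "$1"/*.ome.zarr/0/.zattrs; do
    [ -e "$zattrs" ] || continue
    n=$(count_omero_channels "$zattrs")
    [ "$n" -eq "$2" ] || echo "$zattrs: $n channel(s), expected $2"
  done
}

# Usage (not run here): check_channels /data/idr0026/test 4
```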

will-moore commented 1 year ago

Even using the same lib as @sbesson gives me the same result?!

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

sbesson commented 1 year ago

I suspect there's something wrong with your environment and particularly the bioformats2raw Conda environment that you're using. Can you try deactivating Conda completely and simply running /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr ?

will-moore commented 1 year ago

That didn't work either!

I wanted to try on a different machine completely...

$ ssh pilot-zarr2-dev
$ cd /data
$ sudo mkdir idr0026
$ sudo chown wmoore idr0026
$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp632426155291925201/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@6ce139a4): java.io.FileNotFoundException: /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern (No such file or directory)

$ cd /uod/idr/metadata/
$ ls
idr0010-doil-dnadamage  idr0054-segura-tonsilhyperion

Do I need to clone all of https://github.com/IDR/idr-metadata here?

will-moore commented 1 year ago

Just confirming...

(base) [wmoore@pilot-zarr1-dev idr0026]$ mkdir test3
(base) [wmoore@pilot-zarr1-dev idr0026]$ cd test3 
(base) [wmoore@pilot-zarr1-dev test3]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp7313185331649622917/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2023-06-29 10:14:29,998 [main] WARN  loci.formats.in.BaseTiffReader - unknown creation date format: 2014-09-22 11:34:50

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

sbesson commented 1 year ago

> Do I need to clone all of https://github.com/IDR/idr-metadata here?

For the sake of testing, you might just want to copy the single .pattern file you want to test directly. Otherwise, yes, you need to clone the whole repository, unless @francesw wants to look into extracting idr0026 into a standalone Git repository.

> Just confirming...

The inconsistency in the output is very concerning. Have you tried after fully deactivating Conda, not just your environment?

will-moore commented 1 year ago

> Have you tried after fully deactivating Conda, not just your environment?

No. How do you do that?

sbesson commented 1 year ago

conda deactivate

will-moore commented 1 year ago

I already did that. How's that different from deactivating your environment?

I tried on a different machine... Cloned idr-metadata and moved it to /uod/idr/metadata

cd /uod/idr/metadata/
sudo -Es git clone git@github.com:IDR/idr-metadata.git
cd ../
sudo mv metadata/idr-metadata ./
sudo rm metadata   # symlink to /data/idr-metadata
sudo mv idr-metadata metadata

Then tried...

cd /data/idr0026/
[wmoore@pilot-zarr2-dev idr0026]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

WAT!?

sbesson commented 1 year ago

@will-moore I think I found the source of the issue. Can you try one more test with your last configuration, running sudo /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr (note the sudo at the beginning of the command)?

will-moore commented 1 year ago

👍 - yes that worked - 4 channels

sbesson commented 1 year ago

So the difference is really whether the directory containing the pattern file is writeable or not, meaning that a memo file can be saved or not. I suspect something incorrect is happening in the case where a memo file cannot be written and the wrong reader is used. It might be specific to either FilePatternReader or possibly the IDR customizations.

Either way for the scope of this testing, I think the solution is to re-run the conversion using the same user that owns the patterns directory
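
Given that diagnosis, a pre-flight check before a long conversion could look like the sketch below. It only tests whether the current user can write to the directory that holds the pattern file, which is where the memo file is saved according to the explanation above. `can_write_memo` is a hypothetical helper, not part of bioformats2raw.

```shell
#!/usr/bin/env bash
# Return 0 if the directory containing pattern file $1 exists and is writable
# by the current user, so that a memo file could be saved next to the pattern.
can_write_memo() {
  local dir
  dir=$(dirname "$1")
  [ -d "$dir" ] && [ -w "$dir" ]
}

# Usage (not run here):
# p=/uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern
# can_write_memo "$p" || echo "WARNING: $(dirname "$p") is not writable, memo file cannot be saved"
```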

will-moore commented 1 year ago

OK I've chowned that dir to me...

(base) [wmoore@pilot-zarr1-dev ~]$ cd /uod/idr/metadata/idr0026-weigelin-immunotherapy/
(base) [wmoore@pilot-zarr1-dev idr0026-weigelin-immunotherapy]$ sudo chown wmoore patterns/

I see that in the original conversion I used --memo-directory ../memo which is an alternative solution, right? Except that probably also specified an existing dir that wasn't mine. However, the chown above seems to work, so now running full conversion in a screen

cd /data/idr0026/idr0026
for i in `ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/`; do echo $i; /home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/$i ${i%.*}.ome.zarr; done

will-moore commented 1 year ago

Meanwhile, deleted invalid data previously uploaded...

$ ./mc rm --force --recursive uk1s3/idr0026/zarr

will-moore commented 1 year ago

The above removal of contents of the bucket ran very slowly and has only resulted in the removal of a handful of zarr filesets out of the 111 originally there.

Since we want to delete ALL the filesets uploaded, it's probably quicker to delete the bucket and recreate it.

Ran:

./mc rb --force uk1s3/idr0026

This seemed to hang/time-out and doesn't seem to have had any effect:

$ ./mc ls uk1s3/idr0026/zarr | wc
     95     475    6719

Reverted to running the rm again in a screen

./mc rm --force --recursive uk1s3/idr0026/zarr

will-moore commented 1 year ago

@sbesson - Seems that the memo issue is something it would be good to fix (or at least warn) to prevent others suffering the pain above! I can create an issue somewhere, but where?

sbesson commented 1 year ago

From my side, the immediate candidates are:

Possibly the outstanding action would be to retest a similar scenario using bioformats2raw 0.7.0, a multi-channel pattern dataset and identify whether it's IDR specific. /cc @melissalinkert

will-moore commented 1 year ago

Ah - apologies @sbesson: I just realised you meant that there is probably just 1 issue (not 4) but it needs testing to determine where the issue lies!

sbesson commented 1 year ago

Retested with a simpler version of the pattern file with 2 timepoints compatible with upstream Bio-Formats

cat 3.49.6-3.140922_11-33-57.00.pattern 
/uod/idr/filesets/idr0026-weigelin-immunotherapy/20170222-symlinks/PNAS_2015/treatment start day 3/mouse 49/day 6-3/time lapse/140922_11-33-57/11-33-57_PMT - PMT [<BD2_GREEN,BD8_DEEPR,BD7_RED,FD6_FDRED>] [00]_Time Time<0000-0001>.tif

Placed a copy of this pattern file under patterns owned by a different user and executed the two following commands:

/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw patterns/3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00_2.zarr

The .zattrs are identical between both conversions and contain omero metadata for the four channels specified in the pattern file:

(base) [sbesson@pilot-zarr2-dev tmp]$ diff 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 3.49.6-3.140922_11-33-57.00_2.zarr/0/.zattrs
(base) [sbesson@pilot-zarr2-dev tmp]$ tail -n 50 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 401.0,
        "max" : 16188.0,
        "start" : 401.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 801.0,
        "max" : 8055.0,
        "start" : 801.0,
        "end" : 8055.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 250.0,
        "max" : 10867.0,
        "start" : 250.0,
        "end" : 10867.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}

Based on the above, I am leaning towards options 1 and 2, i.e. it's an IDR/bioformats-specific issue, which will probably be classified as wontfix, as one of the aims of the ongoing conversion work is to get rid of this fork entirely.

will-moore commented 1 year ago

Started creating zips in a Screen

cd /data/idr0026
for i in */; do zip -r "${i%/}.zip" "$i"; done

will-moore commented 1 year ago

With all the previous Filesets deleted from s3, uploaded just a couple of different new ones to test...

(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/3.49.6-3.140922_11-33-57.00.ome.zarr uk1s3/idr0026/zarr/3.49.6-3.140922_11-33-57.00.ome.zarr
...zarr/OME/METADATA.ome.xml: 2.78 GiB / 2.78 GiB ━━━━━━━━━━━━━━━ 61.69 MiB/s 46s
(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/7.56.10-3.140926_14-52-18.03.ome.zarr uk1s3/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr
...zarr/OME/METADATA.ome.xml: 7.69 GiB / 7.69 GiB ━━━━━━━━━━━━━━━ 73.85 MiB/s 1m46s

Oops - got an extra directory in there, but the images look good:

https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/0/

will-moore commented 1 year ago

Uploading zips to BioStudies...

(base) [wmoore@pilot-zarr1-dev bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/idr0026/idr0026 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-xxxxxxxxx

will-moore commented 1 year ago

(base) [wmoore@pilot-zarr1-dev data]$ sudo rm -rf idr0026/

will-moore commented 1 year ago

Currently we have 20 out of 111 Filesets "viewable" at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD860.html...

idr0026/3.66.6-3.141020_15-41-29.02.ome.zarr,S-BIAD860/04219d38-3c9a-4ed7-97ba-65e8538b1e73,23273
idr0026/3.67.9-6.141023_12-39-26.04.ome.zarr,S-BIAD860/1506a279-9c9d-4fcc-b5ff-a89bacb80c11,23335
idr0026/3.66.6-3.141020_17-15-27.04.ome.zarr,S-BIAD860/1d535c04-916e-47a7-857f-f731aa1f1951,23280
idr0026/3.65.6-3.141020_15-39-00.02.ome.zarr,S-BIAD860/1e0d94df-af47-432e-917f-48687290f336,23377
idr0026/3.65.6-3.141020_17-15-07.04.ome.zarr,S-BIAD860/2e2d2806-53df-4c35-a9be-25c7ca53699d,23384
idr0026/3.66.9-6.141020_15-41-29.01.ome.zarr,S-BIAD860/2f3e36a6-05d8-4a60-9f4e-d8b87e5d8fdf,23302
idr0026/3.66.9-6.141020_15-41-29.00.ome.zarr,S-BIAD860/3b8e0297-c95c-4460-adf8-75a29bfc132b,23301
idr0026/3.66.9-6.141020_15-41-29.03.ome.zarr,S-BIAD860/487f0bdd-a020-4cff-bfcb-887edd21c9ca,23304
idr0026/3.66.6-3.141020_15-41-29.04.ome.zarr,S-BIAD860/4dab8ca2-3511-43c0-a0e9-9ec1a87aabb6,23275
idr0026/3.66.9-6.141023_15-49-01.00.ome.zarr,S-BIAD860/519ad2f4-0f5a-4ad4-ac6f-5535573f11bf,23311
idr0026/3.66.6-3.141020_15-41-29.03.ome.zarr,S-BIAD860/5a578c22-3dac-456a-ac08-b240c85c7b8a,23274
idr0026/7.51.10-3.140926_10-43-58.00.ome.zarr,S-BIAD860/7b7cc2ee-5dfd-445d-a0b4-4f58448486d0,23415
idr0026/3.65.6-3.141020_15-39-00.04.ome.zarr,S-BIAD860/7ee9776a-95ec-4861-950e-c6f0884ef27b,23379
idr0026/7.48.10-3.140926_12-18-43.00.ome.zarr,S-BIAD860/9640c08d-8cba-4e32-a32d-f593b230fadf,23445
idr0026/7.48.10-3.140926_12-18-43.02.ome.zarr,S-BIAD860/a1b618b9-4e99-4c91-95d5-fbcf45f44109,23447
idr0026/3.65.9-6.141023_15-45-09.03.ome.zarr,S-BIAD860/aa0dece8-179b-4f72-9468-df0ad91a1c20,23408
idr0026/3.66.6-3.141020_15-41-29.01.ome.zarr,S-BIAD860/aef6ffa0-5360-49f2-aa89-1f52b924cc3a,23272
idr0026/3.64.9-6.141023_12-21-30.02.ome.zarr,S-BIAD860/cc6b7eac-c829-463f-aa52-14007014da5b,23397
idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418
idr0026/7.51.10-3.140926_10-43-58.02.ome.zarr,S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb,23417

for r in $(cat idr0026.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
done

...Found prefix demo_2/2017-04/13 // 07-17-06.573 for fileset 23418
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr -> /bia-integrator-data/S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2017-04/13 // 07-06-10.670 for fileset 23417
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr -> /bia-integrator-data/S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr

for r in $(cat idr0026.csv); do
  fsid=$(echo $r | cut -d',' -f3)
  psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
done

...
BEGIN
 mkngff_fileset 
----------------
        5287479
(1 row)
COMMIT
BEGIN
 mkngff_fileset 
----------------
        5287480
(1 row)
COMMIT
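
The two loops above split each CSV row with `cut`. An equivalent sketch using plain parameter expansion is shown below; it also builds the symlink target in the same form as the `mkngff` log lines. `parse_row` is a hypothetical helper, and nothing here calls `omero` or `psql`.

```shell
#!/usr/bin/env bash
# Split one "zarr_name,biapath,fileset_id" row and print:
#   <fileset id> <uuid> <symlink target>
parse_row() {
  local rest biapath fsid uuid
  rest=${1#*,}             # drop the first column (zarr name)
  biapath=${rest%%,*}      # e.g. S-BIAD860/<uuid>
  fsid=${rest#*,}          # OMERO Fileset ID
  uuid=${biapath#*/}       # part after "S-BIAD860/"
  echo "$fsid $uuid /bia-integrator-data/$biapath/$uuid.zarr"
}

parse_row "idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418"

# To process the whole file (not run here):
# while IFS= read -r row; do parse_row "$row"; done < idr0026.csv
```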

will-moore commented 1 year ago

All good (missing thumbnails in screenshot are for images not included in the 20 updated by mkngff above):

Screenshot 2023-08-29 at 19 23 50

will-moore commented 1 year ago

Testing on idr-testing:omeroreadwrite...

Updated to today's OMEZarrReader.jar (only on omeroreadwrite server - not proxies).

Use all 111 Images in idr0026.csv - see https://github.com/IDR/idr-utils/pull/56/commits/003b3a33d40f12455cbc69db75573e051c22331c

Started mkngff at 10:37...

will-moore commented 1 year ago

mkngff just done (nearly 12:00). Apply SQL and view image on just the readwrite server with ssh -A idr-testing.openmicroscopy.org -L 1080:omeroreadwrite:80, e.g. http://localhost:1080/webclient/?show=image-3261651

$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "13-14-13.681_mkngff"
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/.METADATA.ome.xml.bfmemo (39334 bytes)
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) start[1694516354319] time[25869] tag[loci.formats.Memoizer.setId]
2023-09-12 10:59:40,189 INFO  [                ome.io.nio.PixelsService] (l.Server-2) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/METADATA.ome.xml Series: 0

25869ms is 26 secs for setId
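
To pull the same number out of many log lines at once, something like the following could be used. It is a sketch that assumes the `time[...] tag[loci.formats.Memoizer.setId]` layout seen in the log excerpt above; `setid_seconds` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read Blitz log lines on stdin and print the setId duration of each
# Memoizer.setId entry in seconds (the log reports milliseconds).
setid_seconds() {
  grep -F 'tag[loci.formats.Memoizer.setId]' \
    | sed -n 's/.*time\[\([0-9][0-9]*\)].*/\1/p' \
    | awk '{ printf "%.1f\n", $1 / 1000 }'
}

# Usage (not run here):
# setid_seconds < /opt/omero/server/OMERO.server/var/log/Blitz-0.log
```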

imagesc-bot commented 4 months ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/file-format-to-store-images-using-ngff-coverter/98320/10