ome / omero-cli-zarr

https://pypi.org/project/omero-cli-zarr/
GNU General Public License v2.0
15 stars 10 forks source link

Support exporting dataset #83

Closed dominikl closed 2 years ago

dominikl commented 2 years ago

Attempt to add 'export dataset' functionality. Exports all images of a dataset into a directory with the dataset id as name. Export of the images happens in parallel using dask. But something's not quite right. After running for a while you'll get a out of Java heap space error on the server side:

...
File "/home/dlindner/miniconda3/lib/python3.9/site-packages/omero_api_RawPixelsStore_ice.py", line 1199, in getPlane
    return _M_omero.api.RawPixelsStore._op_getPlane.invoke(self, ((z, c, t), _ctx))
omero.InternalException: exception ::omero::InternalException
{
    serverStackTrace = ome.conditions.InternalException:  Wrapped Exception: (java.lang.OutOfMemoryError):
Java heap space
        at loci.formats.tiff.TiffParser.getSamples(TiffParser.java:1030)

Is that an error on the client side (session not closed, etc.) or a server side issue? Any ideas @sbesson @joshmoore ?

joshmoore commented 2 years ago

I don't see a missed close() or similar, so perhaps just the total number of calls to https://github.com/ome/omero-cli-zarr/blob/5d66d65f6b0cf955e4d07f44ac75a1a921825493/src/omero_zarr/raw_pixels.py#L111 ?

will-moore commented 2 years ago

@dominikl That's great. I wonder if you could add .zattrs into the top-level Dataset dir, to list the paths to the images.

"collection": {
  "images": {
      "image1": {}, "image2": {}, "image3": {}, "image4": {}, "image5": {},
  },

I think that represents the summary of the discussion at https://github.com/ome/ngff/issues/31

We probably want to consider some different naming of Dataset and Images (not just ID.zarr)? That's not part of the spec, but if you're browsing the "collection" in another client, you probably want to know the Image names at the top, instead of having to open the .zattrs of each Image to get the names.

I guess we could do:

"collection": { "images": { "123.zarr": {"name": "image1.tiff"}, "456.zarr": {"name": "image2.tiff"}, },


but that's less nice. Using IDs as paths on the file-system is also not so human readable.

I wonder if we want to convert e.g. "image1.tiff" into "image1" since it's not really a Tiff anymore!?
dominikl commented 2 years ago

Yes, I guess in the end the dataset has to be a "zarr" itself with the appropriate metadata instead of just a directory. The PR is more a draft, trying to export images in parallel, which unfortunately doesn't work very well, needs some more debugging. I agree too, names are nicer than meaningless Ids, but leads to conflicts.

sbesson commented 2 years ago

Even without the addition of .zattrs as the discussion is ongoing, I would say structuring the top-level directory as a Zarr group should already allow the different images to be grouped in agreement with the Zarr specification.

Incidentally this raises the question of whether ome_zarr info should be updated to detect and report on this type of hierarchies. Another use case would be the Zarr groups created by bioformat2raw.

dominikl commented 2 years ago

Tested locally, doesn't work very well either. After a while the export dies with (even with only 3 threads):

ERROR:omero.gateway:Failed to getPlane() or getTile() from rawPixelsStore
Traceback (most recent call last):
  File "/Users/dom/miniconda3/envs/zarr/lib/python3.9/site-packages/omero/gateway/__init__.py", line 7466, in getTiles
    rawPlane = rawPixelsStore.getPlane(z, c, t)
  File "/Users/dom/miniconda3/envs/zarr/lib/python3.9/site-packages/omero/gateway/__init__.py", line 4796, in __call__
    return self.handle_exception(e, *args, **kwargs)
  File "/Users/dom/miniconda3/envs/zarr/lib/python3.9/site-packages/omero/gateway/__init__.py", line 4793, in __call__
    return self.f(*args, **kwargs)
  File "/Users/dom/miniconda3/envs/zarr/lib/python3.9/site-packages/omero_api_RawPixelsStore_ice.py", line 1199, in getPlane
    return _M_omero.api.RawPixelsStore._op_getPlane.invoke(self, ((z, c, t), _ctx))
Ice.UnknownLocalException: exception ::Ice::UnknownLocalException
{
    unknown = ../../include/Ice/BasicStream.h:307: Ice::EncapsulationException:
protocol error: illegal encapsulation
  0 IceUtil::Exception::Exception(char const*, int) in /opt/ice-3.6.5/bin/../lib64/libIceUtil.so.36
  1 Ice::LocalException::LocalException(char const*, int) in /opt/ice-3.6.5/bin/../lib64/libIce.so.36
...

Does that mean the RawPixelsStore simply can't be called in parallel? Or is the parallel omero.cli.cli_login() call in order to create sessions the issue, and if so is there an alternative?

joshmoore commented 2 years ago

Does that mean the RawPixelsStore simply can't be called in parallel?

A single RawPixelsStore should block multiple access, i.e. it should be thread-safe but not concurrent. Can you find the underlying exception that was thrown?

dominikl commented 2 years ago

It was actually an out of heap space error too. Ran it with just one thread, got it too:

2021-09-22 11:54:35,192 ERROR [            ome.services.throttling.Task] (l.Server-0) Failed to invoke: ome.services.throttling.Callback@4c7969ed (omero.api._AMD_RawPixelsStore_getPlane@47a90f73 )
java.lang.reflect.InvocationTargetException: null
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
...
Caused by: java.lang.OutOfMemoryError: Java heap space

Is there a memory leak somewhere on the server side when repeatedly calling RawPixelsStore.getPlane()?

joshmoore commented 2 years ago

The only route I know of is If the pixel store doesn't get closed and more are created.

dominikl commented 2 years ago

I think the problem is that I create a omero.gateway.BlitzGateway(client_obj=c.get_client()) for each image, but don't close it (so probably the pixel store isn't closed either). However I can't close it as it also kills the session. Is there a way to close the BlitzGateway without closing the session @will-moore ?

will-moore commented 2 years ago

the add_image() uses BlitzGateway's getPlanes() -> getTiles() and this should close the raw pixel store after the last plane is requested: https://github.com/ome/omero-py/blob/e99bcbbdfdbfffdff9beaa954b5bb2143f23effa/src/omero/gateway/__init__.py#L7553

You could also try conn.close(False) which should close all the various proxy services but not kill the session: https://github.com/ome/omero-py/blob/e99bcbbdfdbfffdff9beaa954b5bb2143f23effa/src/omero/gateway/__init__.py#L1965

dominikl commented 2 years ago

Thanks. I tried conn.close(False) which seemed to work, but after a while it also closes the session. I found a way now which works. But the RawPixelsstore has to be closed explicitly. There seems to be a leak somewhere in getPlanes()/getTiles(), but I couldn't see it, the methods look fine to me.

will-moore commented 2 years ago

I wonder if some images aren't getting all their planes downloaded - so they don't get closed. Might be interesting to see if conn.c.getStatefulServices() gives you anything when you're closing the pixelstore and log e.g. image name to check if the image all got downloaded OK. Also see https://github.com/ome/omero-py/blob/master/src/omero/gateway/__init__.py#L1650 conn._assert_unregistered() that does this.

NB: We discussed an alternative export strategy (when you don't have access to the binary repo):

This means that the server is doing much less work and maybe more parallelisation is possible. I could start looking at that approach if it sounds like a good idea (and if you don't want to take it on)? Potential down-side: possible to download more data than you want e.g. for MIF where some images are in a different Dataset, you'd still download them.

dominikl commented 2 years ago

👍 Would be good to have different strategies. So far there would be three different ways to do it:

will-moore commented 2 years ago

So, with a bit more testing, I realise that my approach of blindly using the Image name for the zarr image group causes issues with various characters, such as / (obviously) but also [ and ] I think. Needs a thorough test of all special characters to see what needs replacing or escaping. Happy to look into this...?

dominikl commented 2 years ago

Ah true, you'd have to escape all posix and windows special characters, that can be tricky. How about using UUIDs as directory names (instead of the arbitrary omero IDs), and put the names into the metadata? That means one always has to lookup the metadata to make sense of the directory structure, but saves the hassle of handling all sorts of special characters.

dominikl commented 2 years ago

Superseded by #88