Open sbesson opened 4 years ago
Few thoughts:
bfconvert
?At the least, when downloader offers some friendly customizable directory structure (even if just as links into the weird one) it would be good to ensure that it's possible to express whatever the string contexts here can. Oddly I can't currently find a corresponding downloader issue/card but it's on the radar anyway.
I spent a minimal amount of time on omero-downloader, testing the binary retrieval of an entire dataset for comparison. At the moment, the layout produced by the utility is very focused on mimicking the state of the source database:
Fileset/<fileset_id>/Binary/<filename>
Image/<image_id>/Binary/<filename>
Repository/<repository_id>/<username_userID>/<YYYY-MM>/...
While this is probably the most unambiguous construction, two comments specifically in the context of an export/import workflow:
- the Project/Dataset information seems to be lost. For instance, it would not be possible to reimport the same structure without additional information.

Re handling multiple files, I think this is still not a feature of the core YAML. Libraries could certainly have their own implementation, like we already do in bulk imports for instance. In the medium term, this might be another data type that we would like the next-generation file format to handle gracefully.
Generally, I am all for unifying the semantics of the patterns for our core concepts (image name...) across the board wherever possible, from Bio-Formats to OMERO.
OMERO.downloader is heavily tied to the OME data model and what OMERO provides for working with it. Container data is lost because OmeroMetadata is so very incomplete; ideally that would be code-generated. It's an even more stark omission for HCS. Also, rendering settings are not captured in the data model. Were they, then they would probably end up in some subdirectory of Image/<image_id>/, but on the roadmap for a friendly UI (needs a design phase first) is to allow specifying an arbitrary layout that can copy or symlink into the server-side layout that is needed for knowing what's already downloaded and how to assemble XML from it.
(Or a "downloader gateway" could simply know how the folder layout works and provide nice API calls for opening data and navigating the links.)
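As a rough illustration of the "symlink into the server-side layout" idea, here is a minimal Python sketch. It is not OMERO.downloader code: the function name `link_into_friendly_layout`, the `by-name` folder, and the example file names are all hypothetical, and only the `Image/<image_id>/Binary/<filename>` shape comes from the layout quoted earlier in the thread.

```python
import os
import tempfile
from pathlib import Path


def link_into_friendly_layout(source_file: Path, friendly_path: Path) -> Path:
    """Create a relative symlink at friendly_path that points at source_file.

    Hypothetical helper: the downloaded (server-mirroring) file stays where
    it is, and the friendly layout only holds links.
    """
    friendly_path.parent.mkdir(parents=True, exist_ok=True)
    # A relative target keeps the link valid if the whole root is moved.
    target = os.path.relpath(source_file, start=friendly_path.parent)
    if friendly_path.is_symlink() or friendly_path.exists():
        friendly_path.unlink()
    friendly_path.symlink_to(target)
    return friendly_path


def demo() -> str:
    """Build a tiny fake download tree and read a file through its link."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Server-side style location (image id 123 is made up).
        binary = root / "Image" / "123" / "Binary" / "cells.tiff"
        binary.parent.mkdir(parents=True)
        binary.write_text("pixels")
        # Friendly location chosen by the user (names are made up).
        link = link_into_friendly_layout(
            binary, root / "by-name" / "my_project" / "my_dataset" / "cells.tiff"
        )
        return link.read_text()
```

Using links rather than copies means the server-side tree remains the single record of what has already been downloaded, which is the property the comment above wants to preserve. Symlink creation may require extra privileges on some platforms.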
Is there a reason why we can't just get your two commits into the render plugin @sbesson ? This would be very helpful. For most IDR datasets these days "cloning" the rendering settings from the pilot into the production system is quite a tedious task, which would be very much simplified by this.
@dominikl Sorry for dropping the ball on this. A few thoughts to try and move forward:
idr0067 as well as https://github.com/ome/omero-cli-render/pull/50 make use of folders as a way to represent the hierarchy, while https://github.com/ome/omero-cli-render/pull/52 makes use of JSON/YAML structures. The proposed upcoming OME-Zarr on collections & rendering might lead us to explore a hybrid solution where the rendering settings would still be distributed in different folders, while relaxing the constraint that the subfolder path must match the project/dataset structure. Instead, this hierarchy could be stored in the top-level metadata, e.g.
experimentA/
    .zattrs        # metadata containing project/dataset information
    image_1/
        .zattrs    # metadata including the rendering settings
    image_2/
        .zattrs    # metadata including the rendering settings
    ...
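To make the hybrid idea a bit more concrete, the following Python sketch writes and reads such a tree using plain JSON files named `.zattrs`. Everything about the content is an assumption for illustration: the keys `project`, `dataset` and `rendering`, and the shape of the settings dictionary, are invented here and are not taken from any OME-Zarr specification or from the render plugin.

```python
import json
import tempfile
from pathlib import Path


def write_layout(root: Path, hierarchy: dict, settings_by_image: dict) -> None:
    """Write the top-level hierarchy metadata and one .zattrs per image."""
    root.mkdir(parents=True, exist_ok=True)
    (root / ".zattrs").write_text(json.dumps(hierarchy, indent=2))
    for image_name, settings in settings_by_image.items():
        image_dir = root / image_name
        image_dir.mkdir(exist_ok=True)
        # "rendering" is a hypothetical key, not a specified one.
        (image_dir / ".zattrs").write_text(
            json.dumps({"rendering": settings}, indent=2)
        )


def read_layout(root: Path) -> tuple:
    """Read back (hierarchy, settings_by_image) from the tree."""
    hierarchy = json.loads((root / ".zattrs").read_text())
    settings = {}
    for child in sorted(root.iterdir()):
        attrs = child / ".zattrs"
        if child.is_dir() and attrs.is_file():
            settings[child.name] = json.loads(attrs.read_text())["rendering"]
    return hierarchy, settings


def demo() -> tuple:
    """Round-trip a small made-up example through a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "experimentA"
        hierarchy = {
            "project": "my_project",  # hypothetical key and value
            "dataset": {"image_1": "dataset_A", "image_2": "dataset_B"},
        }
        settings = {
            "image_1": {"channels": {"1": {"color": "FF0000"}}},
            "image_2": {"channels": {"1": {"color": "00FF00"}}},
        }
        write_layout(root, hierarchy, settings)
        return read_layout(root)
```

The point of the sketch is only that the folder names no longer need to encode the project/dataset path: the mapping lives in the top-level file, and each image folder carries its own settings.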
As the above is largely unspecified and will take several iterations, I think it makes sense to introduce an intermediate extension of the CLI render spec that supports our needs. Either way, it would be useful to keep the above in mind as we agree on the layout. How many use cases do we want to support? HCS? Dataset-level? Image-level?
The initial logic of the rendering plugin has been primarily driven by the high-content screening use case i.e. fairly homogeneous imaging datasets where a single rendering file can be applied to all images within a plate. In the case of IDR experiments, datasets can be much more heterogeneous with mixed dimensionalities and modalities. In many cases, a single set of rendering settings is not sufficient.
The https://github.com/IDR/idr0067-king-yeastmeiosis study was fairly representative, in the sense that datasets contained fluorescence multi-C multi-Z, fluorescence multi-C maximally projected and single-channel brightfield images, and each image needed to be adjusted individually. During the curation, I used a branch with the following changes:
- the render_images API is expanded to include a string context in addition to the list of images. This context is then used for creating the directory tree, e.g. if the command target is a project, the layout is <project_name>/<dataset_name>/<image_name>.yml
- the set command is extended to be able to consume rendering settings using the layout above.

With both changes, bin/omero render info -f Project:<id> was used to recursively extract the imported rendering settings. After curation, bin/omero render set -f Project:<id> was invoked to recursively update all images in the project.

Based on recent IDR examples, my impression is that such functionality would be valuable to backport to the render plugin. Before consolidating the prototype as well as adding tests, I am opening this as an issue in case we need to capture similar use cases that should also be considered, like:
- -f %p/%d/%i.yml
/cc @joshmoore @dominikl