will-moore / omero-mdv

Saving MDV configs in OMERO.mdv #2

Closed will-moore closed 11 months ago

will-moore commented 1 year ago

Aims

OMERO Intro:

We have various hierarchies in OMERO, e.g. Project > Dataset > Image and Screen > Plate > Well > Image.

OMERO.tables are HDF5-based tables stored as File-Attachments, typically linked to a "Container" (e.g. Project, Dataset, Screen, Plate or Image), with each table row annotating an Image or ROI within that container. OMERO.tables tend to have a column such as Image or ROI that identifies the object each row applies to.

It's also possible to have CSV files with a similar structure.

OMERO also supports Key-Value pairs, which are named-value annotations typically linked to Images (or possibly Wells), and Tags (simple labels) linked to Images.

Loading Data

I'd like users to be able to choose a bunch of data sources in OMERO and combine them into a single MDV table.

An important question is how to combine this data, i.e. what is the "Primary Key" we use when combining tables? This will mostly be Image ID if we're combining Key-Value pairs or Dataset names with OMERO.tables. But it could be ROI ID if combining multiple OMERO.tables or CSV files. Also important is the ordering of rows. MDV expects all columns of data to be streamed as if from one table (in the same ordering). So, if we combine Key-Value pairs or Dataset Names with an OMERO.table, we should order all the data sources the same way.

E.g. choose OMERO.table-ID:1 and All Key-Value pairs on Images in a Project.

OMERO.table ID:1 could be like this

| Image | Cell_Count |
| ----- | ---------- |
| 123 | 10 |
| 456 | 11 |
| 234 | 12 |

And we wish to combine it with Key-Value pairs like:

- Image:123   Drug: Nocodazole, Dead cells: True
- Image:234  Drug: DMSO, Dead cells: False
- Image:456  Drug: X

Then we'd need to stream data to MDV in columns whose rows follow the order of the Image column in the OMERO.table, like:

| Image | Cell_Count | Drug | Dead cells |
| ----- | ---------- | ---- | ---------- |
| 123 | 10 | Nocodazole | True |
| 456 | 11 | X | |
| 234 | 12 | DMSO | False |

So when MDV makes an API call for the "Drug" column we need to know that this should be re-ordered according to the Image column of OMERO.table ID:1. This means that there is a bit more work to do when loading data: For every "column" of the Key-Value pairs data, we need to load all the Key-Value pairs for those images, load the Image IDs from the OMERO table and pick the Value for "Drug" for each image in turn!
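A minimal sketch of that re-ordering step, using the example data above. The function name `reorder_column` and the in-memory dict of Key-Value pairs are assumptions for illustration only; in practice the Image IDs would come from the OMERO.table and the Key-Value pairs from a query on the server.

```python
def reorder_column(table_image_ids, kv_by_image, key, missing=""):
    """Return the values for `key`, one per table row, in table row order.

    table_image_ids: Image IDs in the order they appear in the OMERO.table.
    kv_by_image: {image_id: {key: value}} built from the Key-Value pairs.
    Images without the key get the `missing` placeholder.
    """
    return [kv_by_image.get(image_id, {}).get(key, missing)
            for image_id in table_image_ids]


# Image column of OMERO.table ID:1 (note: not sorted by ID)
table_image_ids = [123, 456, 234]

# Key-Value pairs loaded for the same Images
kv_by_image = {
    123: {"Drug": "Nocodazole", "Dead cells": "True"},
    234: {"Drug": "DMSO", "Dead cells": "False"},
    456: {"Drug": "X"},
}

drug_column = reorder_column(table_image_ids, kv_by_image, "Drug")
dead_cells_column = reorder_column(table_image_ids, kv_by_image, "Dead cells")
```

Both the Image IDs and the full set of Key-Value pairs have to be loaded for every column request, which is the extra work described above.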

One option to improve this performance is to copy all the Key-Value pairs into a new OMERO.table at the start (when the user chooses to view this data in MDV). However, this copied data could get stale if the user updates Key-Value pairs in OMERO.

If the user doesn't include an OMERO.table in the data they want to view in MDV, e.g. simply combine Key-Value Pairs with Dataset Names, then we need some other consistent ordering of Images. E.g. simply sort data by Image IDs? But this could still be problematic if the user adds an Image to the Dataset after opening of MDV (column lengths etc calculated) but before the loading of a column of data.
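One possible way around that, sketched under the assumption that we record the row order once when MDV is opened (for example in a saved config) and reuse it for every later column request. `snapshot_row_order` and `column_for_snapshot` are hypothetical names, not existing OMERO.mdv functions.

```python
def snapshot_row_order(image_ids):
    """Fix the row order once, when MDV is opened: sorted, de-duplicated Image IDs."""
    return sorted(set(image_ids))


def column_for_snapshot(row_order, values_by_image, missing=""):
    """Build a column against the saved row order.

    Images added after the snapshot are ignored, and Images that have since
    lost their value get the `missing` placeholder, so the column length
    always matches what MDV was told at the start.
    """
    return [values_by_image.get(image_id, missing) for image_id in row_order]


# At open time the Dataset contains three Images
row_order = snapshot_row_order([456, 123, 234])

# Later, Image 789 has been added to the Dataset before this column is loaded
dataset_names_now = {123: "Treatment", 234: "Treatment", 456: "Control", 789: "Control"}
dataset_column = column_for_snapshot(row_order, dataset_names_now)
```

The trade-off is the same as with copying data into a new OMERO.table: the view stays consistent, but newly added Images only show up after the snapshot is regenerated.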

Combining Images and ROIs

When we're dealing with ROIs, we may want to combine ROI data with Image data. E.g. an OMERO.table of ROIs combined with the Image Names or Dataset Names.

ROI table:

| Image | ROI | Width | Area |
| ----- | --- | ----- | ---- |
| 123 | 100 | 13.5 | 54.0 |
| 123 | 101 | 15.1 | 63.2 |
| 123 | 102 | 45.0 | 150 |
| 456 | 103 | 16.4 | 200.1 |

Dataset Names:

 - Image:123   Treatment
 - Image:456   Control

We could combine these into a single table, e.g. as a single CSV table like the one created in https://youtu.be/X5EvQQGScYM?si=R_TuJWkgSN8jQiER&t=44

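| Image | ROI | Width | Area | Dataset |
| ----- | --- | ----- | ---- | ------- |
| 123 | 100 | 13.5 | 54.0 | Treatment |
| 123 | 101 | 15.1 | 63.2 | Treatment |
| 123 | 102 | 45.0 | 150 | Treatment |
| 456 | 103 | 16.4 | 200.1 | Control |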

To load data for the "Dataset" column, we'd again need to know the order of IDs in the ROI column, load the Dataset Names (joined with Images and ROIs) and return the Dataset Names ordered according to ROI IDs.

An alternative is possibly to use multiple Tables in MDV: one to represent Images and another to represent ROIs?
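As a rough sketch of that join, using the ROI table and Dataset Names from the example: each ROI row carries its Image ID, so the Dataset Name can be looked up per row and repeated for every ROI on the same Image. The name `dataset_column_for_rois` and the in-memory structures are assumptions for illustration.

```python
def dataset_column_for_rois(roi_rows, dataset_by_image, missing=""):
    """Return one Dataset Name per ROI row, in the ROI table's row order.

    roi_rows: (image_id, roi_id) pairs in the order they appear in the ROI table.
    dataset_by_image: {image_id: dataset_name} loaded from OMERO.
    """
    return [dataset_by_image.get(image_id, missing)
            for image_id, _roi_id in roi_rows]


# Image and ROI columns of the ROI table
roi_rows = [(123, 100), (123, 101), (123, 102), (456, 103)]

# Dataset Names per Image
dataset_by_image = {123: "Treatment", 456: "Control"}

roi_dataset_column = dataset_column_for_rois(roi_rows, dataset_by_image)
```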

OMERO-MDV config file

All the data sources for a single MDV view could be specified in a single OMERO.mdv JSON file, saved to OMERO as a FileAnnotation (same as OMERO.figure). This could also be used to save the MDV views that a user creates. When they open MDV initially, we could generate a starting view that just shows the table or a single plot and maybe an image viewer (as we currently do https://user-images.githubusercontent.com/900055/245738983-969b86c9-f44b-479d-8abe-3b20e568cf5d.png).

Then we can open that OMERO.mdv-config JSON in MDV with /omero-mdv/?dir=file/ID/ and OMERO.mdv would know how to generate the responses to MDV's requests.

E.g. "columns" is like in datasources.json but with info on which OMERO class/object the data comes from, and possibly the byte-length of the data in each, which can be used to generate the datasource_name.json response.

{
  "datasources": [
    {
      "name": "cells",
      "columns": [
        { "name": "Gene", "datatype": "text", "omero": "MapAnnotation", "bytes": 1234},
        { "name": "Cell count", "datatype": "integer", "omero": "Table:123", "bytes": 2345},
        { "name": "Cell area", "datatype": "double", "omero": "Table:123", "bytes": 4589}
      ]
    }
  ],
  "views": {
    "main": {"initialCharts": {"cells": [...]} },
    "dashboard": {"initialCharts": {"cells": [...]} }
  }
}
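A small sketch of how such a config might be read, assuming the structure shown above. Two things here are my assumptions rather than anything confirmed: that grouping columns by their "omero" source is useful (so each Table or annotation type is loaded once), and that the "bytes" values can be turned into running [start, end) byte ranges for the datasource_name.json response. The exact format MDV expects would need checking.

```python
import json

# Example config matching the structure above (chart lists left empty here)
CONFIG_JSON = """
{
  "datasources": [
    {
      "name": "cells",
      "columns": [
        {"name": "Gene", "datatype": "text", "omero": "MapAnnotation", "bytes": 1234},
        {"name": "Cell count", "datatype": "integer", "omero": "Table:123", "bytes": 2345},
        {"name": "Cell area", "datatype": "double", "omero": "Table:123", "bytes": 4589}
      ]
    }
  ],
  "views": {
    "main": {"initialCharts": {"cells": []}},
    "dashboard": {"initialCharts": {"cells": []}}
  }
}
"""


def columns_by_source(datasource):
    """Group column names by the OMERO object they are loaded from."""
    grouped = {}
    for col in datasource["columns"]:
        grouped.setdefault(col["omero"], []).append(col["name"])
    return grouped


def byte_ranges(datasource):
    """Assumed layout: columns stored back to back, end offset exclusive."""
    ranges = {}
    start = 0
    for col in datasource["columns"]:
        end = start + col["bytes"]
        ranges[col["name"]] = (start, end)
        start = end
    return ranges


config = json.loads(CONFIG_JSON)
cells = config["datasources"][0]
by_source = columns_by_source(cells)
offsets = byte_ranges(cells)
```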

Saving Annotations

Annotations added in MDV could be saved as Key-Value pairs. I don't know if it's possible to edit existing data (Key-Value pairs, Tags, Ratings) and save to update them in OMERO?

cc @xinaesthete @martinSergeant