OMERO.tables are HDF5-based tables stored as file attachments, typically linked to a "Container" (e.g. Project, Dataset, Screen, Plate or Image), with each table row annotating an Image or ROI within that container. OMERO.tables tend to have a column such as Image or ROI that identifies the object each row applies to.
It is also possible to have CSV files with a similar structure.
OMERO also supports Key-Value pairs, which are named-value annotations typically linked to Images (or possibly Wells), and Tags (simple labels) linked to Images.
Loading Data
I'd like users to be able to choose several data sources in OMERO and combine them into a single MDV table:
- OMERO.tables / CSV files
- Key-Value pairs: e.g. all the values for key "Gene", or maybe "All" KV-pairs
- Image Names and Dataset Names (these should probably be included by default, plus Well Names for SPW data). These may already be in OMERO.tables
- Tags, Ratings, ROI count per Image?
An important question is how to combine this data, i.e. what is the "Primary Key" we use when combining tables? This will mostly be Image ID if we're combining Key-Value pairs or Dataset Names with OMERO.tables. But it could be ROI ID if combining multiple OMERO.tables or CSV files.
Also important is the ordering of rows. MDV expects all columns of data to be streamed as if from one table (in the same ordering). So, if we combine Key-Value pairs or Dataset Names with an OMERO.table, we should order all the data sources the same way.
E.g. choose OMERO.table-ID:1 and All Key-Value pairs on Images in a Project.
OMERO.table ID:1 could be like this:

| Image | Cell_Count |
| --- | --- |
| 123 | 10 |
| 456 | 11 |
| 234 | 12 |

And we wish to combine it with Key-Value pairs like:
- Image:123 Drug: Nocodazole, Dead cells: True
- Image:234 Drug: DMSO, Dead cells: False
- Image:456 Drug: X
Then we'd need to stream data to MDV in columns ordered according to Image ID like:

| Image | Cell_Count | Drug | Dead cells |
| --- | --- | --- | --- |
| 123 | 10 | Nocodazole | True |
| 456 | 11 | Drug X | |
| 234 | 12 | DMSO | False |

So when MDV makes an API call for the "Drug" column we need to know that this should be re-ordered according to the Image column of OMERO.table ID:1. This means that there is a bit more work to do when loading data: For every "column" of the Key-Value pairs data, we need to load all the Key-Value pairs for those images, load the Image IDs from the OMERO table and pick the Value for "Drug" for each image in turn!
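
The re-ordering step could look something like the minimal sketch below. It assumes the Image IDs from the table and the Key-Value pairs have already been loaded into plain Python structures; the function name `kv_column_in_table_order` and the `missing` placeholder are hypothetical, and no OMERO API calls are shown.

```python
from typing import Dict, List, Optional


def kv_column_in_table_order(
    table_image_ids: List[int],
    kv_pairs: Dict[int, Dict[str, str]],
    key: str,
    missing: Optional[str] = None,
) -> List[Optional[str]]:
    """Return one value of `key` per table row, in the table's row order.

    Images without that key get the `missing` placeholder, so the column
    always has the same length as the table.
    """
    return [kv_pairs.get(image_id, {}).get(key, missing) for image_id in table_image_ids]


# Image column of OMERO.table ID:1, in row order (from the example above)
table_image_ids = [123, 456, 234]

# Key-Value pairs per Image, in whatever order they were loaded
kv_pairs = {
    123: {"Drug": "Nocodazole", "Dead cells": "True"},
    234: {"Drug": "DMSO", "Dead cells": "False"},
    456: {"Drug": "X"},
}

drug_column = kv_column_in_table_order(table_image_ids, kv_pairs, "Drug")
dead_column = kv_column_in_table_order(table_image_ids, kv_pairs, "Dead cells")
```

Building the `kv_pairs` lookup once and reusing it for every key would avoid re-loading all the Key-Value pairs for each column request, at the cost of holding them in memory or a cache.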
One option to improve this performance is to copy all the Key-Value pairs into a new OMERO.table at the start (when user chooses to view this data in MDV). However this copied data could get stale if the user updates Key-Value pairs in OMERO.
If the user doesn't include an OMERO.table in the data they want to view in MDV, e.g. they simply combine Key-Value pairs with Dataset Names, then we need some other consistent ordering of Images. E.g. simply sort the data by Image ID? But this could still be problematic if the user adds an Image to the Dataset after MDV is opened (when column lengths etc. are calculated) but before a column of data is loaded.
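
One possible way around this, sketched below purely as an assumption, is to record the sorted Image IDs once when MDV is opened (for example in the saved config) and to build every column against that snapshot. The names `snapshot_image_order` and `load_column` are hypothetical.

```python
from typing import Dict, List, Optional


def snapshot_image_order(image_ids: List[int]) -> List[int]:
    """Fix the row order once, when the data is first opened in MDV."""
    return sorted(image_ids)


def load_column(
    snapshot: List[int],
    current_values: Dict[int, str],
    missing: Optional[str] = None,
) -> List[Optional[str]]:
    """Build a column against the snapshot, not the current Dataset contents.

    Images added since the snapshot are ignored; Images removed since the
    snapshot get the `missing` placeholder. The column length never changes.
    """
    return [current_values.get(image_id, missing) for image_id in snapshot]


# Row order recorded when MDV was opened
snapshot = snapshot_image_order([456, 123, 234])

# Later, Image 789 has been added to the Dataset before this column is loaded
names_now = {123: "a.tiff", 234: "b.tiff", 456: "c.tiff", 789: "d.tiff"}
column = load_column(snapshot, names_now)
```

The trade-off is the same as for copied Key-Value pairs: the snapshot can go stale, so new Images would only appear after the view is regenerated.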
Combining Images and ROIs
When we're dealing with ROIs, we may want to combine ROI data with Image data. E.g. an OMERO.table of ROIs combined with the Image Names or Dataset Names.
To load data for the "Dataset" column, we'd again need to know the order of IDs in the ROI column, load the Dataset Names (joined with Images and ROIs) and return the Dataset Names ordered according to the ROI IDs.
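
As a rough sketch of that join, assuming the ROI-to-Image and Image-to-Dataset links have already been queried from OMERO into dictionaries. All IDs, Dataset names and the function name below are invented for illustration.

```python
from typing import Dict, List, Optional


def dataset_column_in_roi_order(
    roi_ids: List[int],
    roi_to_image: Dict[int, int],
    image_to_dataset: Dict[int, str],
    missing: Optional[str] = None,
) -> List[Optional[str]]:
    """Return one Dataset Name per ROI row, in the ROI table's row order."""
    column = []
    for roi_id in roi_ids:
        image_id = roi_to_image.get(roi_id)  # ROI -> Image
        column.append(image_to_dataset.get(image_id, missing))  # Image -> Dataset
    return column


# ROI column of a hypothetical OMERO.table, in row order
roi_ids = [13, 11, 12]
roi_to_image = {11: 123, 12: 123, 13: 456}
image_to_dataset = {123: "Control", 456: "Treated"}

dataset_column = dataset_column_in_roi_order(roi_ids, roi_to_image, image_to_dataset)
```

Note that an Image can be in more than one Dataset in OMERO; this sketch assumes a single Dataset per Image.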
An alternative is possibly to use multiple tables in MDV: one to represent Images and another to represent ROIs?
OMERO-MDV config file
All the data sources for a single MDV view could be specified in a single OMERO.mdv JSON file, saved to OMERO as a FileAnnotation (same as OMERO.figure). This could also be used to save the MDV views that a user creates. When they open MDV initially, we could generate a starting view that just shows the table or a single plot and maybe an image viewer (as we currently do: https://user-images.githubusercontent.com/900055/245738983-969b86c9-f44b-479d-8abe-3b20e568cf5d.png).
Then we can open that OMERO.mdv-config JSON in MDV with `/omero-mdv/?dir=file/ID/` and OMERO.mdv would know how to generate the response to:
- `file/ID/datasources.json`
  - Column names would need to be unique (avoid duplicate column names from e.g. a CSV file and an OMERO.table or Key-Value pair)
  - A UI tool/form to allow a user to pick the OMERO data to include could also specify the columns (and names).
- `file/ID/state.json` - lists the views
- `file/ID/views.json` - loads views JSON directly from the config file
- `file/ID/datasource_name.json`
  - `{"col_name": [byte0, byte1]...}`
- `file/ID/datasource_name.b` with byte `Range start-end`
  - Use the config to determine the correct column from the `start-end` bytes.
E.g. `columns` in the config is like the one in `datasources.json`, but with info on which OMERO class/object the data comes from, and possibly the byte-length of the data in each, which can be used to generate the `datasource_name.json` response.
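
To illustrate how per-column byte lengths in the config could drive both responses, here is a minimal sketch. The config layout (`name`, `omero_source`, `bytes` keys), the byte lengths and the function names are all assumptions, not an existing OMERO.mdv format. It also assumes columns are laid out back to back and that the end offset is exclusive, which would need checking against what MDV actually expects.

```python
import json
from typing import Dict, List, Optional

# Hypothetical "columns" section of the OMERO.mdv config (3 rows of data)
config_columns = [
    {"name": "Image", "omero_source": "table:1", "bytes": 12},
    {"name": "Cell_Count", "omero_source": "table:1", "bytes": 12},
    {"name": "Drug", "omero_source": "map_annotation", "bytes": 3},
]


def column_byte_ranges(columns: List[dict]) -> Dict[str, List[int]]:
    """Lay the columns out back to back and return {name: [start, end]}."""
    ranges = {}
    offset = 0
    for col in columns:
        ranges[col["name"]] = [offset, offset + col["bytes"]]
        offset += col["bytes"]
    return ranges


def column_for_range(columns: List[dict], start: int, end: int) -> Optional[str]:
    """Work out which column a requested start-end byte range refers to."""
    for name, (col_start, col_end) in column_byte_ranges(columns).items():
        if (col_start, col_end) == (start, end):
            return name
    return None


# Body of the datasource_name.json response
datasource_json = json.dumps(column_byte_ranges(config_columns))

# A request for datasource_name.b with bytes 24-27 maps back to "Drug"
requested = column_for_range(config_columns, 24, 27)
```

Once the column name is known, the `omero_source` entry would tell OMERO.mdv where to load that column's data from.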
Saving Annotations
Annotations added in MDV could be saved as Key-Value pairs. I don't know if it's possible to edit existing data (Key-Value pairs, Tags, Ratings) and save to update them in OMERO?
Aims
OMERO Intro:
We have various hierarchies in OMERO:
Combining Images and ROIs
When we're dealing with ROIs, we may want to combine ROI data with Image data. E.g. an OMERO.table of ROIs combined with the Image Names or Dataset Names.
ROI table:
Dataset Names:
We could combine these into a single table, e.g. a single CSV table was created in https://youtu.be/X5EvQQGScYM?si=R_TuJWkgSN8jQiER&t=44
cc @xinaesthete @martinSergeant