mobie / mobie-viewer-fiji

BSD 2-Clause "Simplified" License
33 stars 13 forks source link

Scatter plots for grid views #343

Open K-Meech opened 3 years ago

K-Meech commented 3 years ago

Grid views currently do not support scatter plots. This would be very useful for data with a lot of whole-image related measurements.

tischi commented 3 years ago

@K-Meech @constantinpape

I had a look and I could hack something together but I think it would be better if we spent some time on this to make it proper.

The point is that I think that the GridSourceTransformer and the SegmentationSourceDisplay will have a lot of common fields (both in the java code and in the JSON spec). Essentially everything that has to do with the table, the scatterplot and "segements/grid position" selection and coloring modes and so on will be identical.

Thus, I feel on the Java side those two classes should probably both extend the same class, maybe something like ImageRegionDisplay, which could contain all the common fields.

@constantinpape how are things looking on the JSON spec side in this regard? Is there a way to "inherit specifications" or will the GridSourceTransformer and SegmentationSourceDisplay simply have a lot of common fields?

constantinpape commented 3 years ago

Having a common base class sounds like the way to go.

@constantinpape how are things looking on the JSON spec side in this regard? Is there a way to "inherit specifications" or will the GridSourceTransformer and SegmentationSourceDisplay simply have a lot of common fields?

jsonschema does not directly support inheritance, but it's possible to have references. So we could have external definitions of all the shared attributes and then just reference them in the grid transformation and segmentation display spec to avoid duplication.

tischi commented 3 years ago

Writing this, both in Java and in the spec it feels that the GridSourceTransformer is doing a bit too much right now, since it is both transforming the sources to land at the right grid positions, but is also does all the table stuff. I wonder whether we could separate these two funtionalities on the spec side? I think it comes back to the discussion that we had in zoom: Maybe we can somehow have a GridSourceDisplay in the JSON, which can have the table and scatterplot stuff and would live in parallel to the GridSourceTransformer?

tischi commented 3 years ago

...I feel if we find a way to do this it would probably also fix the issue that we discussed with the many image tables for a plate view...

K-Meech commented 3 years ago

I agree, having a GridSourceDisplay makes a lot of sense. In the end, it's making a new row in the MoBIE GUI for the grid, so it seems a lot more consistent to have this listed as a display (rather than only a transform)

I guess you could, rather than a sources field, have a gridTransform field where you give the name of the grid transform? Then the transform could stay where it is currently?

constantinpape commented 3 years ago

I agree that it makes sense to separate the gridTransform and gridTable / gridDisplay functionality more, but I am not so sure what the best option is:

tischi commented 3 years ago

I have a slightly a different proposal, which is more flexible as it does not tie it to a GridTransformer.

public class SourceAnnotationDisplay
{
    // Serialization
    protected String name;
    protected List< List< String > > sources; // DIFFERENT
    protected Map< TableDataFormat, StorageLocation > tableData; // DIFFERENT
    protected List< String > tables;
    protected String lut = ColoringLuts.GLASBEY;
    protected String colorByColumn;
    protected double opacity = 1.0;
    protected Double[] valueLimits = new Double[]{ null, null };
    protected List< String > selectedSegmentIds; // this would correspond to the "region names", I guess...
    protected boolean showScatterPlot = false;
    protected String[] scatterPlotAxes; // DIFFERENT (no default)

Where (I think as in the GridTransfromer),

  1. the table rows correspond to the indices of the List< List< String > > sources
  2. the annotation interval is drawn such that is covers the union of the sources at each "position", i.e. the inner list
  3. the tables must have one column called name (corresponding to the label_id).

This is almost identical to the SegmentationSourceDisplay, apart from the places that I marked as different:

public class SegmentationSourceDisplay
{
    // Serialization
    protected String name;
    protected double opacity = 1.0;
    protected List< String > sources;  // DIFFERENT
        protected List< String > tables;
    protected String lut = ColoringLuts.GLASBEY;
    protected String colorByColumn;
    protected Double[] valueLimits = new Double[]{ null, null };
    protected List< String > selectedSegmentIds;
    protected boolean showSelectedSegmentsIn3d = false; // DIFFERENT
    protected boolean showScatterPlot = false;
    protected String[] scatterPlotAxes = new String[]{ Constants.ANCHOR_X, Constants.ANCHOR_Y };

I am not sure if the name selectedSegmentIds is still the best if we are using it for both cases. Maybe something like selectedTableRowIds would be better?!

tischi commented 3 years ago

...oh, and I already know a beautiful use-case for this: the tomograms in the yeast-CLEM project, which are not in a grid but it could be still very nice to browse them via a interactive table 😃

tischi commented 3 years ago

Alternatively we could also think about promoting grid to a source by adding the option {"grid": {...}} as a new source type

We could do this in addition to my proposal; as it would not change anything but just be another way to specify the protected List< List< String > > sources in the proposed SourceAnnotationDisplay.

tischi commented 3 years ago

Regarding the selectedTableRowIds: maybe we should be more explicit and require that there actually is a column with that string? Right now we are creating the ID on the fly in Java combining multiple columns. This works, but I think it may actually complicate things both on the spec as well as on the Java side.

constantinpape commented 3 years ago

Ok, we decided to implement https://github.com/mobie/mobie-viewer-fiji/issues/343#issuecomment-876370201. This way the SourceAnnotationDisplay is independent from the grid transform, the grid transform just happens to arrange the same sources. SegmentationSourceDisplay and SourceAnnotationDisplay can share the same base class, but we decided to have selectedSegmentIds and selectedSourceAnnotationIds as separate fields in the inhereriting classes, so that we don't have to change the current logic for the segmentation tables.

Also, we will not introduce grid sources (for now, we can consider doing this later; I will open another issue about this).

I will update the spec and prepare an example in the zebrafish project.

K-Meech commented 3 years ago

Ok - great! Making it more generic is a great plan, as there are plenty of use cases for annotating / navigating sources that don't necessarily need to be arranged in a grid :)

I'm a bit unsure how these tables will look. Would they have a column for annotationId? Or they are just being matched based on the order of the rows being equal to order of the items in List< List< String > > sources?

constantinpape commented 3 years ago

@K-Meech that's a good point, we didn't fully discuss this. I would just add a column sourceAnnotationId (to be consistent with the name) and than indeed have the same order as in the outer list of sources.

We would also need to discuss which format selectedSourceAnnotationId has. I guess timepoint-sourceAnnotationId should be sufficient for uniqueness?!

What do you think, @tischi?

tischi commented 3 years ago
  1. I think it should be source_annotation_id to be consistent with the label_id?
  2. yes, I think timepoint - source_annotation_id should do the job
  3. @K-Meech currently, the table rows MUST be in the same order as the List< List< String > > sources, as the indexing would be used for the matching. We could think about changing this to a Map< String, List< String > > sourceAnnotationIdToSources to make things more explicit and not rely on the ordering. I think I'd like this (even though if an annotation position just contains single source it feels a bit annoying to have to give it another name, which probably would be just the same name as the source name itself).
tischi commented 3 years ago

We could also consider this for the GridTransformer: LinkedHashMap< String, List< String > > gridIdToSources, as this would give each grid position a unique name that we could use if we ever want to refer to a grid position as a "source". I am writing LinkedHashMap here, because the order to the items in the Map would still be important here as they would determine the Grid position if no specific grid position is given (if you just say Map in Java you cannot be sure in which order you will get the keys). For the case of providing specific grid positions we would then also have to change this to Map< String, int[] > gridIdToGrindIndices.

K-Meech commented 3 years ago

@tischi I think it'd be nice to switch to Map<String, List<String>> - I think this is easier to understand + means assembling tables is a lot easier (no row order constraints!)

source_annotation_id sounds good

For the GridTransformer - maybe wait until there is a specific use case for this? I don't think it's super vital for now to change it.

constantinpape commented 3 years ago
  • I think it should be source_annotation_id to be consistent with the label_id?

  • yes, I think timepoint - source_annotation_id should do the job

Ok!

LinkedHashMap< String, List< String > > gridIdToSources

If you prefer a (linked) map for all this (sources in sourceAnnotationDisplay, gridIdToSources, gridIdToGridIndices) we should change this all in one go, because I need to change the spec for the grid view now anyways. Would be fine with me, I don't have any preferences here.

tischi commented 3 years ago

we should change this all in one go

I agree.

Would be fine with me, I don't have any preferences here.

I don't have a very strong preference for the GridTransformer either... However, I think I would vote for making it a Map as well, because then for the HTM we could give names such as wellA01 as gridIds which may make human inspection of the JSON easier.

=> All Maps, please (my opinion).

constantinpape commented 3 years ago

=> All Maps, please (my opinion).

Ok, I will change it for all three then, will ping you when the example is there.

tischi commented 3 years ago

@constantinpape How should we name the fields?

public class SourceAnnotationDisplay extends TableDisplay< AnnotatedIntervalTableRow >
{
    // Serialization
    protected Map< String, List< String > > annotatedSources; // or just "sources"?
    protected List< String > selectedSourceAnnotationIds;
tischi commented 3 years ago

I am thinking now that just sources is probably better.

constantinpape commented 3 years ago

I am thinking now that just sources is probably better.

Yes, let's just keep it as sources

tischi commented 3 years ago

@constantinpape

Would it be OK for you to simply call it selectedAnnotationIds instead of selectedSourceAnnotationIds?

protected Map< String, List< String > > sources; // annotationId to sources
protected List< String > selectedAnnotationIds;
protected Map< TableDataFormat, StorageLocation > tableData;
constantinpape commented 3 years ago

Would it be OK for you to simply call it selectedAnnotationIds instead of selectedSourceAnnotationIds?

Yes, no problem.

I have the changes to the spec almost ready, I only need to create the example data; will do that later.

constantinpape commented 3 years ago

LinkedHashMap< String, List< String > > gridIdToSources

@tischi I just realized that you should not rely on the sorting here! I save the jsons with sort_keys=True, because this makes them more readable. However, this means that the keys of the grid sources will be string sorted and not (as would be correct) int sorted.

constantinpape commented 3 years ago

@tischi here's the example project: https://github.com/mobie/zebrafish-lm-datasets/tree/new-grid-spec

tischi commented 3 years ago

Thanks! I guess the sourceNamesAfterTransform should then also be a Map? I don't think you have it in the example...

tischi commented 3 years ago

@constantinpape the column names in tables for the sourceAnnotationDisplay are still grid_id and membrane; I think there should be columns for annotation_id and timepoint, right? (we originally said source_annotation_id but I think just annotation_id is better...)

constantinpape commented 3 years ago

the column names in tables for the sourceAnnotationDisplay are still grid_id

Yes, I forgot to update the tables. The first column is now called annotation_id; I pushed the change to the branch already.

and timepoint, right?

I am not sure if there should be a timepoint column: for the segmentation tables we also don't have them in the table and you add this column when loading the table in MoBIE.

If possible I would keep the same loging for the source annotation tables; but if you need the timepoint column for technical reasons here I can also add it.

tischi commented 3 years ago

in other words: the timepoint column is optional, right?

constantinpape commented 3 years ago

in other words: the timepoint column is optional, right?

Yes, indeed it's optional and only there if we actually have multiple timepoints. I just checked and that's how we also handle it for the segmentation tables, see https://raw.githubusercontent.com/mobie/arabidopsis-root-lm-datasets/spec-v2/data/arabidopsis-root/tables/lm-cells/default.tsv.

constantinpape commented 3 years ago

I think this one is fairly high priority because we can use it for some of the planned figures. @K-Meech is this something you could look into?

K-Meech commented 3 years ago

Yes - I can take a look :)