Advanced grid views - Githubissues

constantinpape commented 3 years ago

I have started to work on the Covid-IF project, https://github.com/mobie/covid-if-project and added the image data for one plate now. This project uses "ome.zarr" as file format (i.e. without additional xmls); we still need to work on opening this in MoBIE (I will work on this with @KateMoreva).

The next step is to add the grid views to create the plate layout, and I want to clarify a few things before starting to work on this.

Nested grid views: To create the well - plate layout, we need to nest the grid views like this (simplified):

"nuclei": {
  "sourceTransforms": [
    {"grid": {"name": "wellA", "sources": ["nucleiA1", "nucleiA2"]}},
    {"grid": {"name": "wellB", "sources": ["nucleiB1", "nucleiB2"]}},
    {"grid": {"name": "plate", "sources": ["wellA", "wellB"]}}
  ]
}

I think we discussed this already, @tischi, but I want to double check that this is how we intent to use the grid trafo for the plate layout.

Reference other views: The grid views will be very long in the json, but we may want to specify a view that shows multiple channels, e.g. nuclei, serum and marker. For this it would be very helpful to have "meta" views (or composite views, combined views, ...) that just list other views that should all be loaded. Something like this:

"plate-view": {  # this view loads the 3 views specified in the view list (which have to be valid names of views)
  "viewList": ["nuclei", "serum", "marker"]
}

@tischi, @K-Meech do you think we could add something like this to the spec and support it in MoBIE Fiji?

K-Meech commented 3 years ago

@tischi, @K-Meech do you think we could add something like this to the spec and support it in MoBIE Fiji?

Sounds good! But I guess these 'meta' views would also need the fields isExclusive, viewerTransform and uiSelectionGroup (i.e. anything that's in a view but not in a display), otherwise it is unclear which view to take these values form?

constantinpape commented 3 years ago

Sounds good!

Awesome!

But I guess these 'meta' views would also need the fields isExclusive, viewerTransform and uiSelectionGroup (i.e. anything that's in a view but not in a display), otherwise it is unclear which view to take these values form?

Yes, that's a good point, in more detail:

isExclusive: yes, that is required. The question here is what to do if one of the views listed has isExclusive: true. This could either be ignored or result in a runtime error. I think I would prefer that it errors.
viewerTransform: here the question is what to do if the listed views have viewerTransforms. They could either be ignored or applied in order of listed views (i.e. the last one wins). If the "meta"-view (still open for a better name;)) has a viewerTransform it over-rides the earlier ones.
uiSelectionGroup: yes, that is required and the values for the other views are ignored.

constantinpape commented 3 years ago

Working on this more, I am not quite sure how to handle the grid tables here: we now need tables both for the well grids ("wellA" and "wellB" in the example above) and for the plate grid ("plate"). The rows of the well tables would correspond to images, the rows of the plate table to wells. It's easy for me to do this, but I am not sure how the viewer will handle it, @tischi. (Also in principle we allow for arbitrary many levels of nesting rn, which would complicate things even further.)

I also moved the discussion about meta views to #59, so that we can focus on the issues with complex grid views here.

constantinpape commented 3 years ago

I have added single source grid views like we discussed today for the covid dataset now to https://raw.githubusercontent.com/mobie/covid-if-project/master/data/20200406_164555_328/dataset.json. Just to recap:

Each well is a grid view and has its own table for the images of this well
The plate corresponds to a grid view where each position lists the images of the corresponding well
The plate view has a table where the columns correspond to wells

I think this will be enough to stress-test the grid view functionality, @tischi ;). But before we need to enable opening projects with ome.zarr data format.

tischi commented 3 years ago

Great! So, I'll be waiting for @KateMoreva for green-lights.

constantinpape commented 3 years ago

@tischi I can open the covid-if project thanks to the new mobie-beta release now. However, trying to open one of the grid views does not do much; the log window shows that something is loaded but no images are displayed:

Screenshot from 2021-07-22 21-09-53

To reproduce:

Open /g/kreshuk/pape/Work/mobie/covid-if-project with recent mobie-beta or master
Select one of the grid views from "images", e.g. "nuclei".

tischi commented 3 years ago

@constantinpape Is that something that I should test or does it have to do with one of the other ome-zarr image loader relates issues?

constantinpape commented 3 years ago

@constantinpape Is that something that I should test or does it have to do with one of the other ome-zarr image loader relates issues?

I don't think it's an issue with the loader, but rather with the (nested) grid view so it would be good if you can test it.

tischi commented 3 years ago

OK! Would it be a lot of work to make a minimal example bookmark, e.g with only two wells?

tischi commented 3 years ago

@KateMoreva Opening above dataset there is a lot of output in the console:

[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
No scale givenjava.lang.NullPointerException

Could you please have a look:

what the NullPointerException is about?
whether we could disable the output of the [shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format] strings?

KateMoreva commented 3 years ago

@KateMoreva Opening above dataset there is a lot of output in the console:

[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
[shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format]
No scale givenjava.lang.NullPointerException
No scale givenjava.lang.NullPointerException

Could you please have a look:

what the NullPointerException is about?
whether we could disable the output of the [shape, chunks, fill_value, dtype, filters, compressor, order, zarr_format] strings?

Sure. Point 2 is solved in https://github.com/mobie/mobie-viewer-fiji/pull/377. Thinking about problem in point 1.

constantinpape commented 3 years ago

@tischi @KateMoreva I have added a smaller grid view for testing now, it's images/test. (And I still run into the same issues when trying to open it.)

tischi commented 3 years ago

Thinking about problem in point 1.

Hi @KateMoreva, any progress about the null pointer exception? I can also look into it.

tischi commented 3 years ago

@constantinpape I am looking into it. Right now, the images/test bookmark is invalid:

Fetched 18 image source(s) in 3805 ms, using 8 thread(s).
Exception in thread "Thread-9" java.lang.UnsupportedOperationException: None of the sources specified at the first grid position could be found at the list of the sources that are to be transformed. Names of sources at first grid position: [nuclei_WellE06_0000]
    at de.embl.cba.mobie.transform.GridSourceTransformer.getReferenceSource(GridSourceTransformer.java:122)
    at de.embl.cba.mobie.transform.GridSourceTransformer.transform(GridSourceTransformer.java:40)

I can try to fix it myself but then I would need write access to the dataset.json file 😃

tischi commented 3 years ago

...I actually think the bookmark may be fine. Looking for the culprit...

tischi commented 3 years ago

@KateMoreva

The issue seems to be that the names of the sources are not correct (as in the other issue):

Here the source name is just data.

@constantinpape Are the source names are currently actually specified within the ome-zarr? Or do you we have to (currently) inject it from the dataset.json?

tischi commented 3 years ago

@constantinpape Maybe the issue is that the names within the zarr attributes is not set correctly?

serumIgG_WellH12_0008.ome.zarr (master) $ cat .zattrs 
{"multiscales":[{"axes":["y","x"],"datasets":[{"path":"s0"},{"path":"s1"},{"path":"s2"},{"path":"s3"}],
"name":"data","version":"0.3"}]}

tischi commented 3 years ago

👆 right now the name only is "data".

constantinpape commented 3 years ago

Ok, I will fix it; but I am not sure if this is even used by the OmeZarrReader right now to set the source name. Do you know, @KateMoreva?

constantinpape commented 3 years ago

It's fixed now:

$ cat data/20200406_164555_328/images/ome-zarr/marker_tophat_WellA01_0002.ome.zarr/.zattrs 
{"multiscales":[{"axes":["y","x"],"datasets":[{"path":"s0"},{"path":"s1"},{"path":"s2"},{"path":"s3"}],"name":"marker_tophat_WellA01_0002","version":"0.3"}]}

KateMoreva commented 3 years ago

Now the files have their correct names.

But instead of opening one file( ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA01_0002.ome.zarr) it is trying to read all nuclei_ files in the folder. And right now, I was able to open all of them, but it took a huge amount of time.

constantinpape commented 3 years ago

Now the files have their correct names.

:+1:

But instead of opening one file( ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA01_0002.ome.zarr) it is trying to read all nuclei_ files in the folder.

Yes, that is expected.

And right now, I was able to open all of them, but it took a huge amount of time.

Ok, that's good. This sounds like we need to work on the efficiency here. @tischi for reference, all this works fairly smooth in the plateviewer, so I don't see a fundamental reason for the bad performance here. Note that there is the smaller grid view images/test that only opens a subset of images (2 wells = 18 images).

KateMoreva commented 3 years ago

Now the files have their correct names.

But instead of opening one file( ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA01_0002.ome.zarr) it is trying to read all nuclei_ files in the folder.

Yes, that is expected.

And right now, I was able to open all of them, but it took a huge amount of time.

Ok, that's good. This sounds like we need to work on the efficiency here. @tischi for reference, all this works fairly smooth in the plateviewer, so I don't see a fundamental reason for the bad performance here. Note that there is the smaller grid view images/test that only opens a subset of images (2 wells = 18 images).

Maybe it was slight overloading of my machine, but it took almost 5 minutes to open all files.

tischi commented 3 years ago

But instead of opening one file (ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA010002.ome.zarr) it is trying to read all nuclei files in the folder.

Could you elaborate what that means and why this is expected?

tischi commented 3 years ago

OK, the test is opening, but not quite correct?! I guess there should be 2 x 3x3 images, isn't it?

KateMoreva commented 3 years ago

But instead of opening one file (ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA010002.ome.zarr) it is trying to read all nuclei files in the folder.

Could you elaborate what that means and why this is expected?

But instead of opening one file (ex: nuclei_WellA01_0002.ome.zarr for marker_tophat_WellA010002.ome.zarr) it is trying to read all nuclei files in the folder.

Could you elaborate what that means and why this is expected?

Because it is s bit unclear how many files we are going to open. In the UI, we have a section that is called images, but, by default, we see only one image. Thus a user who is not familiar with the dataset may think that images is plural because we have several possible images in the dropdown, but not that there are lots of mages behind variant nuclei.

tischi commented 3 years ago

Ok, I understand! That can be fixed by choosing better names for the bookmarks. Maybe "nuclei whole plate".

Anyway, it is working 🥳

Loading those two wells from home (very crappy internet) was quite decent as it only took a couple of seconds.

It is however expected that this may perform less good as the plateViewer, because now we are loading hundreds of sources instead into BDV instead of just one: in the plateViewer I wrapped all images of one channel into one Source, whereas here they are all added individually. Once we have something more stable I will send an example to Tobias for further optimization.

I guess the question now is what to do with all the tables... I will have a look whether I can combine the image tables into one table.

tischi commented 3 years ago

@constantinpape

    "test": {
      "isExclusive": false,   // <-- should be true?!
      "sourceDisplays": [
        {

constantinpape commented 3 years ago

Ok, I understand! That can be fixed by choosing better names for the bookmarks. Maybe "nuclei whole plate".

Yes, all of this work in progress; I will choose better names for the menus etc. Also, the default view should load the whole plate.

Anyway, it is working partying_face

Awesome.

It is however expected that this may perform less good as the plateViewer, because now we are loading hundreds of sources instead into BDV instead of just one: in the plateViewer I wrapped all images of one channel into one Source, whereas here they are all added individually. Once we have something more stable I will send an example to Tobias for further optimization.

Sounds good.

I guess the question now is what to do with all the tables... I will have a look whether I can combine the image tables into one table.

Yes, I think it would be good to combine all the image tables.

"test": {
  "isExclusive": false,   // <-- should be true?!
  "sourceDisplays": [
    {

I am not so sure here; this depends on how we want to handle switching on and off layers (e.g. all nuclei images). Do we do this by having individual grid views for all the layers or one grid view with all layers, where they can then be sub-selected?

tischi commented 3 years ago

@constantinpape

I am not so sure here; this depends on how we want to handle switching on and off layers (e.g. all nuclei images).

Good point, I did not think about that this is only one layer. Could you please make another test bookmark with another layer/channel for those two wells?

Regarding the tables

I think the easiest would be to save one table that contains the information for all the images and use this for just one per-image sourceAnnotationDisplay. I am also not sure this would have a big disadvantage?! Why would we need the tables of all the images in the individual wells? Maybe one use-case could be to only show a couple of wells in a specific bookmark. In fact one idea for the publication I had was to create a bookmark that only shows the positive and negative control wells! Now one could argue that it is annoying to always have to create a new per-image table for each bookmark where different wells are shown. I am thinking the following amendment to our spec: maybe we _only show table rows where the annotationid occurs in the sources map? Then one could reuse the big per-image table for all kind of bookmarks.

constantinpape commented 3 years ago

Good point, I did not think about that this is only one layer. Could you please make another test bookmark with another layer/channel for those two wells?

Sure, will do.

I am thinking the following amendment to our spec: maybe we _only show table rows where the annotationid occurs in the sources map? Then one could reuse the big per-image table for all kind of bookmarks.

Yes, I think that's a good idea. Should I change it in the covid-if project already?

tischi commented 3 years ago

Yes, I think that's a good idea. Should I change it in the covid-if project already?

Yes, let's go for it! Please ping me once you have such a bookmark and per-image table. I will change the code such that I can test it.

constantinpape commented 3 years ago

@tischi Done:

all wells use the same large table now
there are two bookmarks test-nuclei (2 wells with only nucleus channel) and test-full (2 wells with all 3 chanels)

tischi commented 3 years ago

@constantinpape

$ head /g/kreshuk/pape/Work/mobie/covid-if-project/data/20200406_164555_328/tables/wells/default.tsv
grid_id well    score   is_outlier      n_outlier_cells n_infected_cells        n_non_infected_cells    serum_background_intensity      serum_background_variance
0       A01     nan     1       12      11      603     3571.913        127.71118
1       A01     nan     1       20      7       627     3535.7656       126.48584

I think there MUST be a column called annotation_id?!

constantinpape commented 3 years ago

I think there MUST be a column called annotation_id?!

Indeed. I fixed it.

tischi commented 3 years ago

I think with the new spec when re-using the table in different bookmarks, it is probably a good idea to not just number the annotation_id but give it some meaningful name, like the well or image name?!

Because the numbering will only make sense for one specific view, isn't it?

tischi commented 3 years ago

@constantinpape There seem to be missing values in the first row of the table:

$ head /g/kreshuk/pape/Work/mobie/covid-if-project/data/20200406_164555_328/tables/plate/default.tsv
annotation_id   well    score   is_outlier      n_outlier_cells n_infected_cells        n_non_infected_cells    serum_background_intensity      serum_background_variance
0       A01             1                               4033.7056       162.06592
1       A02     1.049083519801452       0       85.0    3448.0  1068.0  3764.3457       144.84985
2       A03     1.5888386478982377      0       68.0    3608.0  1300.0  3925.8657       146.98267
3       A04     1.8062968151951488      0       107.0   3770.0  1185.0  3941.1394       164.29175

I am not sure, but I think we do not support empty strings but require something like NaN?

constantinpape commented 3 years ago

I think with the new spec when re-using the table in different bookmarks, it is probably a good idea to not just number the annotation_id but give it some meaningful name, like the well or image name?!

Because the numbering will only make sense for one specific view, isn't it?

Yes, that's true. The question is then if we even need the annotation_id then? If we only index subsets of the rows I don't see how using an integer helps us.

Maybe we should rename it to annotation_name and accept that it is always a string?

constantinpape commented 3 years ago

I am not sure, but I think we do not support empty strings but require something like NaN?

Hmm, that looks like pandas is doing something weird. Let me double check.

tischi commented 3 years ago

The question is then if we even need the annotation_id then? If we only index subsets of the rows I don't see how using an integer helps us.

We need something to find out which table row belongs to which sources:

          "sourceAnnotationDisplay": {
            "lut": "glasbey",
            "name": "E07",
            "opacity": 0.5,
            "sources": {
              "0": [
                "nuclei_WellE07_0000"
              ],
              "1": [
                "nuclei_WellE07_0001"
              ],
              "2": [
                "nuclei_WellE07_0002"
              ],
              "3": [
                "nuclei_WellE07_0003"
              ],
              "4": [
                "nuclei_WellE07_0004"
              ],
              "5": [
                "nuclei_WellE07_0005"
              ],
              "6": [
                "nuclei_WellE07_0006"
              ],
              "7": [
                "nuclei_WellE07_0007"
              ],
              "8": [
                "nuclei_WellE07_0008"
              ]
            },
            "tableData": {
              "tsv": {
                "relativePath": "tables/wells"
              }
            },

Right now we are using the key of the sources map for this. Thus this key must be findable in the table...

Hmm, that looks like pandas is doing something weird. Let me double check.

I just checked, the error is not thrown because of the "" cells, but in some other row, which probably really has less than 9 entries. If you don't find the error I can find out which row it is...

constantinpape commented 3 years ago

I think I figured out the issue with the table and it's because of the nan's and pandas, by default, writing them out empty. I know how to fix it, but I need to specify what value to set. What can you pass on the java side there? nan, NaN, something else?

tischi commented 3 years ago

I still think there should be one line with less than 9 items after the .split("\t")?!

Regarding the parsing

Double

public static Double parseDouble( String cell )
    {
        if ( cell.toLowerCase().equals( "nan" ) || cell.equals( "" ) )
            return Double.NaN;
        else if ( cell.toLowerCase().equals( "inf" ) )
            return Double.POSITIVE_INFINITY;
        else if ( cell.toLowerCase().equals( "-inf" ) )
            return Double.NEGATIVE_INFINITY;
        else
            return Double.parseDouble( cell );
    }

String

We use None for nothing.

tischi commented 3 years ago

☝️ in fact it looks like I am parsing "" as Double.NaN, so could you please check if there maybe is one row with less than nine items?

constantinpape commented 3 years ago

point_up in fact it looks like I am parsing "" as Double.NaN, so could you please check if there maybe is one row with less than nine items?

I think that something is going wrong in your code if there are multiple nans that are mapped to "". I replaced all of them with nan now, can you please check again? (writing less than nine items should not be possible with pandas, because it writes a proper table).

tischi commented 3 years ago

It works now, but it opens still two tables for the images.... Did you update the bookmark?

constantinpape commented 3 years ago

It works now

Ok!

, but it opens still two tables for the images.... Did you update the bookmark?

I am working on it; in order to also fix the issue with the correct keys for the annotation_id now. I will ping you when I think it should work.

constantinpape commented 3 years ago

@tischi I think everything should be correct now. I am now using the site name <WELL_NAME>-000X as annotation_id for the images and the well name for the wells. Please give it another try.

tischi commented 3 years ago

@constantinpape

It works!

And I think we should now combine the two image based sourceAnnotationDisplays into one, ok?

constantinpape commented 3 years ago

And I think we should now combine the two image based sourceAnnotationDisplays into one, ok?

What do you mean by this?

tischi commented 3 years ago

Right now we have:

       "sourceAnnotationDisplay": {
            "lut": "glasbey",
            "name": "E06",
            "opacity": 0.5,
            "sources": {
              "E06-0000": [
                "nuclei_WellE06_0000"
              ],
              "E06-0001": [
                "nuclei_WellE06_0001"
              ],
              "E06-0002": [
                "nuclei_WellE06_0002"
              ],
              "E06-0003": [
                "nuclei_WellE06_0003"
              ],
              "E06-0004": [
                "nuclei_WellE06_0004"
              ],
              "E06-0005": [
                "nuclei_WellE06_0005"
              ],
              "E06-0006": [
                "nuclei_WellE06_0006"
              ],
              "E06-0007": [
                "nuclei_WellE06_0007"
              ],
              "E06-0008": [
                "nuclei_WellE06_0008"
              ]
            },
            "tableData": {
              "tsv": {
                "relativePath": "tables/wells"
              }
            },
            "tables": [
              "default.tsv"

and

         "sourceAnnotationDisplay": {
            "lut": "glasbey",
            "name": "E07",
            "opacity": 0.5,
            "sources": {
              "E07-0000": [
                "nuclei_WellE07_0000"
              ],
              "E07-0001": [
                "nuclei_WellE07_0001"
              ],
              "E07-0002": [
                "nuclei_WellE07_0002"
              ],
              "E07-0003": [
                "nuclei_WellE07_0003"
              ],
              "E07-0004": [
                "nuclei_WellE07_0004"
              ],
              "E07-0005": [
                "nuclei_WellE07_0005"
              ],
              "E07-0006": [
                "nuclei_WellE07_0006"
              ],
              "E07-0007": [
                "nuclei_WellE07_0007"
              ],
              "E07-0008": [
                "nuclei_WellE07_0008"
              ]
            },
            "tableData": {
              "tsv": {
                "relativePath": "tables/wells"

I think we just combine them into one such that there is only one per-image (=site) table.

I also think there is a misnomer:

"relativePath": "tables/wells" === should be ===> "relativePath": "tables/sites" "relativePath": "tables/plate" === should be ===> "relativePath": "tables/wells"

Note: whenever I said "per-image" I meant "site".

mobie / mobie.github.io

Advanced grid views #58

Regarding the tables

Double

String