Make default table explicit

tischi commented 3 years ago

@constantinpape @K-Meech What do you think about adding the default.tsv explicitly to the tables field of a SegmentationDisplay? This would alleviate the restriction on the table name and in addition the name tables would make more sense, since right now it should rather be additionalTables. I mention this because I also think the Java code would be easier if it simply had to look into the tables field to see which tables to load.

constantinpape commented 3 years ago

Yes, I think that makes sense. It's good to get rid of the implicit assumptions there. I can create an example project later, do you have any preference which one of the examples we have?

tischi commented 3 years ago

Cool! The zebrafish new-data-spec please as I am working on this anyway for the multiple table merging.

constantinpape commented 3 years ago

Ok, I will let you know when it's there.

constantinpape commented 3 years ago

I noticed one more inconsistency while working on this: for the grid view, we only have the tableData field, but not tables to specify the tables being loaded.

This has worked before, because default.tsv by default. But now it won't work, because all tables need to be specified.

I would suggest we also add the tables field here, but then we need to also support loading additional tables for the grid tables. I guess that's not a problem as it uses the same table logic?!

constantinpape commented 3 years ago

Done: https://github.com/mobie/zebrafish-lm-datasets/tree/new-table-spec (with adding to tables in the grid view as described above).

tischi commented 3 years ago

@constantinpape

We should still mention in the spec that the first table in the list is special.

It is special in a sense that (currently) the additional tables may have missing rows, but not additional rows.

In other words, the first table defines which image segments exist.

tischi commented 3 years ago

In addition, the first table MUST contain already all the required columns such as label_id, a.s.o

constantinpape commented 3 years ago

In addition, the first table MUST contain already all the required columns such as label_id, a.s.o

Yes, that is mentioned in the spec already, and I will add it to the python code that validates the tables.

It is special in a sense that (currently) the additional tables may have missing rows, but not additional rows.

Good point, I will add it to the spec and the python validation code.

tischi commented 3 years ago

noticed one more inconsistency while working on this: for the grid view, we only have the tableData field, but not tables to specify the tables being loaded.

I came across the same thing just now. Yes, also here we should be explicit. Could you please change it to the branch? I mean the tables field?
General question: @K-Meech how do we (i.e. the code) know what additional tables, which are not immediately loaded, exist?

tischi commented 3 years ago

I am confused now about the difference between tableData and tables...

constantinpape commented 3 years ago

1. I came across the same thing just now. Yes, also here we should be explicit. Could you please change it to the branch? I mean the `tables` field?

It's there already on the new-table-spec branch, see for example https://github.com/mobie/zebrafish-lm-datasets/blob/new-table-spec/data/membrane/dataset.json#L1841-L1843.

I am confused now about the difference between tableData and tables...

I think that one is actually very clear:

tableData: defines the table data format. Currently we only support tsv, so we have {"tableData": {"tsv":{"relativePath": "path/to/table_folder"}}}, where all tables for this segmentation or grid view are in "path/to/table" relative to the dataset folder.
tables lists the filenames of the tables that should be loaded for a view (either for the segmentation tables, which are listed in "segmentationDisplay:tables" or for the grid view tables, which are listed in "grid:tables"). These filenames must exist in the folder specified in tableData and the tables there must be in the correct format, i.e. have a 'label_id' column for the segmentation tables etc.

tischi commented 3 years ago

I see! I guess this would also fix the other issue that I opened here: https://github.com/mobie/zebrafish-lm-datasets/issues/5 ?!

K-Meech commented 3 years ago

General question: @K-Meech how do we (i.e. the code) know what additional tables, which are not immediately loaded, exist?

Short answer is - we don't! The code only knows about the folder where tables are stored (tableData) - but not which tables are present there (unless they are explicitly loaded in a view). Every time you 'load columns...' etc, it reads all the filenames from that folder, and lets the user pick one of them.

tischi commented 3 years ago

OK! We can see.... The advantage is that like this we don't need to duplicate this information anywhere.

constantinpape commented 3 years ago

@tischi let me know when I should update all example projects to the new table spec.

Also, we have an example for a grid view with additional tables now, see https://github.com/mobie/plankton-fibsem-project/blob/master/data/galdieria/dataset.json#L206-L209. The data is not on s3 yet (waiting for a bucket), but you can find it on the EMCF share at /g/emcf/pape/plankton-fibsem-project.

tischi commented 3 years ago

@constantinpape do we have it in the spec that there MUST be one column called "grid_id"?

constantinpape commented 3 years ago

@constantinpape do we have it in the spec that there MUST be one column called "grid_id"?

Yes, we have.

tischi commented 3 years ago

Works!

Regarding the name of the table: should I just use the name of the view?

tischi commented 3 years ago

Regarding the name of the table: should I just use the name of the view?

I think in theory there could be multiple such tables for one view, so that may not always work.

Does the gridSourceTransform have an optional name field that could be used to name the table?

constantinpape commented 3 years ago

Does the gridSourceTransform have an optional name field that could be used to name the table?

Currently we don't have a name field, but we can add an optional one and then you can use it to name the table (and give it some generic name if the name field is not present).

Also, we have the optional field names in there, https://github.com/mobie/mobie.github.io/blob/master/schema/view.schema.json#L350. Where the idea is that if given the names of the sources will be replaced with it. To avoid confusion I would like to rename it. Maybe sourceNames? or, even more explicit sourceNamesAfterTransform?

tischi commented 3 years ago

Currently we don't have a name field, but we can add an optional one and then you can use it to name the table (and give it some generic name if the name field is not present).

yes, please!

I was stumbling over the same!

sourceNamesAfterTransform is good I think!

tischi commented 3 years ago

I did like this for now:

                    if ( gridSourceTransformer.getName() == null )
                    {
                        gridSourceTransformer.setName( viewName );
                    }

OK?

tischi commented 3 years ago

Somehow in my Java code SourceTransformer has a "name" field already...

constantinpape commented 3 years ago

Somehow in my Java code SourceTransformer has a "name" field already...

Ok, I think we went back and forth if we should have it before. Anyway, I have now added the optional name and sourceNamesAfterTransform to all source transformations. I will create an example with a grid transform that actually uses these fields later so that you can test it.

constantinpape commented 3 years ago

I have updated all the projects (spec-v2 branches) now to the new table spec. I did not find a good example case for the sourcesAfterTransformation yet, I think the covid plate will be good for this, see #56.

tischi commented 3 years ago

Could we call the field sourceNamesAfterTransformation? Otherwise I find it a little confusing...

constantinpape commented 3 years ago

yes, it's called sourceNamesAfterTransformation already, see https://github.com/mobie/mobie.github.io/blob/master/schema/view.schema.json#L227. I was just imprecise in the comment above.

constantinpape commented 3 years ago

There is an example project with crop and sourceNamesAfterTransformations now, but there seem to be some issues: https://github.com/mobie/mobie-viewer-fiji/issues/313. I will close this issue for now.

mobie / mobie.github.io

Make default table explicit #54