Settle axes discussion for now

bioimage-io / spec-bioimage-io

Specification for the bioimage.io model description file.

https://bioimage-io.github.io/spec-bioimage-io/

MIT License

18 stars 17 forks source link

Settle axes discussion for now #70

Closed FynnBe closed 7 months ago

FynnBe commented 3 years ago

Polling does not seem to work in the PR... Maybe here we can vote on the proposal in https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618 and discuss any changes of this proposal specifically. Please open separate issues for completely different proposals.

Proposal:

We do add a few (pseudo) dimensions (i, f, d) to allow for straight forward description of almost all i/o data, which serves as a more detailed visualization hint (this list of c like axes may be extended in the future, but we will try to keep it short).

We agree not to duplicate any axis letter (chances are those dimensions can very well be flattened, which makes the model more accessible to a wider range of consumer software).

We emphasize the close relationship of channel like (pseudo) dimensions in the documentation to encourage model devs to limit their use to one of them, which allows to map those to the c dimension (e.g. in software that can 'only' handle 6d data) and 'keep it simple', while allowing higher dimensional models (max 9d)

As this turned out more complex than initially assumed, we add a key axes_description with a list of human readable short (single line + char limit) descriptions (maybe nested for 'columns'), that consumer software could potentially display. (alo: maybe fix description for b). Details are left for further discussion.

Left out for further discussion: Add keys like axes_unit/axes_resolution to further clarify the nature of the i/o data.

edited: poll options were too long and cut off

k-dominik commented 3 years ago

I guess you don't want comments here again?

FynnBe commented 3 years ago

I guess you don't want comments here again?

no, of course comments are fine (I updated the poll, its options were cut off). Anything going beyond the scope of comments and minor changes should go to another dedicated issue, though.

constantinpape commented 3 years ago

I don't think we should allow for >6d (i.e. btczyx) data. Otherwise we need to start to reason about flattening on the spec level, which just introduces unnecessary complexity. Further, if we introduce the additional axes_description I don't really see the need for anything but c. The difference between c, i, f would be purely semantic anyways, so one might as well check the axes description for it.

oeway commented 3 years ago

Thanks @FynnBe for the summary, I am fine with the proposal and I am thinking an alternative solution to add c, i, f is to make the axes key optional non-image tensor input or output.

It totally make sense for image tensor to stay with btczyx, but it becomes problematic if we force non-image tensor to use these 6 types of axes.

@constantinpape Let me give you some examples where c won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can use c for the the three coordinate, but for along the location instances, we won't find a letter from btczyz that fit well. We will likely need something like i and define axes=bic. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example, axes=bxy so other software can also read it and use an image to represent the xyz coordinates.

There are other cases which c won't be enough, for example I am working on a model which takes an image and a protein sequence as input. For the protein sequence input, we use a 1-hot encoding to encode the 20 amino acid, so a sequence will be come a 2dimensional table. In that case, we will likely need to use c for the encoding and i for the instance of each amino acid letter.

More generally, for non-image data which has more than 3 channels, we will have the situation that b, c, i won't be enough and we cannot use t, x, y, z.

FynnBe commented 3 years ago

I don't think we should allow for >6d (i.e. btczyx) data. Otherwise we need to start to reason about flattening on the spec level, which just introduces unnecessary complexity. Further, if we introduce the additional axes_description I don't really see the need for anything but c. The difference between c, i, f would be purely semantic anyways, so one might as well check the axes description for it.

some options following this only-c-idea for tabular data like the 'big bug table' from https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618 you would recommend to (ab)use t,z,y or x? I suppose axes_description would make it clear and for visualization we could:

agree to interpret a single spatial dimensions as specific visualization hints, e.g. lonely x: = i (index). 1d image -> table!

# example with big table 
outputs:
- name: bug_table    # table with individual bugs
axes: bxc  # single `x` means index (-> table axes may not contain `y` or `z`)
shape: [1, 100, 2]  # btw, we don't allow for an unknown size (yet), which would be important for tables, there are not 100 bugs on every page of code after all.
axes_description: ["batch", "bug", ["size", "age"]]

make visualization hints more explicit and add axes_type or similar(, where only certain keys are allowed for certain axes keys, e.g. b in ["batch"], x in ["x-axis", "index"], c in ["rgb", "image channel", "feature"]).

 # example with big table 
outputs:
  - name: bug_table    # table with individual bugs
    axes: bxc   
    axes_type: ["batch", "index", "feature"]
    shape: [1, 100, 2]  # btw, we don't allow for an unknown size (yet), which would be important for tables, there are not 100 bugs on every page of code after all.
    axes_description: ["batch", "bug", ["size", "age"]]

channel_type instead of axes_type: like 2, but only c can have multiple meanings, but c may occur more than once.
not allow for wide tables and limit it to a fixed (small) number of columns, only using the # example with fixed columns/ small table from https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618:

outputs:
  # table with individual bugs
  - name: bug_size  # first column
    axes: bf
    shape: [1, 100]    
    ...
  - name: bug_age  # second column
    axes: bf
    shape: [1, 100] 
    ...

FynnBe commented 3 years ago

@constantinpape Let me give you some examples where c won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can use c for the the three coordinate, but for along the location instances, we won't find a letter from btczyz that fit well. We will likely need something like i and define axes=bic. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example, axes=bxy so other software can also read it and use an image to represent the xyz coordinates.

There are other cases which c won't be enough, for example I am working on a model which takes an image and a protein sequence as input. For the protein sequence input, we use a 1-hot encoding to encode the 20 amino acid, so a sequence will be come a 2dimensional table. In that case, we will likely need to use c for the encoding and i for the instance of each amino acid letter.

both of these cases boil down to a table. I see that for 20 columns the above option 4. is already too tedious and unintuitive, but options 1-3 could match your use cases here.

More generally, for non-image data which has more than 3 channels, we will have the situation that b, c, i won't be enough and we cannot use t, x, y, z.

I see that we need a bit more flexibility than the hard btczyx image tensor, but maybe it's best to restrict ourselves to the minimal extension that we really need right now and wait for a concrete use case to expand our definitions further. I am sure the ngff discussions will bring more insights over time, so let's aim at an intermediate solution that works for all models currently in and in-line for the model zoo.

constantinpape commented 3 years ago

@constantinpape Let me give you some examples where c won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can use c for the the three coordinate, but for along the location instances, we won't find a letter from btczyz that fit well. We will likely need something like i and define axes=bic. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example, axes=bxy so other software can also read it and use an image to represent the xyz coordinates.

Ok, I understand the use-case for i now. It basically means that you have a list index for a dimension where the size of this dimension might not be known a priori, e.g. because you can have different number of instances in an image. I agree that it makes sense to introduce this as a different letter, because it is semantically very different from a channel (where you know exactly the size of the channel dimension). However, note that this implies that b might not be naively applicable, because you might have different numbers of instances per sample, so you get a ragged array instead of a tensor. Could be solved by padding or restricting to b=1.

More generally, for non-image data which has more than 3 channels, we will have the situation that b, c, i won't be enough and we cannot use t, x, y, z.

I disagree here though: there is nothing fundamentally different of stacking 3 or 20 channels. For displaying purposes, channels would be treated as a stack by default (i.e. each channel is shown individually). For other display modes, e.g. treating 3 channels as rgb, hints could be stored in the axes_descriptors. There are so many different use cases here that it doesn't make much sense to standardize it further; hence text in axes_descriptors are not restricted and a consumer can use it to store its custom visualisation syntax if necessary. This can cover a wide variety of use-cases, e.g. also multi-dimensional axes can be flattened into c without the need to formalize all this in the spec.

Again, I see the need for i now, because it covers the use-case where data cannot be simply stacked.

tomburke-rse commented 3 years ago

I support @FynnBe's last comment. I think each of our software solutions can (and should) support the extra axes in the future for non-image data to visualize for the user. IF we encounter models that need even more dimensions, we can do one of these things:

Say that we only officially support 9D, but you can add more dimensions which may not be supported by all software solutions
Live with the fact that we will have models that work only in one (or none?) of our software because it was tailored for that.
Make it a hard constraint, declining all models >9D

But this is again a discussion we can do when we encounter such models. Now we have a clear example thanks to Wei's work.

constantinpape commented 3 years ago

Ok, there is some simultaneous posting going on :D.

To summarize my stance: I would vote for option 1 in the comment from @FynnBe (https://github.com/bioimage-io/spec-bioimage-io/issues/70#issuecomment-823152000), but I do agree with @oeway that we should introduce a new axis letter i. This is semantically different from c, because it describes "list-like" dimensions, for which the size of the dimension might not be known a priori.

Similar to @tomburke-rse, I think we should only introduce further letters if a use case that does not fit this spec at all (i.e. that is impossible or extremely inconvenient to express. It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.)

Edit: Also, I hope that in the long-term OME-NGFF will define a more general standard for image metadata that we will be able to use. This will likely still take some time, and the process will involve many stakeholders. So we need a workable spec that fits our needs, but we should not try to come up with a general purpose image metadata spec independently.

oeway commented 3 years ago

Hi @constantinpape and @FynnBe I think we need to clarify the difference with different channels vs different axes a bit more. Considering the following example for the single molecule localisation or tracking model where it produce a list of xyz coordinates.	x	y
1.2	100	200
2.0	2.9	100
100	200	300

For the horizontal axis, we may use the letter c or f, and for the vertical dimension, we are likely need i instead of one of bctxyz, because the semantical meaning is about each individual molecule (not because its size is undefined).

In this case, we will define the output tensor as a 3D axes as bic (deeploco for example allows predicting 256 molecules maximum, so the i axes is padded to have length 256)

I disagree here though: there is nothing fundamentally different of stacking 3 or 20 channels. For displaying purposes, channels would be treated as a stack by default (i.e. each channel is shown individually).

Well, I think it won't work unless we force everyone to flatten axes that does not fall into our defined axes. For example, in the above example, if we now say the model not only output a table of x, y, z coordinate, but it will output an additional table that contains the uncertainty scores for each predicted numbers in the first table. Together with the batch dimension, we will now have a 4 dimensional output bic plus another channel-like axis with length 2 (the first one is real coordinates, and the second one is error), the what letter should we use for the 4th dimension? We cannot use c again because we do not allow duplication @FynnBe @constantinpape

Similar to @tomburke-rse, I think we should only introduce further letters if a use case that does not fit this spec at all (i.e. that is impossible or extremely inconvenient to express. It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.)

I agree with introducing more letters later, but I am thinking that we should provide a more gentle way for those future letters to land. I mean if we do not allow custom axes letter or leave the axes key optional, models with those axes cannot enter the model zoo (because they won't pass the validator). I think we need a way to allow those models to enter the model zoo first, then offer the chance to discuss and absorb the new axis definition.

Also, I hope that in the long-term OME-NGFF will define a more general standard for image metadata that we will be able to use.

That's good! However, I think because OME-NGFF is a image format, there won't be much non-image axes involved, no?

constantinpape commented 3 years ago

In this case, we will define the output tensor as a 3D axes as bic (deeploco for example allows predicting 256 molecules maximum, so the i axes is padded to have length 256)

I think bic is a good solution here. If you can pad here because you know the maximal size of i in this case this is good; but there might very well be cases for i were this is not possible. E.g. a model that can do differentiable instance segmentation and returns properties for each instance in an image. But we don't need to discuss that in too much detail now; I just wanted to mention that this is a potential concern for the i dimension.

Well, I think it won't work unless we force everyone to flatten axes that does not fall into our defined axes. For example, in the above example, if we now say the model not only output a table of x, y, z coordinate, but it will output an additional table that contains the uncertainty scores for each predicted numbers in the first table.

You can either flatten OR return a second tensor for the uncertainty scores. So I think this fits in the current proposal without big issues. I want to stress this again: It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.

I agree with introducing more letters later, but I am thinking that we should provide a more gentle way for those future letters to land. I mean if we do not allow custom axes letter or leave the axes key optional, models with those axes cannot enter the model zoo (because they won't pass the validator). I think we need a way to allow those models to enter the model zoo first, then offer the chance to discuss and absorb the new axis definition.

I think we should avoid adding new letters as much as possible and only do this if there is really no other way. Hence, I think it's a better strategy to discourage this and encourage developers to use the current specification.

That's good! However, I think because OME-NGFF is a image format, there won't be much non-image axes involved, no?

There are already several use-cases under discussion that are similar to the issues here. But yes, it's primarily an image data format. But we are also called bioimage.io ;).

Sorry for being so strict here, but I think the whole discussion revolves around one of the core issues I have with the model zoo direction: we are still in an early beta stage and don't have any significant outside user adoption. That's fine, these things take time. But our priority should be making it work for the current scope of serving image data related models. And I think if we start to introduce very significant changes to the spec now to accomodate more custom use-cases we will either be back to 2 months or more of spec discussions or introduce changes that we don't really understand and that are not described properly. I am not arguing for completely freezing the spec as is and not thinking about custom use-cases at all. But I think right now the focus should be much more on stability.

oeway commented 3 years ago

@FynnBe Thanks for listing the options for axis_description. I am thinking can we make it a list of dictionary to contain more information about the axis? Basically extend your option 1 with a time dimension:

# example with bug table 
outputs:
 - name: bug_table    # table with individual bugs
   axes: btic 
   shape: [1, null, 2] # use null if we don't know the length
   axes_description:
       - name: "batch"
         type: "ordinal"
       - name: "time"
         description: "frame number in the movie"
         type: "ordinal"
         step: 3   #  for ordinal axis, we can define the step/interval, for `x`, `y`, it can be pixel size
         unit: "second" # the unit of time, for `x`, `y`, it can be `pixels` or `um`
       - name: "index"
         description: "index of the bugs"
         type: "ordinal"
         step: 1
       - name: "features"
         description: "features for each detected bug, including size and age"
         type: "nominal"
         labels: ["size", "age"]
         units: ["meter", "year"] # we have really big and old bugs!

In the above example, here are the changes:

the x to i because I think we should not change the meaning of x since they are completely different.
axes_description becomes a dictionary where we can add more meta data
each axis should have a type, either ordinal (e.g. btxyz) or nominal (e.g. c).
each ordinal axis can set step and unit
each nominal axis can set labels for its column name and units for the unit for each column

These are inspired by Vega which I recommend to take a look. It defines very thorough grammar for interactive data visualization. It is highly relevant for us especially for non-image data. We can even directly connect our tensor directly to vega and allow interactive visualisation of tensors. For axes, they have a table of properties which we can take inspirations from.

Another idea is that if we have the axes_description field, shouldn't we simply absorb the single letter axes, and spell out the letters and add them as a keyword under axes_description?

oeway commented 3 years ago

You can either flatten OR return a second tensor for the uncertainty scores. So I think this fits in the current proposal without big issues. I want to stress this again: It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.

I think we all agree that we don't want a lot of changes, but an easier strategy is to propose a fallback plan for cases that does not fit to our current definition. I would simply add "if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter". Like that we don't have to exclude those non-standard models. And we provide a path to include new axis letters. Keep in mind that making changes to the model architecture is much more difficult, especially when the user already have a published model and they simply want to add an additional model.yaml to add to our model zoo. As a common strategy, we should avoid asking the developers to change their model in order to fit our spec, the spec should be more inclusive. We will suffer from spec stability issue only if we force an incomplete list and do not allow custom axes.

constantinpape commented 3 years ago

"if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter"

But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes. Again, given that we don't have any real outside contributions yet, I don't think we should adjust the spec to match any anticipate non-custom model. If we see that this is a real problem at some point we can still change it but for now I think it's much better to be strict here.

I am thinking can we make it a list of dictionary to contain more information about the axis?

I really like that proposal. I would keep the axes and axes_description separate though, because otherwise we break compatibility with the current version of the spec.

tomburke-rse commented 3 years ago

"if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter"

But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes. Again, given that we don't have any real outside contributions yet, I don't think we should adjust the spec to match any anticipate non-custom model. If we see that this is a real problem at some point we can still change it but for now I think it's much better to be strict here.

That's why we have to define very clearly what our axes are for and what it means if you use anything else. If we do that in the documentation and tutorials and someone still does only use optional stuff, then the model developer has to live with the consequences of not being consumable by use easily and not truly being on-spec.

I am thinking can we make it a list of dictionary to contain more information about the axis?

I really like that proposal. I would keep the axes and axes_description separate though, because otherwise we break compatibility with the current version of the spec.

Also like this one.

FynnBe commented 3 years ago

But this is a free for all ...

Totally agreed, let's avoid a free for all situation!

I really like that proposal. I would keep the axes and axes_description separate though, because otherwise we break compatibility with the current version of the spec.

I am actually in favor of keeping they axes, but treat it as axes description, e.g. 'extending' our axes key. For the standard cases we have now we can transform axes to axes description automatically when packaging. There is some converter logic already with the validator.

constantinpape commented 3 years ago

That's why we have to define very clearly what our axes are

Yes, I totally agree that we need to describe them better.

If we do that in the documentation and tutorials and someone still does only use optional stuff, then the model developer has to live with the consequences of not being consumable by use easily and not truly being on-spec.

I am still not convinced we should allow other letters. This would also mean that we can do much less static checks if axes are compliant with the spec.

We can discuss if we use something like SHOULD NOT contain any other letters instead (cf. https://tools.ietf.org/html/rfc2119). But then the wording should make clear that this is highly discouraged.

I am actually in favor of keeping they axes, but treat it as axes description, e.g. 'extending' our axes key. For the standard cases we have now we can transform axes to axes description automatically when packaging. There is some converter logic already with the validator.

@FynnBe sorry, I don't quite understand this comment.

FynnBe commented 3 years ago

@FynnBe sorry, I don't quite understand this comment.

I don't like the redundancy of having an axes field and a axes_description field. and the most straight forward field name is axes, but I, too, like the more extensive description of axes with a list of dicts. Right now the name field in Wei's poposed axes_description section not very meaningful and just a long version of the axes letter. If we could replace it with an enum like field (that should not be called name, as name is never an enum in our spec), e.g. axis, role or kind (kind is too similar to type though), then we can drop the axes letters and call the field that contains the axes description axes instead of axes_description

maybe something like:

# example with bug table 
outputs:
 - name: bug_table
   ## axes: btic # drop this
   shape: [1, null, 2]
   axes:  # now called 'axes'
       - axis/role: "batch"  # one of ["batch", "index", "time", "channel", "z", "y", "x"]
         type: "ordinal"  # would be default for "batch" and only valid type option
       ...
       - axis/role: "channel"
         description: "features for each detected bug, including size and age"
         type: "nominal"
         labels: ["size", "age"]
         units: ["meter", "year"] # we have really big and old bugs!

of course we can have the same discussion again if we want to allow not only "channel", but also "feature" and "category", but with description, type, labels, and units describing the nature of the axis, the role of it might be considered a "channel".

We could see how far we get with this. It is easily extendible in the future, while sticking very close to the minimum that we currently need...

oeway commented 3 years ago

But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes.

I agree that "free for all" is not what we want, but I don't agree with your interpretation. We can work on the wording carefully and make it as a case that is not strictly prohibited, but highly discouraged.

My point is, having to changing the model architecture in order to fit our spec won't be an option. To move forward, I am totally fine with the strict definition if you can provide a solution for >3D non-image/non-spatial tensors. Since xyz and t are have very specific meaning, we will likely need b in all cases, the only left letter c won't work for any additional axes (e.g. the example I mentioned above that produce two tables in one tensor). This should be considered as a bug of the current design.

I proposed 3 solutions: 1) allow duplication so we can use several c 2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed 5) ask the user to change the network architecture. There seems also another proposal about 6) postpone it and add change the spec later

I think 5) is not an option because it's too much work for the developer 6) means we will reject models that do not fit to our current definition. I suggest that we either come up with a 7th solution or choose one from 1-4.

oeway commented 3 years ago

I like @FynnBe 's proposal about renaming axes_description to axes and absorb it as a key, e.g. role. For providing backward compatibility, we can indeed automatically generate the old axes key, and print out a deprecation warning. Sometime later we can completely remove that.

constantinpape commented 3 years ago

If you want we can put it all into axes, but then we have to follow semantic versioning and bump the version to 0.4.0 (even if we provide some automatic conversion; but that is independent of the spec description itself).

FynnBe commented 3 years ago

I proposed 3 solutions: 1) allow duplication so we can use several c 2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed 5) ask the user to change the network architecture. There seems also another proposal about 6) postpone it and add change the spec later

how about 7) allow for duplicate i to represent a multiindex

rationale: If i is involved the tensor represents some tabular data that might as well be indexed with multiindex.
advantage over 1): in case of combination with zyx axes it's clear what to display (there is only a single c) advantage over 2-4): everything is defined, no room for any degree of "free for all"

esgomezm commented 3 years ago

Hi there!

Sorry for being so strict here, but I think the whole discussion revolves around one of the core issues I have with the model zoo direction: we are still in an early beta stage and don't have any significant outside user adoption. That's fine, these things take time. But our priority should be making it work for the current scope of serving image data related models. And I think if we start to introduce very significant changes to the spec now to accomodate more custom use-cases we will either be back to 2 months or more of spec discussions or introduce changes that we don't really understand and that are not described properly. I am not arguing for completely freezing the spec as is and not thinking about custom use-cases at all. But I think right now the focus should be much more on stability.

Totally agree with @constantinpape. Making very small changes sounds reasonable but updating the specs to version 0.4.0 seems for me starting again with the adaptation of our software to the new version of the model.yaml, which in our case I'm not sure we can really go for it.

2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed

I agree with this but the only problem is that the CI checker couldn't evaluate this field or output an error message as it happens now with this image classification model. On the other hand, any model uploaded to the BioImage Model Zoo should be compatible with at least one software. If this can be ensured, then I do not see further problems.

... e.g. the example I mentioned above that produce two tables in one tensor

Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.

Trying to see this from a high perspective, I think the main problem here is dealing with whatever that is not an image of btxyzc and being able to display all casuistics. So I'd try to make this simpler at least for now.

What I see:

Can you display anything that is not an image with more than 2 dimensions? If so, it would imply specific post-processing or some more code (a table is 2D so displaying larger dimensions seems not obvious to me) --> software specific model, and maybe here the config field should be used together with the run_mode.
If the tensor is an image, I see no differences between having i, f or c or ccc (once x, y, z or t are specified) in terms of visualization unless you want to make very specific displays. This goes in hand with @oeway comments and could be a solution for >6D images ("I proposed 3 solutions: 1) allow duplication so we can use several c"). The only reason why it might be important to distinguish i, f or c is to use the model afterward in a larger workflow which is more than just processing an image with a model --> Is this in the scope (for the moment)?

Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here https://github.com/bioimage-io/spec-bioimage-io/issues/12). Then:

If it is an image, more axis letters than btxyzc might not be necessary if we allow duplicate names. (We can of course limit this to 6D and if we do not, models with larger dimensions would just not be compatible with some software but this shouldn't be a problem at all).
If it is not an image, how many dimensions are we ready to support as software? If the tensor is non-image/non-spatial, does it really matter if you use xy as long as it is not an image? Semantically it's different because you already know that it is not an image and in that case, all these rows/y/i/, columns/x/channels/f/d/... seem to be equivalent. If we do not want to add specific tags about image/non-image, we can just use different letters for images and non-images.

constantinpape commented 3 years ago

If we take a step back, I think we have a lot we agree on now: I think everyone is happy with a new form of the axes, similar to what @FynnBe wrote down in https://github.com/bioimage-io/spec-bioimage-io/issues/70#issuecomment-823465395. As far as I can see, the only two unclear points are:

Is it allowed to repeat axes letters (or what we will call role or similar in the new layoy)? Is it allowed for just certain letters like i?
Is it allowed to add new letters that are not part of the spec? (If so this should be discouraged).

I would vote in favor implementing the new field and then discussing 1. and 2. separately.

Edit: Sorry, I started this before @esgomezm posted, I will go through this in a second.

constantinpape commented 3 years ago

Making very small changes sounds reasonable but updating the specs to version 0.4.0 seems for me starting again with the adaptation of our software to the new version of the model.yaml, which in our case I'm not sure we can really go for it.

I think if the change in axes description is the only (breaking) change we introduce the effort of updating to the new spec would be minimal and we could also still support 0.3 and just print a deprecation warning in the model checker. I see this change as not being very large, but it is breaking with our current 0.3, so we need to increase the minor version; if we stop following semantic versioning rules the whole process will become very convoluted.

I agree with this but the only problem is that the CI checker couldn't evaluate this field or output an error message as it happens now with this image classification model. On the other hand, any model uploaded to the BioImage Model Zoo should be compatible with at least one software. If this can be ensured, then I do not see further problems.

I think with ic we can easily represent tables, which should fix the case you have linked to.

Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.

Yes, this would be the same in pytorch, so I agree that should just be independent outputs.

Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here #12). Then:

* If it is an image, more axis letters than `btxyzc` might not be necessary if we allow duplicate names. (We can of course limit this to 6D and if we do not, models with larger dimensions would just not be compatible with some software but this shouldn't be a problem at all).

* If it is not an image, how many dimensions are we ready to support as software? If the tensor is non-image/non-spatial, does it really matter if you use `xy` as long as it is not an image? Semantically it's different because you already know that it is not an image and in that case, all these `rows/y/i/`, `columns/x/channels/f/d/...` seem to be equivalent. If we do not want to add specific tags about image/non-image, we can just use different letters for images and non-images.

I personally like that we don't need to have different rules for different kind of tensors in the current proposal. This makes it easier to describe and verify the spec, because there are no conditionals we need to introduce. From the software perspective, I would hope that the new axes are pretty easy to intepret:

xy, cxy, xyz, cxyz (and the same with a t): this is image data, should be straight forward to display. By default c would be treated as a "stack of channels". Other display modes (like rgb / argb) would need to be specified in some other field; following the suggestion by @oeway maybe we could look how vega does this.
ic: this is table data
anything else mixing i and x or y etc: this is weird, I can probably not display it.

Note that this is just my interpretation, any consumer can implement this as they see fit and there is no need to have this as an explicit requirement in the spec (might still be good to add some description that helps interpretation thoguh).

oeway commented 3 years ago

how about 7) allow for duplicate i to represent a multiindex

@FynnBe I am ok with your proposal 7) about allow duplicating i.

Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.

@esgomezm yes, you can do that as a workaround, but that does mean we are impose some limitation on the model design. I think we should avoid having this type of constrains, designing models and train them is already challenge enough, we shouldn't force them to change the model in order to be listed in our model zoo.

Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here #12).

I agree with @constantinpape that we should not have conditional definition of the axes. However, we can think about adding a new field in the new definition of axes to explicitly say if the axis is spatial. For now I propose we use ordinal and nominal for the axis type, but we can also add spatial for example, for x, y, z although that will be a bit redundant to the role key.

I think if the change in axes description is the only (breaking) change we introduce the effort of updating to the new spec would be minimal and we could also still support 0.3 and just print a deprecation warning in the model checker.

I am fine with both minor or major bump of the version number. I think if we bump to 0.4.0, that should be also fine, we can let 0.3.0 and 0.4.0 co-exists. I mean especially later we will have multiple version co-exists anyway, upgrading the spec does not mean everyone have to upgrade their models, a deprecation warning would be enough.

As far as I can see, the only two unclear points are: Is it allowed to repeat axes letters (or what we will call role or similar in the new layoy)? Is it allowed for just certain letters like i? Is it allowed to add new letters that are not part of the spec? (If so this should be discouraged). I would vote in favor implementing the new field and then discussing 1. and 2. separately.

Sounds good to me.

FynnBe commented 3 years ago

Once the automatic packaging is up to speed, we can just "repackage" the current models and have the automatic conversion change the version number and in this case, axes content (minor or major, I don't mind either). So we wouldn't have to work with two versions (Lets avoid a python2 vs python3 type of situation at all cost^^). In the worst case someone would have to redownload the model to get the up to date version.

FynnBe commented 3 years ago

I would vote in favor implementing the new field and then discussing 1. and 2. separately.

I think this is a great idea. We already made a huge step forward on this axes discussion. A future change to duplicate an axis key or adding another one will be comparatively small change, so maybe we just move forward with what we definitely agree on for now...

FynnBe commented 3 years ago

Finalizing the 0.4.0:

# example with bug table 
outputs:
 - name: bug_table
   shape: [1, 5, null, 2]
   # test_tensor: <uri>  # ida for next version: move this mandatory test field here (instead of `test_inputs/outputs`)! 
   axes:  # now called 'axes'
       - role: batch  # one of ["batch", "index", "time", "channel", "z", "y", "x"]
         # all other fields are invalid for "batch"
       - role: time
         label: "space time" # but may not be a list!  (same for zyx and index)
         step: 10  # optional; default 1
         unit: ms  # optional; default: null; free text, but recommended standard from documented list, e.g. https://www.maplesoft.com/support/help/Maple/view.aspx?path=Units/SI
       - role: x
         step: 200  # default: 1
         unit: nm  # 
       - role: index
         label: bug
         # step: <here invalid>  # forbidden for role "index"
         # unit: <here invalid>  # forbidden for role "index"
       - role: channel
         description: "features for each detected bug, including size and age"  # optional; char limit: 128
         # type: "nominal"  # mandatory non-default (guess for old models: "nominal")  => discuss addition of 'types' separately...
         label: ["size", "age"]  # optional; if list, then needs to match shape (and unit if specified); 
         unit: ["meter", "year"]  # optional; if list, then needs to match shape (and label if specified); we have really big and old bugs!
         # step: <here invalid>  # must not be specified if label or unit is a list
 - name: bug_table2
      ...

constantinpape commented 3 years ago

Just to summarize the result on the meeting about this today:

we will first add i to our old-style axes and bump the PATCH, see #75.
then we will implement the non-controversial parts of the new axes, summarized in the comment above.
afterwards we can discuss the parts that are still controversial or unclear:
- do we allow duplication of roles? Only for the channel role?
- do we move test_input/output to the axes and call it test_tensor (or similar)
- do we introduce the type with options nominal or ordinal (which would for example constraint the label and unit fields).

oeway commented 3 years ago

@FynnBe What's the status with the axes spec?

FynnBe commented 3 years ago

For now I'm working on finalizing the 0.3.2 as we decided to implement the axes changes in 0.4.0...

FynnBe commented 2 years ago

suggestions for preliminary decisions:

afterwards we can discuss the parts that are still controversial or unclear:

do we allow duplication of roles? Only for the channel role?

no. currently we don't have use-case for this and it is easy to add later (but not easy to remove again)

do we move test_input/output to the axes and call it test_tensor (or similar)

yes. this change is autoconvertible (can be undone in any future version) and just makes a lot of sense! edit: we should move in sample_input/sample_output too then

do we introduce the type with options nominal or ordinal (which would for example constraint the label and unit fields).

we add it as an optional field for now. this way we don't have more work converting models, but can play around with the idea more and specify it for new models which then don't need manual conversion if we make type mandatory in the future.

FynnBe commented 2 years ago

Seems to me that breaking shape down per axis would make sense as well. We could move shape into axes as length (or size). default could be null. For the channel role it could be inferred if labels/units are given. for some roles we might allow the min/step combination... we could also make it mandatory for some roles (probably should for time, xyz; c if labels/units is not a list).

edit: this would also allow different reference tensors for each dimension edit2: halo should be moved in as well then

FynnBe commented 2 years ago

Question: We use the axes characters bcitxyz in many places (e.g. also in the pre-/postprocessing kargs). Should we generally accept b as an alias for batch, c for channel, etc..? Or do we from now on require to specify the axes as a list of strings with the full role name?

FynnBe commented 2 years ago

Question 2: Are there any use-cases where step make sense for a channel axis?

I am confused about the label/unit of the actual values vs the label/unit of the channel itself. All other axis exclusively describe the axis and do not inform about the values. This is different for the channel axis, like in the example above:

 label: ["size", "age"]  # optional; if list, then needs to match shape (and unit if specified); 
 unit: ["meter", "year"]

here the recorded values are in meter and in years. a step from the 'size' channel to the 'age' channel does not make sense.

For the example of multi-channel microscopy I would expect something like this:

label: [500nm, 650nm]  # the wavelengths used for each channel
unit: lx*s  # unit of the recorded values (here: lux seconds to record exposure)

which would be equivalent to

label: [500nm, 650nm]  # the wavelengths used for each channel
unit: [lx*s, lx*s]  # unit of the recorded values (here: lux seconds to record exposure)

This is now mixing a label per channel with the unit of the recorded values.

Using the channel axis like any other axis would look like this for example:

label: wavelength
unit: nm
step: 100

this would mean each channel is 100nm apart from the next. This would further require an offset... This does not make sense! Another problem here is that we then do not have a field to describe the unit of the recorded values.

I think what we had in mind so far is this 'mixed' use above. labels refer to the channels like for any other axis role, but unit always refers to the recorded values. If there is a single label for multiple channels e.g. 'rgb', the channels are implicitly named with an index, e.g. rgb[0], rgb[1], rgb[2]. (although the rgb example would be nicer as a list: ['r', 'g', 'b'])

Problem: for a single implicit channel there is no way of specifying the unit -> solution: make an explicit singleton axis to describe the channel

If we always want this mixed use then step never makes sense for the 'channel' axis...

edit: in short: if the channel axis is nominal there is no need for step/unit and if it is ordinal then step and unit are included in the label already and 'unit' refers to the recorded values.

FynnBe commented 2 years ago

maybe for the 'channel' axis we should rename unit to value_unit to make the distinction clearer. (and then forbid a channel axis to have unit and step). And if we come up with use-cases that require 'channel' to have unit/step we can add it without clashing with the value unit... (I'm assuming this is a reasonable solution for now...)

constantinpape commented 2 years ago

My 2cents:

do we want to do this now? I agree that it would be nice in principle, but we are back to having many moving parts if we do it together with the workflow spec.
In general: how about trying to re-use the axes metadata from ngff: https://ngff.openmicroscopy.org/latest/#axes-md