Closed FynnBe closed 7 months ago
I guess you don't want comments here again?
I guess you don't want comments here again?
no, of course comments are fine (I updated the poll, its options were cut off). Anything going beyond the scope of comments and minor changes should go to another dedicated issue, though.
I don't think we should allow for >6d (i.e. btczyx) data. Otherwise we need to start to reason about flattening on the spec level, which just introduces unnecessary complexity.
Further, if we introduce the additional axes_description
I don't really see the need for anything but c
. The difference between c
, i
, f
would be purely semantic anyways, so one might as well check the axes description for it.
Thanks @FynnBe for the summary, I am fine with the proposal and I am thinking an alternative solution to add c
, i
, f
is to make the axes
key optional non-image tensor input or output.
It totally make sense for image tensor to stay with btczyx
, but it becomes problematic if we force non-image tensor to use these 6 types of axes.
@constantinpape Let me give you some examples where c
won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can use c
for the the three coordinate, but for along the location instances, we won't find a letter from btczyz
that fit well. We will likely need something like i
and define axes=bic
. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example, axes=bxy
so other software can also read it and use an image to represent the xyz coordinates.
There are other cases which c
won't be enough, for example I am working on a model which takes an image and a protein sequence as input. For the protein sequence input, we use a 1-hot encoding to encode the 20 amino acid, so a sequence will be come a 2dimensional table. In that case, we will likely need to use c
for the encoding and i
for the instance of each amino acid letter.
More generally, for non-image data which has more than 3 channels, we will have the situation that b
, c
, i
won't be enough and we cannot use t
, x
, y
, z
.
I don't think we should allow for >6d (i.e. btczyx) data. Otherwise we need to start to reason about flattening on the spec level, which just introduces unnecessary complexity. Further, if we introduce the additional
axes_description
I don't really see the need for anything butc
. The difference betweenc
,i
,f
would be purely semantic anyways, so one might as well check the axes description for it.
some options following this only-c
-idea for tabular data like the 'big bug table' from https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618 you would recommend to (ab)use t,z,y or x? I suppose axes_description
would make it clear and for visualization we could:
agree to interpret a single spatial dimensions as specific visualization hints, e.g. lonely x
: = i
(index). 1d image -> table!
# example with big table
outputs:
- name: bug_table # table with individual bugs
axes: bxc # single `x` means index (-> table axes may not contain `y` or `z`)
shape: [1, 100, 2] # btw, we don't allow for an unknown size (yet), which would be important for tables, there are not 100 bugs on every page of code after all.
axes_description: ["batch", "bug", ["size", "age"]]
make visualization hints more explicit and add axes_type
or similar(, where only certain keys are allowed for certain axes keys, e.g. b
in ["batch"], x
in ["x-axis", "index"], c
in ["rgb", "image channel", "feature"]).
# example with big table
outputs:
- name: bug_table # table with individual bugs
axes: bxc
axes_type: ["batch", "index", "feature"]
shape: [1, 100, 2] # btw, we don't allow for an unknown size (yet), which would be important for tables, there are not 100 bugs on every page of code after all.
axes_description: ["batch", "bug", ["size", "age"]]
channel_type
instead of axes_type
: like 2, but only c
can have multiple meanings, but c
may occur more than once.
not allow for wide tables and limit it to a fixed (small) number of columns, only using the # example with fixed columns/ small table from https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618:
outputs:
# table with individual bugs
- name: bug_size # first column
axes: bf
shape: [1, 100]
...
- name: bug_age # second column
axes: bf
shape: [1, 100]
...
@constantinpape Let me give you some examples where
c
won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can usec
for the the three coordinate, but for along the location instances, we won't find a letter frombtczyz
that fit well. We will likely need something likei
and defineaxes=bic
. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example,axes=bxy
so other software can also read it and use an image to represent the xyz coordinates.There are other cases which
c
won't be enough, for example I am working on a model which takes an image and a protein sequence as input. For the protein sequence input, we use a 1-hot encoding to encode the 20 amino acid, so a sequence will be come a 2dimensional table. In that case, we will likely need to usec
for the encoding andi
for the instance of each amino acid letter.
both of these cases boil down to a table. I see that for 20 columns the above option 4. is already too tedious and unintuitive, but options 1-3 could match your use cases here.
More generally, for non-image data which has more than 3 channels, we will have the situation that
b
,c
,i
won't be enough and we cannot uset
,x
,y
,z
.
I see that we need a bit more flexibility than the hard btczyx
image tensor, but maybe it's best to restrict ourselves to the minimal extension that we really need right now and wait for a concrete use case to expand our definitions further. I am sure the ngff discussions will bring more insights over time, so let's aim at an intermediate solution that works for all models currently in and in-line for the model zoo.
@constantinpape Let me give you some examples where
c
won't be enough. In a single molecule localization model which takes an image as input and produce a list of xyz coordinates (e.g. deeploco), in this case you can usec
for the the three coordinate, but for along the location instances, we won't find a letter frombtczyz
that fit well. We will likely need something likei
and defineaxes=bic
. Or we rather not define the axes, but let the consumer software to handle it. @FynnBe 's argument for enforcing a common standard for axes to enable interoperability for consumer software, however, I think it won't work for all cases. At one point, we will have to accept that fact that some models will only work with a very specifically designed consumer software. In this case, I think it won't make sense to force the developer to set, for example,axes=bxy
so other software can also read it and use an image to represent the xyz coordinates.
Ok, I understand the use-case for i
now. It basically means that you have a list index for a dimension where the size of this dimension might not be known a priori, e.g. because you can have different number of instances in an image.
I agree that it makes sense to introduce this as a different letter, because it is semantically very different from a channel (where you know exactly the size of the channel dimension).
However, note that this implies that b
might not be naively applicable, because you might have different numbers of instances per sample, so you get a ragged array instead of a tensor. Could be solved by padding or restricting to b=1.
More generally, for non-image data which has more than 3 channels, we will have the situation that
b
,c
,i
won't be enough and we cannot uset
,x
,y
,z
.
I disagree here though: there is nothing fundamentally different of stacking 3 or 20 channels. For displaying purposes, channels would be treated as a stack by default (i.e. each channel is shown individually).
For other display modes, e.g. treating 3 channels as rgb, hints could be stored in the axes_descriptors
. There are so many different use cases here that it doesn't make much sense to standardize it further; hence text in axes_descriptors
are not restricted and a consumer can use it to store its custom visualisation syntax if necessary.
This can cover a wide variety of use-cases, e.g. also multi-dimensional axes can be flattened into c
without the need to formalize all this in the spec.
Again, I see the need for i
now, because it covers the use-case where data cannot be simply stacked.
I support @FynnBe's last comment. I think each of our software solutions can (and should) support the extra axes in the future for non-image data to visualize for the user. IF we encounter models that need even more dimensions, we can do one of these things:
But this is again a discussion we can do when we encounter such models. Now we have a clear example thanks to Wei's work.
Ok, there is some simultaneous posting going on :D.
To summarize my stance:
I would vote for option 1 in the comment from @FynnBe (https://github.com/bioimage-io/spec-bioimage-io/issues/70#issuecomment-823152000), but I do agree with @oeway that we should introduce a new axis letter i
.
This is semantically different from c
, because it describes "list-like" dimensions, for which the size of the dimension might not be known a priori.
Similar to @tomburke-rse, I think we should only introduce further letters if a use case that does not fit this spec at all (i.e. that is impossible or extremely inconvenient to express. It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.)
Edit: Also, I hope that in the long-term OME-NGFF will define a more general standard for image metadata that we will be able to use. This will likely still take some time, and the process will involve many stakeholders. So we need a workable spec that fits our needs, but we should not try to come up with a general purpose image metadata spec independently.
Hi @constantinpape and @FynnBe I think we need to clarify the difference with different channels vs different axes a bit more. Considering the following example for the single molecule localisation or tracking model where it produce a list of xyz coordinates. | x | y | z |
---|---|---|---|
1.2 | 100 | 200 | |
2.0 | 2.9 | 100 | |
100 | 200 | 300 |
For the horizontal axis, we may use the letter c
or f
, and for the vertical dimension, we are likely need i
instead of one of bctxyz
, because the semantical meaning is about each individual molecule (not because its size is undefined).
In this case, we will define the output tensor as a 3D axes
as bic
(deeploco for example allows predicting 256 molecules maximum, so the i
axes is padded to have length 256)
I disagree here though: there is nothing fundamentally different of stacking 3 or 20 channels. For displaying purposes, channels would be treated as a stack by default (i.e. each channel is shown individually).
Well, I think it won't work unless we force everyone to flatten axes that does not fall into our defined axes. For example, in the above example, if we now say the model not only output a table of x, y, z coordinate, but it will output an additional table that contains the uncertainty scores for each predicted numbers in the first table. Together with the batch dimension, we will now have a 4 dimensional output bic
plus another channel-like axis with length 2 (the first one is real coordinates, and the second one is error), the what letter should we use for the 4th dimension? We cannot use c
again because we do not allow duplication @FynnBe @constantinpape
Similar to @tomburke-rse, I think we should only introduce further letters if a use case that does not fit this spec at all (i.e. that is impossible or extremely inconvenient to express. It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.)
I agree with introducing more letters later, but I am thinking that we should provide a more gentle way for those future letters to land. I mean if we do not allow custom axes letter or leave the axes
key optional, models with those axes cannot enter the model zoo (because they won't pass the validator). I think we need a way to allow those models to enter the model zoo first, then offer the chance to discuss and absorb the new axis definition.
Also, I hope that in the long-term OME-NGFF will define a more general standard for image metadata that we will be able to use.
That's good! However, I think because OME-NGFF is a image format, there won't be much non-image axes involved, no?
In this case, we will define the output tensor as a 3D
axes
asbic
(deeploco for example allows predicting 256 molecules maximum, so thei
axes is padded to have length 256)
I think bic
is a good solution here. If you can pad here because you know the maximal size of i
in this case this is good;
but there might very well be cases for i
were this is not possible. E.g. a model that can do differentiable instance segmentation and returns properties for each instance in an image. But we don't need to discuss that in too much detail now; I just wanted to mention that this is a potential concern for the i
dimension.
Well, I think it won't work unless we force everyone to flatten axes that does not fall into our defined axes. For example, in the above example, if we now say the model not only output a table of x, y, z coordinate, but it will output an additional table that contains the uncertainty scores for each predicted numbers in the first table.
You can either flatten OR return a second tensor for the uncertainty scores. So I think this fits in the current proposal without big issues. I want to stress this again:
It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.
I agree with introducing more letters later, but I am thinking that we should provide a more gentle way for those future letters to land. I mean if we do not allow custom axes letter or leave the
axes
key optional, models with those axes cannot enter the model zoo (because they won't pass the validator). I think we need a way to allow those models to enter the model zoo first, then offer the chance to discuss and absorb the new axis definition.
I think we should avoid adding new letters as much as possible and only do this if there is really no other way. Hence, I think it's a better strategy to discourage this and encourage developers to use the current specification.
That's good! However, I think because OME-NGFF is a image format, there won't be much non-image axes involved, no?
There are already several use-cases under discussion that are similar to the issues here. But yes, it's primarily an image data format. But we are also called bioimage.io ;).
Sorry for being so strict here, but I think the whole discussion revolves around one of the core issues I have with the model zoo direction: we are still in an early beta stage and don't have any significant outside user adoption. That's fine, these things take time. But our priority should be making it work for the current scope of serving image data related models. And I think if we start to introduce very significant changes to the spec now to accomodate more custom use-cases we will either be back to 2 months or more of spec discussions or introduce changes that we don't really understand and that are not described properly. I am not arguing for completely freezing the spec as is and not thinking about custom use-cases at all. But I think right now the focus should be much more on stability.
@FynnBe Thanks for listing the options for axis_description
. I am thinking can we make it a list of dictionary to contain more information about the axis? Basically extend your option 1 with a time dimension:
# example with bug table
outputs:
- name: bug_table # table with individual bugs
axes: btic
shape: [1, null, 2] # use null if we don't know the length
axes_description:
- name: "batch"
type: "ordinal"
- name: "time"
description: "frame number in the movie"
type: "ordinal"
step: 3 # for ordinal axis, we can define the step/interval, for `x`, `y`, it can be pixel size
unit: "second" # the unit of time, for `x`, `y`, it can be `pixels` or `um`
- name: "index"
description: "index of the bugs"
type: "ordinal"
step: 1
- name: "features"
description: "features for each detected bug, including size and age"
type: "nominal"
labels: ["size", "age"]
units: ["meter", "year"] # we have really big and old bugs!
In the above example, here are the changes:
x
to i
because I think we should not change the meaning of x
since they are completely different.axes_description
becomes a dictionary where we can add more meta dataaxis
should have a type, either ordinal
(e.g. btxyz
) or nominal
(e.g. c
).ordinal
axis can set step
and unit
nominal
axis can set labels
for its column name and units
for the unit for each columnThese are inspired by Vega which I recommend to take a look. It defines very thorough grammar for interactive data visualization. It is highly relevant for us especially for non-image data. We can even directly connect our tensor directly to vega and allow interactive visualisation of tensors. For axes
, they have a table of properties which we can take inspirations from.
Another idea is that if we have the axes_description
field, shouldn't we simply absorb the single letter axes
, and spell out the letters and add them as a keyword under axes_description
?
You can either flatten OR return a second tensor for the uncertainty scores. So I think this fits in the current proposal without big issues. I want to stress this again: It's totally fine if custom use cases need a bit of massaging to fit the spec, because otherwise we will need to change it for every new use-case.
I think we all agree that we don't want a lot of changes, but an easier strategy is to propose a fallback plan for cases that does not fit to our current definition. I would simply add "if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter". Like that we don't have to exclude those non-standard models. And we provide a path to include new axis letters. Keep in mind that making changes to the model architecture is much more difficult, especially when the user already have a published model and they simply want to add an additional model.yaml
to add to our model zoo. As a common strategy, we should avoid asking the developers to change their model in order to fit our spec, the spec should be more inclusive. We will suffer from spec stability issue only if we force an incomplete list and do not allow custom axes.
"if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter"
But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes. Again, given that we don't have any real outside contributions yet, I don't think we should adjust the spec to match any anticipate non-custom model. If we see that this is a real problem at some point we can still change it but for now I think it's much better to be strict here.
I am thinking can we make it a list of dictionary to contain more information about the axis?
I really like that proposal. I would keep the axes
and axes_description
separate though, because otherwise we break compatibility with the current version of the spec.
"if our current axes letters doesn't fit your case, you are free to use any non-standard axes letter"
But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes. Again, given that we don't have any real outside contributions yet, I don't think we should adjust the spec to match any anticipate non-custom model. If we see that this is a real problem at some point we can still change it but for now I think it's much better to be strict here.
That's why we have to define very clearly what our axes are for and what it means if you use anything else. If we do that in the documentation and tutorials and someone still does only use optional stuff, then the model developer has to live with the consequences of not being consumable by use easily and not truly being on-spec.
I am thinking can we make it a list of dictionary to contain more information about the axis?
I really like that proposal. I would keep the
axes
andaxes_description
separate though, because otherwise we break compatibility with the current version of the spec.
Also like this one.
But this is a free for all ...
Totally agreed, let's avoid a free for all situation!
I really like that proposal. I would keep the
axes
andaxes_description
separate though, because otherwise we break compatibility with the current version of the spec.
I am actually in favor of keeping they axes
, but treat it as axes description, e.g. 'extending' our axes
key. For the standard cases we have now we can transform axes to axes description automatically when packaging. There is some converter logic already with the validator.
That's why we have to define very clearly what our axes are
Yes, I totally agree that we need to describe them better.
If we do that in the documentation and tutorials and someone still does only use optional stuff, then the model developer has to live with the consequences of not being consumable by use easily and not truly being on-spec.
I am still not convinced we should allow other letters. This would also mean that we can do much less static checks if axes
are compliant with the spec.
We can discuss if we use something like SHOULD NOT contain any other letters instead (cf. https://tools.ietf.org/html/rfc2119). But then the wording should make clear that this is highly discouraged.
I am actually in favor of keeping they
axes
, but treat it as axes description, e.g. 'extending' ouraxes
key. For the standard cases we have now we can transform axes to axes description automatically when packaging. There is some converter logic already with the validator.
@FynnBe sorry, I don't quite understand this comment.
@FynnBe sorry, I don't quite understand this comment.
I don't like the redundancy of having an axes
field and a axes_description
field. and the most straight forward field name is axes
, but I, too, like the more extensive description of axes with a list of dicts. Right now the name
field in Wei's poposed axes_description
section not very meaningful and just a long version of the axes letter. If we could replace it with an enum like field (that should not be called name
, as name
is never an enum in our spec), e.g. axis
, role
or kind
(kind
is too similar to type
though), then we can drop the axes letters and call the field that contains the axes description axes
instead of axes_description
maybe something like:
# example with bug table
outputs:
- name: bug_table
## axes: btic # drop this
shape: [1, null, 2]
axes: # now called 'axes'
- axis/role: "batch" # one of ["batch", "index", "time", "channel", "z", "y", "x"]
type: "ordinal" # would be default for "batch" and only valid type option
...
- axis/role: "channel"
description: "features for each detected bug, including size and age"
type: "nominal"
labels: ["size", "age"]
units: ["meter", "year"] # we have really big and old bugs!
of course we can have the same discussion again if we want to allow not only "channel"
, but also "feature"
and "category"
, but with description
, type
, labels
, and units
describing the nature of the axis, the role of it might be considered a "channel".
We could see how far we get with this. It is easily extendible in the future, while sticking very close to the minimum that we currently need...
But this is a free for all and can result in developers just using any letters they think fits because we effectively don't enforce anything. This is bad, e.g. for ilastik where the current axes letters more or less map to the axis tags that are used to label the axes.
I agree that "free for all" is not what we want, but I don't agree with your interpretation. We can work on the wording carefully and make it as a case that is not strictly prohibited, but highly discouraged.
My point is, having to changing the model architecture in order to fit our spec won't be an option. To move forward, I am totally fine with the strict definition if you can provide a solution for >3D non-image/non-spatial tensors. Since xyz
and t
are have very specific meaning, we will likely need b
in all cases, the only left letter c
won't work for any additional axes (e.g. the example I mentioned above that produce two tables in one tensor). This should be considered as a bug of the current design.
I proposed 3 solutions: 1) allow duplication so we can use several c
2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed 5) ask the user to change the network architecture. There seems also another proposal about 6) postpone it and add change the spec later
I think 5) is not an option because it's too much work for the developer 6) means we will reject models that do not fit to our current definition. I suggest that we either come up with a 7th solution or choose one from 1-4.
I like @FynnBe 's proposal about renaming axes_description
to axes
and absorb it as a key, e.g. role
. For providing backward compatibility, we can indeed automatically generate the old axes
key, and print out a deprecation warning. Sometime later we can completely remove that.
If you want we can put it all into axes
, but then we have to follow semantic versioning and bump the version to 0.4.0 (even if we provide some automatic conversion; but that is independent of the spec description itself).
I proposed 3 solutions: 1) allow duplication so we can use several
c
2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed 5) ask the user to change the network architecture. There seems also another proposal about 6) postpone it and add change the spec later
how about
7) allow for duplicate i
to represent a multiindex
rationale: If i
is involved the tensor represents some tabular data that might as well be indexed with multiindex.
advantage over 1): in case of combination with zyx
axes it's clear what to display (there is only a single c
)
advantage over 2-4): everything is defined, no room for any degree of "free for all"
Hi there!
Sorry for being so strict here, but I think the whole discussion revolves around one of the core issues I have with the model zoo direction: we are still in an early beta stage and don't have any significant outside user adoption. That's fine, these things take time. But our priority should be making it work for the current scope of serving image data related models. And I think if we start to introduce very significant changes to the spec now to accomodate more custom use-cases we will either be back to 2 months or more of spec discussions or introduce changes that we don't really understand and that are not described properly. I am not arguing for completely freezing the spec as is and not thinking about custom use-cases at all. But I think right now the focus should be much more on stability.
Totally agree with @constantinpape. Making very small changes sounds reasonable but updating the specs to version 0.4.0 seems for me starting again with the adaptation of our software to the new version of the model.yaml
, which in our case I'm not sure we can really go for it.
2) use a letter to represent undefined axis 3) make it optional 4) it's discouraged but allow custom axis and @constantinpape proposed
I agree with this but the only problem is that the CI checker couldn't evaluate this field or output an error message as it happens now with this image classification model. On the other hand, any model uploaded to the BioImage Model Zoo should be compatible with at least one software. If this can be ensured, then I do not see further problems.
... e.g. the example I mentioned above that produce two tables in one tensor
Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.
Trying to see this from a high perspective, I think the main problem here is dealing with whatever that is not an image of btxyzc and being able to display all casuistics. So I'd try to make this simpler at least for now.
What I see:
config
field should be used together with the run_mode
.i
, f
or c
or ccc
(once x
, y
, z
or t
are specified) in terms of visualization unless you want to make very specific displays. This goes in hand with @oeway comments and could be a solution for >6D images ("I proposed 3 solutions: 1) allow duplication so we can use several c").
The only reason why it might be important to distinguish i
, f
or c
is to use the model afterward in a larger workflow which is more than just processing an image with a model --> Is this in the scope (for the moment)?Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here https://github.com/bioimage-io/spec-bioimage-io/issues/12). Then:
btxyzc
might not be necessary if we allow duplicate names. (We can of course limit this to 6D and if we do not, models with larger dimensions would just not be compatible with some software but this shouldn't be a problem at all).xy
as long as it is not an image? Semantically it's different because you already know that it is not an image and in that case, all these rows/y/i/
, columns/x/channels/f/d/...
seem to be equivalent. If we do not want to add specific tags about image/non-image, we can just use different letters for images and non-images.If we take a step back, I think we have a lot we agree on now:
I think everyone is happy with a new form of the axes
, similar to what @FynnBe wrote down in https://github.com/bioimage-io/spec-bioimage-io/issues/70#issuecomment-823465395.
As far as I can see, the only two unclear points are:
axes
letters (or what we will call role
or similar in the new layoy)? Is it allowed for just certain letters like i
?I would vote in favor implementing the new field and then discussing 1. and 2. separately.
Edit: Sorry, I started this before @esgomezm posted, I will go through this in a second.
Making very small changes sounds reasonable but updating the specs to version 0.4.0 seems for me starting again with the adaptation of our software to the new version of the
model.yaml
, which in our case I'm not sure we can really go for it.
I think if the change in axes description is the only (breaking) change we introduce the effort of updating to the new spec would be minimal and we could also still support 0.3 and just print a deprecation warning in the model checker. I see this change as not being very large, but it is breaking with our current 0.3, so we need to increase the minor version; if we stop following semantic versioning rules the whole process will become very convoluted.
I agree with this but the only problem is that the CI checker couldn't evaluate this field or output an error message as it happens now with this image classification model. On the other hand, any model uploaded to the BioImage Model Zoo should be compatible with at least one software. If this can be ensured, then I do not see further problems.
I think with ic
we can easily represent tables, which should fix the case you have linked to.
Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.
Yes, this would be the same in pytorch, so I agree that should just be independent outputs.
Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here #12). Then:
* If it is an image, more axis letters than `btxyzc` might not be necessary if we allow duplicate names. (We can of course limit this to 6D and if we do not, models with larger dimensions would just not be compatible with some software but this shouldn't be a problem at all). * If it is not an image, how many dimensions are we ready to support as software? If the tensor is non-image/non-spatial, does it really matter if you use `xy` as long as it is not an image? Semantically it's different because you already know that it is not an image and in that case, all these `rows/y/i/`, `columns/x/channels/f/d/...` seem to be equivalent. If we do not want to add specific tags about image/non-image, we can just use different letters for images and non-images.
I personally like that we don't need to have different rules for different kind of tensors in the current proposal. This makes it easier to describe and verify the spec, because there are no conditionals we need to introduce.
From the software perspective, I would hope that the new axes
are pretty easy to intepret:
xy
, cxy
, xyz
, cxyz
(and the same with a t
): this is image data, should be straight forward to display. By default c
would be treated as a "stack of channels". Other display modes (like rgb / argb) would need to be specified in some other field; following the suggestion by @oeway maybe we could look how vega does this.ic
: this is table datai
and x
or y
etc: this is weird, I can probably not display it.Note that this is just my interpretation, any consumer can implement this as they see fit and there is no need to have this as an explicit requirement in the spec (might still be good to add some description that helps interpretation thoguh).
how about 7) allow for duplicate i to represent a multiindex
@FynnBe I am ok with your proposal 7) about allow duplicating i
.
Not sure about pytorch, but in tensorflow, wouldn't this be a multi-ouptut (list of tensors, or list of outputs)? if so, you could specify it as independent outputs.
@esgomezm yes, you can do that as a workaround, but that does mean we are impose some limitation on the model design. I think we should avoid having this type of constrains, designing models and train them is already challenge enough, we shouldn't force them to change the model in order to be listed in our model zoo.
Maybe I'm wrong, but I really think it would be much simpler if we knew first whether the tensor is an image or not, and this is regardless of any axis name (same issue here #12).
I agree with @constantinpape that we should not have conditional definition of the axes. However, we can think about adding a new field in the new definition of axes
to explicitly say if the axis is spatial. For now I propose we use ordinal
and nominal
for the axis type
, but we can also add spatial
for example, for x
, y
, z
although that will be a bit redundant to the role
key.
I think if the change in axes description is the only (breaking) change we introduce the effort of updating to the new spec would be minimal and we could also still support 0.3 and just print a deprecation warning in the model checker.
I am fine with both minor or major bump of the version number. I think if we bump to 0.4.0, that should be also fine, we can let 0.3.0 and 0.4.0 co-exists. I mean especially later we will have multiple version co-exists anyway, upgrading the spec does not mean everyone have to upgrade their models, a deprecation warning would be enough.
As far as I can see, the only two unclear points are: Is it allowed to repeat axes letters (or what we will call role or similar in the new layoy)? Is it allowed for just certain letters like i? Is it allowed to add new letters that are not part of the spec? (If so this should be discouraged). I would vote in favor implementing the new field and then discussing 1. and 2. separately.
Sounds good to me.
Once the automatic packaging is up to speed, we can just "repackage" the current models and have the automatic conversion change the version number and in this case, axes content (minor or major, I don't mind either). So we wouldn't have to work with two versions (Lets avoid a python2 vs python3 type of situation at all cost^^). In the worst case someone would have to redownload the model to get the up to date version.
I would vote in favor implementing the new field and then discussing 1. and 2. separately.
I think this is a great idea. We already made a huge step forward on this axes discussion. A future change to duplicate an axis key or adding another one will be comparatively small change, so maybe we just move forward with what we definitely agree on for now...
Finalizing the 0.4.0
:
# example with bug table
outputs:
- name: bug_table
shape: [1, 5, null, 2]
# test_tensor: <uri> # ida for next version: move this mandatory test field here (instead of `test_inputs/outputs`)!
axes: # now called 'axes'
- role: batch # one of ["batch", "index", "time", "channel", "z", "y", "x"]
# all other fields are invalid for "batch"
- role: time
label: "space time" # but may not be a list! (same for zyx and index)
step: 10 # optional; default 1
unit: ms # optional; default: null; free text, but recommended standard from documented list, e.g. https://www.maplesoft.com/support/help/Maple/view.aspx?path=Units/SI
- role: x
step: 200 # default: 1
unit: nm #
- role: index
label: bug
# step: <here invalid> # forbidden for role "index"
# unit: <here invalid> # forbidden for role "index"
- role: channel
description: "features for each detected bug, including size and age" # optional; char limit: 128
# type: "nominal" # mandatory non-default (guess for old models: "nominal") => discuss addition of 'types' separately...
label: ["size", "age"] # optional; if list, then needs to match shape (and unit if specified);
unit: ["meter", "year"] # optional; if list, then needs to match shape (and label if specified); we have really big and old bugs!
# step: <here invalid> # must not be specified if label or unit is a list
- name: bug_table2
...
Just to summarize the result on the meeting about this today:
i
to our old-style axes
and bump the PATCH, see #75.axes
, summarized in the comment above.role
s? Only for the channel role?test_input/output
to the axes
and call it test_tensor
(or similar)type
with options nominal
or ordinal
(which would for example constraint the label and unit fields).@FynnBe What's the status with the axes
spec?
For now I'm working on finalizing the 0.3.2 as we decided to implement the axes changes in 0.4.0...
suggestions for preliminary decisions:
afterwards we can discuss the parts that are still controversial or unclear:
- do we allow duplication of roles? Only for the channel role?
no. currently we don't have use-case for this and it is easy to add later (but not easy to remove again)
- do we move test_input/output to the axes and call it test_tensor (or similar)
yes. this change is autoconvertible (can be undone in any future version) and just makes a lot of sense!
edit: we should move in sample_input
/sample_output
too then
- do we introduce the type with options nominal or ordinal (which would for example constraint the label and unit fields).
we add it as an optional field for now. this way we don't have more work converting models, but can play around with the idea more and specify it for new models which then don't need manual conversion if we make type
mandatory in the future.
Seems to me that breaking shape down per axis would make sense as well.
We could move shape
into axes
as length
(or size
). default could be null
. For the channel
role it could be inferred if labels/units are given. for some roles we might allow the min
/step
combination...
we could also make it mandatory for some roles (probably should for time, xyz; c if labels/units is not a list).
edit: this would also allow different reference tensors for each dimension
edit2: halo
should be moved in as well then
Question:
We use the axes characters bcitxyz
in many places (e.g. also in the pre-/postprocessing kargs).
Should we generally accept b
as an alias for batch
, c
for channel
, etc..?
Or do we from now on require to specify the axes as a list of strings with the full role name?
Question 2:
Are there any use-cases where step
make sense for a channel axis?
I am confused about the label/unit of the actual values vs the label/unit of the channel itself. All other axis exclusively describe the axis and do not inform about the values. This is different for the channel axis, like in the example above:
label: ["size", "age"] # optional; if list, then needs to match shape (and unit if specified);
unit: ["meter", "year"]
here the recorded values are in meter and in years. a step from the 'size' channel to the 'age' channel does not make sense.
For the example of multi-channel microscopy I would expect something like this:
label: [500nm, 650nm] # the wavelengths used for each channel
unit: lx*s # unit of the recorded values (here: lux seconds to record exposure)
which would be equivalent to
label: [500nm, 650nm] # the wavelengths used for each channel
unit: [lx*s, lx*s] # unit of the recorded values (here: lux seconds to record exposure)
This is now mixing a label per channel with the unit of the recorded values.
Using the channel
axis like any other axis would look like this for example:
label: wavelength
unit: nm
step: 100
this would mean each channel is 100nm apart from the next. This would further require an offset... This does not make sense! Another problem here is that we then do not have a field to describe the unit of the recorded values.
I think what we had in mind so far is this 'mixed' use above. labels refer to the channels like for any other axis role, but unit always refers to the recorded values. If there is a single label for multiple channels e.g. 'rgb', the channels are implicitly named with an index, e.g. rgb[0], rgb[1], rgb[2]. (although the rgb example would be nicer as a list: ['r', 'g', 'b'])
Problem: for a single implicit channel there is no way of specifying the unit -> solution: make an explicit singleton axis to describe the channel
If we always want this mixed use then step
never makes sense for the 'channel' axis...
edit: in short: if the channel axis is nominal there is no need for step/unit and if it is ordinal then step and unit are included in the label already and 'unit' refers to the recorded values.
maybe for the 'channel' axis we should rename unit
to value_unit
to make the distinction clearer. (and then forbid a channel axis to have unit
and step
). And if we come up with use-cases that require 'channel' to have unit/step we can add it without clashing with the value unit...
(I'm assuming this is a reasonable solution for now...)
My 2cents:
Polling does not seem to work in the PR... Maybe here we can vote on the proposal in https://github.com/bioimage-io/spec-bioimage-io/pull/69#issuecomment-822457618 and discuss any changes of this proposal specifically. Please open separate issues for completely different proposals.
edited: poll options were too long and cut off