Would specifying shape just be a redundant way of specifying size (+ some informative message if the shape didn't make sense)?
`shape` and `dims` are fully redundant to each other. `shape` and `size` are only directly comparable given the other parameters, such as `cov=eye(3)`.

Because `shape` ≡ `dims`, and I think we all agree that `dims` was a great improvement to our API, I don't think we can/need/should make the internal logic go away. However, we might still want to nudge everybody away from it.

One idea that circulated in our chat discussion was to slightly change what `shape` does in order to make it a new feature:
If distribution parameters are symbolic, it's not immediately clear what the RV looks like:

```python
x = MvNormal(cov=cov_prior)
```

Specifying `shape` makes the model code easier to read, but makes it very rigid w.r.t. out-of-sample prediction:

```python
x = MvNormal(cov=cov_prior, shape=(3,))  # can no longer be broadcast into a different size
```

It's the same as:

```python
x = MvNormal(cov=cov_prior)
x = aesara.tensor.specify_shape(x, (3,))  # this Op needs to become a function output to be effective!
```
We could implement `shape` such that the `SpecifyShape` `Op` is automatically added, creating a more shape-rigid model. The plus point then is that inputs to `specify_shape` could even be symbolic, creating asserts such as: "This RV's 2nd dimension must be the same size as that dim from the other RV."
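A minimal sketch of that idea, assuming `specify_shape` accepts a symbolic shape as suggested above (illustrative only, not an actual PyMC implementation):

```python
import aesara
import aesara.tensor as at

x = at.vector("x")
y = at.vector("y")

# Tie x's length to y's length symbolically; the assertion is only
# effective if the SpecifyShape output (not the raw x) is used downstream.
x_checked = at.specify_shape(x, y.shape)

f = aesara.function([x, y], x_checked * 2)
f([1.0, 2.0], [3.0, 4.0])  # OK: lengths match
# f([1.0, 2.0], [3.0])     # errors at runtime: shape assertion fails
```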
I like the idea of distinguishing batch from event dimensions. This is currently one of the main pain points (for me) with ArviZ converters. The variables in posterior predictive, observed data, and log likelihood have the same shape for univariate distributions, but for multivariate distributions the log likelihood has only a subset of that shape.

ArviZ currently does a quick and dirty check when retrieving the log likelihood (for all converters, as AFAIK none of the supported libraries properly distinguish between these): when a shape mismatch is found, it drops the last dimensions to see if this resolves it.

The example above would therefore break ArviZ when using custom dims, because the event dim is not the last. The following, however,
```python
coords = dict(town=ABC, year=[2020, 2021])

X = MvNormal(cov=eye(3), shape=(2, 3))
Y = MvNormal(cov=eye(3), dims=("year", "town"))
Z = MvNormal(cov=eye(3), size=2)
```
would work. If the variable `Y` were observed, it would currently error out at the idata conversion step due to the shape specification.

I don't really know if this would help (I guess we can internally make sure the event dims come last when using `size`), nor how to make it better; I just wanted to share this, as it is a new feature with close to no docs and literally no examples yet. Possible improvements on that front for the v4 converter would be great.
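For reference, a simplified sketch of the shape-mismatch fallback described above (hypothetical pseudocode, not the actual ArviZ source):

```python
def drop_trailing_dims(log_lik_shape, posterior_shape):
    """Drop trailing dims from the posterior shape until it matches the
    log-likelihood shape (the quick-and-dirty check described above)."""
    shape = tuple(posterior_shape)
    while shape and shape != tuple(log_lik_shape):
        shape = shape[:-1]
    return shape

drop_trailing_dims((2,), (2, 3))  # -> (2,): the event dim is assumed to be last
```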
Shape was definitely one of those "invisible" features that took me some time to figure out. Looking at questions on the discourse, it seems to crop up frequently. The other thing that seems to confuse people (although it's just theano borrowing from the numpy API) is fancy indexing:

```python
indexes = np.array([0, 1, 2, 3, 0, 1, 2, 3])
a = pm.Normal('a', 0, 1, shape=4)
like = pm.Normal('like', a[indexes], 1, observed=data)
```

The new coords seem even more enigmatic to me. It would be nice to have a doc notebook just walking through the different shape/size and indexing possibilities in pymc3. I know we have some examples, but I think (without proof) that they tend to be difficult to find.
Edit: This was brought up here https://github.com/pymc-devs/pymc3/issues/4007
> Shapes are some of the hardest things to get correctly and I'm very concerned that users might shy away from `v4` because of that.

In `v4` users are no longer required to determine the resulting shapes of the objects they create. In `v3` a user is forced to do this themselves, and, as we all know, it's confusing and leads to numerous errors. Those are actual reasons for users to shy away from PyMC. I don't understand how asking a user to do less is off-putting.

Also, I don't see any real advantages to porting the `shape` keyword—aside from backward compatibility, but, since we're explicitly creating a new version, backward compatibility breaks like this are completely justified.

More importantly, the `shape` keyword has a few very real disadvantages, and we've been dealing with them for quite a while now (e.g. it's confusing, error prone, limits the flexibility and reuse of our models, etc.), so I'm very much against propagating this key issue into the next version.

As @michaelosthege mentioned, if one wants to constrain the shape of a variable, they can use `specify_shape`, and we could implement a `shape` keyword option using that; however, a user that very specifically wants to restrict the shape of a random variable can just as easily apply `specify_shape` after creating said variable.

The problem I see with advertising a `shape` keyword in `v4` is that `v3` users who have learned this bad habit will likely end up doing it in `v4`, giving rise to all those old issues and limitations. This is also likely to happen when new users attempt to recreate models written for `v3`.

This `shape` keyword is both the interface to—and cause of—one of the biggest problems we're fixing in `v4`, so it's the last thing we want to unnecessarily shoehorn back into it.
I suppose I would need to be sold with examples where size is superior. If we can market this as a big improvement it's a different story.
> I suppose I would need to be sold with examples where size is superior.

Here's a very simple example that works in my local #4551 branch of `v4`:
```python
import numpy as np
import pymc3 as pm
import aesara
import aesara.tensor as aet

data = np.random.normal(size=(1, 3))
X_at = aesara.shared(data)

with pm.Model() as m:
    beta = pm.MvNormal("beta", mu=aet.zeros((X_at.shape[1],)), cov=aet.eye(X_at.shape[1]))
    Y = pm.MvNormal("Y", mu=X_at.dot(beta), cov=aet.eye(X_at.shape[0]), size=(2,))

with m:
    priorp_samples = pm.sample_prior_predictive(samples=10)
```

```python
>>> priorp_samples["beta"].shape
(10, 3)
>>> priorp_samples["Y"].shape
(10, 2, 1)
```

```python
new_data = np.random.normal(size=(2, 4))
X_at.set_value(new_data)

with m:
    priorp_samples = pm.sample_prior_predictive(samples=10)
```

```python
>>> priorp_samples["beta"].shape
(10, 4)
>>> priorp_samples["Y"].shape
(10, 2, 2)
```
This would not even be remotely possible if the shapes of `beta` and/or `Y` were fixed via a `shape` keyword.

Otherwise, I invite everyone to imagine how this implementation would change if symbolic values were to be used in the `shape` value. I guarantee that the change wouldn't be an improvement in any way.
OK that's awesome, I get it now. Shapes are automatically propagated through.
The fact that this use case is now possible is awesome! At the same time, it's not something we need from every model. In 95 % of the models I write, I'm more concerned with the inference on the data I have, not the out-of-sample prediction using shared variables.

For inference the `dims` kwarg is incredibly useful. It just happens that it's quite similar to `shape`. Our discussion should be titled "dims and size", not "shape vs size".
> In 95 % of the models I write, I'm more concerned with the inference on the data I have, not the out-of-sample prediction using shared variables.

Don't forget that, because we can do this shared variable thing, it also means that people who run the same model on many different datasets will no longer need to recreate their `Model`s and recompile their respective log-likelihood graphs to C/whatever else.
> Don't forget that, because we can do this shared variable thing, it also means that people who run the same model on many different datasets will no longer need to recreate their `Model`s and recompile their respective log-likelihood graphs to C/whatever else.

Sure, that's great. Let's figure out how to make `dims` work with that.
- `dims` are already sort of "symbolic". They are placeholders. Can we translate them into symbolic elements for `size`?
- What about `dims` that refer to implied dimensions? (That's actually very common, e.g. https://github.com/rtcovidlive/rtlive-global/blob/master/rtlive/model.py#L168-L176)
- What about `coords` when the value of a shared variable is changed? Can we pass new values for the coords via something like `pm.Data().set_value(val, coords=...)`?

From a naive/conceptual point of view, I see `dims` fitting better the `size` approach. Currently `dims` takes the info from `coords` to get the length of the dimension, but I actually don't think this is ideal.
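For reference, a minimal sketch of that current behavior, where `dims` resolves its length through the model's `coords` (the coordinate labels here are made up):

```python
import pymc3 as pm

coords = {"town": ["A", "B", "C"]}
with pm.Model(coords=coords) as model:
    # The length of x (3) is taken from coords["town"], not given explicitly.
    x = pm.Normal("x", 0, 1, dims="town")
```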
Whenever I have to do cross-validation, I often use `shape` to define the model, and then when converting to InferenceData I use `dims` but no `coords`. I really don't care about coordinate values, and having to add extra code to sort out coord values at model creation is work I don't really need and won't use, whereas having the correct dims in place helps with automatic alignment and broadcasting when postprocessing with xarray.

It would be great to pair named dimensions with the size argument and then define the coordinate values at sampling/conversion time, which is what feels more natural. In this cross-validation/out-of-sample sampling scenario, the dimension itself generally is part of the model, and I think it would be useful to still allow having this info there, like we do now with dims. The coordinate values, however, are not part of the model, and I think they would be better off tied not to a model but to data/sampling realizations.
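A sketch of that workflow as it exists today with the `arviz.from_pymc3` converter; `trace` is assumed to come from `pm.sample()`, and the dimension names are illustrative:

```python
import arviz as az

# Dimension names are supplied only at conversion time; coordinate values
# default to integer indexes, which is fine for cross-validation folds.
idata = az.from_pymc3(
    trace,
    dims={"Y": ["experiment", "dose", "subject"]},
)
```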
The benefit I see is the shape propagation, not necessarily that we can switch things out for prediction. With `shape` or `dims` I always have to specify the full shape of every tensor, while with `size` I just have to remember the dimension I want to batch over, which is much simpler. I think `dims` could be changed in this way, so that I'm just specifying which dimension I want to batch over. I changed the example below: we have two experiments/replications on a drug with different dose levels administered to multiple subjects.
```python
import numpy as np
import pymc3 as pm
import aesara
import aesara.tensor as aet

n_doses = 3
n_subjects = 5
data = np.random.normal(size=(n_doses, n_subjects))
X_at = aesara.shared(data)

with pm.Model() as m:
    # beta needs length n_subjects for X_at.dot(beta) to be defined
    # (the original sketch used n_doses here and an undefined n_states below)
    beta = pm.MvNormal(
        "beta",
        mu=aet.zeros((n_subjects,)),
        cov=aet.eye(n_subjects))
    Y = pm.MvNormal(
        "Y",
        mu=X_at.dot(beta),  # shape (n_doses,)
        cov=aet.eye(n_doses),
        dims=("experiments",))
```
Without this, the likelihood would have to look like:

```python
Y = pm.MvNormal(
    "Y",
    mu=X_at.dot(beta),
    cov=aet.eye(n_doses),  # "n_states" in the original; presumably n_doses
    dims=("experiments", "doses", "subjects"))
```
Maybe we want to rename `dim` in the above case to `plate_dim`, `batch_dim`, or `rep_dim` to avoid confusion.
I don't see how this reduces confusion. On the contrary, I find `plate_dim`/`batch_dim`/`rep_dim`, and yes, also `size`, more confusing than either `shape` or `dim`.

There are tons of cases where you need to know the actual `shape`, or at least the order in which the `dims` appear:

- Operations involving multiple n-dimensional tensors (`*`, `+`, `dot`, ...)
- Plotting a plate-model (`pm.model_to_graphviz`)
- Creating and working with traces/`InferenceData`

...just to name a few.

This whole shape propagation/broadcasting is nice if you need it and if you know how it works. For the other 90 % of situations it's immensely important that model code is expressive and doesn't require you to evaluate in-line mathematical expressions such as `X_at.dot(beta)` to figure out what the `dims` of a variable are.
> Without this, the likelihood would have to look like:
> `Y = pm.MvNormal("Y", mu=X_at.dot(beta), cov=aet.eye(n_doses), dims=("experiments", "doses", "subjects"))`

I think this RV is way easier to work with. This way I won't be surprised that `idata.posterior.Y.ndim == 3`, and I can select `idata.posterior.Y.sel(doses=5)` and so on.
`dims` and `coords` are the most useful things we added in recent releases. Let's not vandalize that in favor of edge cases.

Having `shape` and `size` in addition to `dims` is incredibly confusing. I can see the argument for one or the other, but not both. If we move to `size`, then `shape` ought to be deprecated and removed.
> There are tons of cases where you need to know the actual `shape`, or at least the order in which the `dims` appear:

It sounds like you're confusing shape with dimensions. In Theano/Aesara the number and (operation-induced) order of dimensions cannot be symbolic, so, when it comes to dimensions, nothing changes. If one wants to provide labels for the dimensions, they can; the use of a `size` or `shape` parameter has no effect on this.
> - Operations involving multiple n-dimensional tensors (`*`, `+`, `dot`, ...)

These operations do not require knowledge of the objects' exact shapes, and this is evidenced by the example I gave above, which uses these exact operations (both implicitly and explicitly).
> - Plotting a plate-model (`pm.model_to_graphviz`)

Anything that involves plotting is something that exists outside of the PPL, and—thus—outside of the scope/relevance of symbolic shape considerations. In other words, if one is at the point of plotting something, then they necessarily have a trace with concrete non-symbolic shapes, so this `shape` vs. `size` subject is no longer relevant.
> - Creating and working with traces/`InferenceData`

Again, if one has a trace, then all the shapes are already worked out and known, so this discussion isn't relevant.
> This whole shape propagation/broadcasting is nice if you need it and if you know how it works.

The point of the change from `shape` to `size` is that a user doesn't need to know how any of that works, so we're not going to demand that users specify the entire shape. Really, it's the `shape` parameter that forces users to know how shape propagation and broadcasting work, because that's exactly what they need to do in order to come up with a valid `shape` value.

Worse yet, the shape propagation and broadcasting is always needed and used. The difference is that under `size` we use it to the user's advantage, and under `shape` we don't (while also forcing models to be unnecessarily shape-specific).
> For the other 90 % of situations it's immensely important that model code is expressive and doesn't require you to evaluate in-line mathematical expressions such as X_at.dot(beta) to figure out what the dims of a variable are.

What is this 90%? Why would one need to evaluate an expression to figure out the dimensions? If one can specify an exact shape—for the `shape` parameter—in the first place, wouldn't they have all the information they need about these dimensions? Are you implying that one would need to evaluate something in order to use a `size` parameter?

Also, there's nothing expressive about the use of a `shape` parameter. At best, it's simply more explicit, but it's also unnecessary, overly verbose, and objectively restrictive and error prone. In contrast to your statements above, the `shape` parameter has always been the source of many errors, because it forces the users to know and perform the shape propagation and broadcasting themselves.
> I think this RV is way easier to work with. This way I won't be surprised that `idata.posterior.Y.ndim == 3`, and I can select `idata.posterior.Y.sel(doses=5)` and so on.
>
> `dims` and `coords` are the most useful things we added in recent releases. Let's not vandalize that in favor of edge cases.

Once again, these statements are conflating the DSL/PPL aspects of PyMC3 with the data it produces. `idata` is a trace containing samples from a model that was already compiled, so it has nothing to do with `size` vs. `shape` parameters.

If one wants to add dimension names to a random variable, that can be done at any point, and is not affected by manually vs. automatically computed shapes. As I stated before, the number of dimensions cannot be changed in Theano/Aesara; the only thing that is automatically computed is the length/size of each dimension. If one can specify a valid `shape` parameter, then they already know how many dimensions there are and what each dimension is.
At a point, arguing for this `shape` parameter is the same as saying "Users shouldn't specify log-likelihood graphs; they should provide all the log-likelihood values".
We obviously have very different opinions about `shape`, `size`, and `dims`. I am personally not particularly attached to `shape`, but I believe that making the `v3`→`v4` transition as smooth as possible should be our topmost priority. The faster users adopt `v4`, the better for the entire project. We can continue to nudge them towards different APIs throughout the `v4.x` versions.
> What is this 90%? Why would one need to evaluate an expression to figure out the dimensions? If one can specify an exact shape—for the shape parameter—in the first place, wouldn't they have all the information they need about these dimensions?

Knowing the `dims` of a model variable while writing the model is one thing, but days or weeks later, or when looking at someone else's model code, nobody wants to evaluate the chain of expressions in their head before they can tell the `dims` of a model variable!

→ Passing `shape=` is the old, dumb way of dealing with this problem.
→ Writing comments is the easy way, but we all know that code & comments diverge over time.
→ Adding `SpecifyShape` `Op`s is overly verbose and unnecessarily restricts the model.
→ Passing `dims=` is the more flexible and elegant way to do it.

The key point being: Verbosity is often a good thing! Verbosity should not be required, but it should seamlessly integrate with the API.
Now here's maybe a way out of this dilemma. What if we start using `Ellipsis` to substitute implied dimensions:

```python
pm.MvNormal(mu=[1, 2, 3], cov=np.eye(3), shape=(2, 3))
# is the same as
pm.MvNormal(mu=[1, 2, 3], cov=np.eye(3), shape=(2, ...))
# is the same as
pm.MvNormal(mu=[1, 2, 3], cov=np.eye(3), size=2)
```
The same with `dims`:

```python
coords = {
    "year": [2020, 2021],
    "drink": ["Water", "Orange Juice", "Apfelschorle"],
}

pm.MvNormal(mu=[1, 2, 3], cov=np.eye(3), dims=("year", "drink"))  # explicit, verbose and shape-flexible!
# is the same as
pm.MvNormal(mu=[1, 2, 3], cov=np.eye(3), dims=("year", ...))  # implicit and ndim-flexible
```
> Now here's maybe a way out of this dilemma. What if we start using `Ellipsis` to substitute implied dimensions:

If that helps, we could do it.

Remember, we can also strip those "base" dimensions from a user-specified `shape`, since we'll be able to tell how many dimensions are implied by the distribution parameters. In effect, we would be extracting the `size` parameter from a `shape` and ignoring the redundant/unused parts. Naturally, we would need to emit a warning, telling the user that we've discarded the rest.
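A hypothetical helper sketching that extraction (names and the warning text are illustrative, not actual PyMC API):

```python
import warnings

def shape_to_size(shape, ndim_supp):
    """Extract a `size` from a user-specified `shape`.

    `ndim_supp` is the number of "base" (event) dimensions implied by the
    distribution parameters, e.g. 1 for MvNormal.
    """
    if shape and shape[-1] is Ellipsis:
        # shape=(2, ...) explicitly delegates the base dims: size=(2,)
        return tuple(shape[:-1])
    warnings.warn("Trailing base dimensions of `shape` are ignored; use `size`.")
    return tuple(shape[:len(shape) - ndim_supp])

assert shape_to_size((2, Ellipsis), 1) == (2,)
assert shape_to_size((2, 3), 1) == (2,)  # the trailing 3 is discarded with a warning
```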
> The faster users adopt v4, the better for the entire project. We can continue to nudge them towards different APIs throughout v4.x versions.

I agree with the first point, but not so much with the second. We have an opportunity to make a clean slate and break virtually anything, and I think we should make the most of this, for two reasons. First, it is less work for us: starting the new major version with multiple alternative ways of doing things, and having to handle deprecations from the very beginning, seems like extra and unnecessary work to me. Second, I think users would rather have things break once and then last for multiple versions than have to constantly pay attention to the deprecations and API changes between minor versions.
> The key point being: Verbosity is often a good thing! Verbosity should not be required, but it should seamlessly integrate with the API.

This would be amazing. I like the ellipsis option with dims; however, I don't think it would "keep" the dimensions from the variable that is actually defining them when broadcasting, so we'll have to take that into account when converting to ArviZ, otherwise ArviZ will see arrays with n dims and only m names for its dims and error out.
> We have an opportunity to make a clean slate and break virtually anything, and I think we should make the most of this for two reasons.

You're right. What I meant is more along the lines of doing `warn("This API will stop working in 4.1. Do this instead: ...")` or even `raise` with the same message.

For many of these (new) APIs I don't trust us to hit the nail on the head on the first try. With `warn("...will stop working in 4.1...")` we can still roll back our decision if we need to.
[...] "keep" the dimensions from the variable who is actually defining them when broadcasting [...]
Yes, with explicit dims=("year", "drink")
that's easier. We'll need some logic to figure out the dim names from the RV inputs, so we can autocomplete the ...
.
I guess in model.add_random_variable
we should the Ellipses
with implied dims and then assert var.ndim == len(dims)
.
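A hypothetical sketch of that logic (not actual PyMC internals; the fallback naming scheme is made up):

```python
def resolve_dims(dims, var):
    """Expand a trailing Ellipsis in `dims` using the RV's implied dimensions."""
    if dims and dims[-1] is Ellipsis:
        n_implied = var.ndim - (len(dims) - 1)
        dims = tuple(dims[:-1]) + tuple(
            f"{var.name}_dim_{i}" for i in range(n_implied)
        )
    assert var.ndim == len(dims)
    return dims
```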
One option (also looking at #4565) could be to add a `dims` property to the `RandomVariable` (or even deeper in Aesara?) that is either supplied upon instantiation, or inferred when accessed (kind of like `infer_shape`). Then we wouldn't have to deal with named `dims` in the PyMC3 model at all.
This could (with the "could" being quite the supposition) be solved by using xarray as a backend; otherwise I fear we will end up rebuilding xarray from scratch on Aesara. cc @ColCarroll @canyon289, as they are listed as possible mentors on that project and may have better ideas on this end.

If Aesara were a dynamic computation library, it would be quite easy to get xarray to work with it. xarray doesn't really do any computation; it only handles dims and coords and aligns the underlying arrays. Everything else works as long as the underlying data structure implements `__array_function__` and `__array_ufunc__`.

For static computation I think it won't be as straightforward, but it could be possible anyway, à la having `.compute()`, like we do with dask-backed xarray objects, call `.eval()` in Aesara or something like that.
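For illustration, the lazy-evaluation side of that analogy already works in plain Aesara today:

```python
import aesara.tensor as at

x = at.vector("x")
y = (x ** 2).sum()       # builds a static graph; nothing is computed yet
y.eval({x: [1.0, 2.0]})  # returns 5.0, evaluated on demand, like dask's .compute()
```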
Seems we all agree that `size` is better and should ultimately deprecate `shape`. I think @brandonwillard's suggestion is the least disruptive one for users, and we can phase out `shape` gently:

> Remember, we can also strip those "base" dimensions from a user-specified shape, since we'll be able to tell how many dimensions are implied by the distribution parameters. In effect, we would be extracting the size parameter from a shape and ignoring the redundant/unused parts. Naturally, we would need to emit a warning, telling the user that we've discarded the rest.

Re the `size` naming, I am extremely torn between "Explicit is better than implicit" (we should name it something like `batch_shape` or `plate_shape`) and "Simple is better than complex" (`size` is simple and consistent with numpy). Right now I'm at 51% in favor of `size`.
I don't think there is agreement here at all that size is better than shape. @aseyboldt, Michael, and I prefer the explicit shape.

Personally I like the approach with Ellipses best, where we can have optional shape propagation but the default stays explicit, backwards compatible, and compatible with dims.
I hope it's clear to everyone that the ellipses approach is just a longer version of `size` (e.g. `shape=(2, 3, Ellipsis)` is equivalent to `size=(2, 3)`).
Yes, that's what I like about it. In most cases I think users will find shape/dims easier. Advanced users can use Ellipses. That way we maintain backwards compatibility but have a cool new feature, rather than explaining to users a subtle difference in how we now handle shapes.
This is what we agreed on after discussing in our last lab meeting:

- We will use `Ellipses` to allow for this new functionality
- We will implement shape and dims in a backwards compatible way in v4
I thought that was `dims` only :thinking: I think shape is much more limiting than dims, and having both size and shape will be even more confusing than breaking some code and having users update to size or dims.
Don't we translate `dims` to `shape` internally?
I think I got lost during the meeting, nevermind. I am probably too xarray biased for my own good.
So to clarify:

1. `size` will not be available to users, only `shape`, and to use size one should use `shape=(2, 3, Ellipsis)`.
2. `dims` will convert to `shape` and then to `size`?
3. Will we discourage the use of `shape` without ellipsis?
4. Will dims continue to rely on `coords` defined at model creation? Will changing such coords afterwards (even after having already sampled) revert back to the actual shapes of the variables in the model?
5. Can someone suggest what they expect ArviZ to do when ellipses are present?

> I think I got lost during the meeting, nevermind. I am probably too xarray biased for my own good.
Here's how I see it.

> 1. `size` will not be available to users, only `shape`, and to use size one should use `shape=(2, 3, Ellipsis)`.

I think that's the best option, just to remove confusion and too many options for doing the same thing, although we'd probably be artificially removing that capability. It's a good question.

> 2. `dims` will convert to `shape` and then to `size`?

I think yes.

> 3. Will we discourage the use of `shape` without ellipsis?

No, `Ellipses` is an advanced feature for those who know it and need it. We mostly think that explicit shape specification is better.

> 4. Will dims continue to rely on `coords` defined at model creation? Will changing such coords afterwards (even after having already sampled) revert back to the actual shapes of the variables in the model?

I assume that we currently look the shapes up during model definition, right? Ideally we can make that dynamic then, but I'm not sure how easy that would be. Ideally we move the whole names business into Aesara eventually: https://github.com/pymc-devs/aesara/issues/352

> 5. Can someone suggest what they expect ArviZ to do when ellipses are present?

That could be tricky. Ideally we somehow infer this during model spec, but I'm not sure how easy that would be. I'm not sure how it can be solved :-/
closed by #4696
@brandonwillard is proposing to remove the current `shape` kwarg and replace it with `size`, which works more similarly to the way `numpy` handles it. The difference is that `size` only specifies the batch dimension, while the event dimension is taken from the distribution. For univariate distributions there is no difference, but for multivariate ones there would be. All three would lead to the same shaped `(2, 3)` MvNormal.
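(The code examples from the original issue body did not survive extraction; they were presumably along the lines of the equivalent specifications quoted earlier in the thread. This is a reconstruction, not the original code:)

```python
coords = {"year": [2020, 2021], "town": ["A", "B", "C"]}

X = pm.MvNormal("X", mu=np.zeros(3), cov=np.eye(3), shape=(2, 3))
Y = pm.MvNormal("Y", mu=np.zeros(3), cov=np.eye(3), dims=("year", "town"))
Z = pm.MvNormal("Z", mu=np.zeros(3), cov=np.eye(3), size=2)
```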
From @brandonwillard: "in v4, all the broadcasting and shape calculations for the arguments are already worked out, so we always know what the shape of a single sample is under the given parameters. size just says that you want size-many more of those samples. As a matter of fact, size can be symbolic in v4!"
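A small sketch of what a symbolic `size` enables at the Aesara level, assuming the `aesara.tensor.random` machinery from the v4 work (illustrative only):

```python
import aesara
import aesara.tensor as at
from aesara.tensor.random.basic import normal

n = at.iscalar("n")          # symbolic batch size
x = normal(0, 1, size=(n,))  # the sample shape is decided at call time

f = aesara.function([n], x)
f(3).shape  # (3,)
f(7).shape  # (7,)
```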
The big downside is that we're breaking backwards compatibility in a major way. Shapes are some of the hardest things to get correctly, and I'm very concerned that users might shy away from `v4` because of that.

For now, @brandonwillard, @michaelosthege and I agree that we should definitely start by keeping `shape` for compatibility. The question is whether we should ever remove it completely, which is where I'm having concerns. I talked to @junpenglao, who also shared these concerns.