pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Implement shape and dims for v4 #4552

Closed · twiecki closed this issue 3 years ago

twiecki commented 3 years ago

@brandonwillard is proposing to remove the current shape kwarg and replace it with size, which works more like the way NumPy handles it. The difference is that size only specifies the batch dimensions, while the event dimensions are taken from the distribution. For univariate distributions there is no difference, but for multivariate ones there is.

coords = dict(town=ABC, year=[2020, 2021])
X = MvNormal(cov=eye(3), shape=(2, 3))
Y = MvNormal(cov=eye(3), dims=("town", "year"))
Z = MvNormal(cov=eye(3), size=2)

All three would lead to the same (2, 3)-shaped MvNormal.

From @brandonwillard: "in v4, all the broadcasting and shape calculations for the arguments are already worked out so we always know what the shape of a single sample is under the given parameters. size just says that you want size-many more of those samples. as a matter of fact, size can be symbolic in v4!"
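
As a rough illustration of a symbolic size (a minimal sketch using Aesara's RandomStream utility directly rather than the PyMC API):

import aesara
import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

srng = RandomStream(seed=123)
s = at.lscalar("s")               # the batch size is itself a graph input
x = srng.normal(0, 1, size=(s,))  # draw s-many samples; s stays symbolic

draw = aesara.function([s], x)
assert draw(4).shape == (4,)
assert draw(7).shape == (7,)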

The big downside is that we're breaking backwards compatibility in a major way. Shapes are some of the hardest things to get correctly and I'm very concerned that users might shy away from v4 because of that.

For now, @brandonwillard, @michaelosthege and I agree that we should definitely start by keeping shape for compatibility. The question is whether we should ever remove it completely, which is where I have concerns. I also talked to @junpenglao, who shares these concerns.

ricardoV94 commented 3 years ago

Would specifying shape just be a redundant way of specifying size (+ some informative message if the shape didn't make sense)?

michaelosthege commented 3 years ago

shape and dims are fully redundant with each other.

shape and size are only directly comparable given the other parameters such as cov=eye(3).

Because shape ≡ dims, and I think we all agree that dims was a great improvement to our API, I don't think we can/need/should make the internal logic go away. However, we might still want to nudge everybody away from shape.

One idea that circulated in our chat discussion was to slightly change what shape does in order to make it a new feature:

If distribution parameters are symbolic, it's not immediately clear what the RV looks like:

x = MvNormal(cov=cov_prior)

Specifying shape makes the model code easier to read, but makes it very rigid w.r.t. out-of-sample prediction:

x = MvNormal(cov=cov_prior, shape=(3,))   # can no longer be broadcast to a different size

It's the same as

x = MvNormal(cov=cov_prior)
x = aesara.tensor.specify_shape(x, (3,))    # the SpecifyShape output must be used downstream (e.g. as a function output) to take effect!

We could implement shape such that the SpecifyShape Op is automatically added, creating a more shape-rigid model. The plus point is that the inputs to specify_shape could even be symbolic, allowing asserts such as: "this RV's 2nd dimension must be the same size as that dim of the other RV."
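
For instance, a sketch of such a symbolic cross-variable constraint (illustrative; it assumes specify_shape accepts symbolic shape entries, which it does since SpecifyShape takes the shape as a graph input):

import aesara.tensor as at

other = at.matrix("other")
x = at.vector("x")
# Constrain x's length to match other's first dimension. The constraint is
# symbolic and checked at runtime, but only if this output is actually used
# in the compiled function.
x_checked = at.specify_shape(x, (other.shape[0],))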

OriolAbril commented 3 years ago

I like the idea of distinguishing batch from event dimensions. This is currently one of the main pain points (for me) with ArviZ converters. The variables in the posterior predictive, observed data and log likelihood groups have the same shape for univariate distributions, but for multivariate distributions the log likelihood has only a subset of the shape.

ArviZ currently does a quick and dirty check when retrieving the log likelihood (for all converters; AFAIK none of the supported libraries properly distinguishes these dimensions): when a shape mismatch is found, it drops the last dimensions to see if that resolves it.
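
A rough sketch of that heuristic (illustrative only, not ArviZ's actual code):

def match_loglik_dims(dims, loglik_ndim):
    """Drop trailing (event) dims so the dim names fit the log likelihood.

    `dims` are the names recorded for the full variable; the log likelihood
    of a multivariate distribution has fewer dimensions, so the heuristic
    assumes the extra ones come last.
    """
    return dims[:loglik_ndim]

# e.g. an MvNormal with dims ("year", "town") whose log likelihood is 1-D:
assert match_loglik_dims(("year", "town"), 1) == ("year",)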

The first example above would therefore break ArviZ when using custom dims, because the event dim is not the last one, whereas

coords = dict(town=ABC, year=[2020, 2021])
X = MvNormal(cov=eye(3), shape=(2, 3))
Y = MvNormal(cov=eye(3), dims=("year", "town"))
Z = MvNormal(cov=eye(3), size=2)

would work. If the Y from the first example (whose event dim is not last) were observed, it would currently error out at the idata conversion step due to the shape specification.

I don't really know if this would help (I guess we can internally make sure the event dims come last when using size), nor how to make it better; I just wanted to share this, as it is a new feature with close to no docs and literally no examples yet. Possible improvements on this front for the v4 converter would be great.

ricardoV94 commented 3 years ago

Shape was definitely one of those "invisible" features that took me some time to figure out. Looking at questions on Discourse, it seems to crop up frequently. The other thing that seems to confuse people (although it's just Theano borrowing from the NumPy API) is fancy indexing:

indexes = np.array([0, 1, 2, 3, 0, 1, 2, 3])
a = pm.Normal('a', 0, 1, shape=4)
like = pm.Normal('like', a[indexes], 1, observed=data)

The new coords seem even more enigmatic to me. It would be nice to have a doc notebook just walking through the different shape/size and indexing possibilities in pymc3. I know we have some examples, but I think (without proof) that they tend to be difficult to find.

Edit: This was brought up here https://github.com/pymc-devs/pymc3/issues/4007

brandonwillard commented 3 years ago

Shapes are some of the hardest things to get correctly and I'm very concerned that users might shy away from v4 because of that.

In v4 users are no longer required to determine the resulting shapes of the objects they create. In v3 a user is forced to do this themselves, and, as we all know, it's confusing and leads to numerous errors. Those are actual reasons for users to shy away from PyMC. I don't understand how asking a user to do less is off-putting.

Also, I don't see any real advantages to porting the shape keyword—aside from backward compatibility, but, since we're explicitly creating a new version, backward compatibility breaks like this are completely justified.

More importantly, the shape keyword has a few very real disadvantages, and we've been dealing with them for quite a while now (e.g. it's confusing, error prone, limits the flexibility and reuse of our models, etc.), so I'm very much against propagating this key issue into the next version.

As @michaelosthege mentioned, if one wants to constrain the shape of a variable, they can use specify_shape, and we could implement a shape keyword option using that; however, a user that very specifically wants to restrict the shape of a random variable can just as easily apply specify_shape after creating said variable.

The problem I see with advertising a shape keyword in v4 is that v3 users who have learned this bad habit will likely end up doing it in v4, giving rise to all those old issues and limitations. This is also likely to happen when new users attempt to recreate models written for v3.

This shape keyword is both the interface to—and cause of—one of the biggest problems we're fixing in v4, so it's the last thing we want to unnecessarily shoehorn back into it.

twiecki commented 3 years ago

I suppose I would need to be sold with examples where size is superior. If we can market this as a big improvement it's a different story.


brandonwillard commented 3 years ago

I suppose I would need to be sold with examples where size is superior.

Here's a very simple example that works in my local #4551 branch of v4:

import numpy as np
import pymc3 as pm

import aesara
import aesara.tensor as aet

data = np.random.normal(size=(1, 3))
X_at = aesara.shared(data)

with pm.Model() as m:
    beta = pm.MvNormal("beta", mu=aet.zeros((X_at.shape[1],)), cov=aet.eye(X_at.shape[1]))
    Y = pm.MvNormal("Y", mu=X_at.dot(beta), cov=aet.eye(X_at.shape[0]), size=(2,))

with m:
    priorp_samples = pm.sample_prior_predictive(samples=10)

>>> priorp_samples["beta"].shape
(10, 3)
>>> priorp_samples["Y"].shape
(10, 2, 1)

new_data = np.random.normal(size=(2, 4))
X_at.set_value(new_data)

with m:
    priorp_samples = pm.sample_prior_predictive(samples=10)

>>> priorp_samples["beta"].shape
(10, 4)
>>> priorp_samples["Y"].shape
(10, 2, 2)

This would not even be remotely possible if the shapes of beta and/or Y were fixed via a shape keyword.

Otherwise, I invite everyone to imagine how this implementation would change if symbolic values were to be used in the shape value. I guarantee that the change wouldn't be an improvement in any way.

twiecki commented 3 years ago

OK that's awesome, I get it now. Shapes are automatically propagated through.


michaelosthege commented 3 years ago

The fact that this use case is now possible is awesome! At the same time, it's not something we need from every model.

In 95% of the models I write, I'm more concerned with inference on the data I have than with out-of-sample prediction using shared variables. For inference, the dims kwarg is incredibly useful. It just happens to be quite similar to shape. Our discussion should be titled "dims and size", not "shape vs size".

brandonwillard commented 3 years ago

In 95% of the models I write, I'm more concerned with inference on the data I have than with out-of-sample prediction using shared variables.

Don't forget that, because we can do this shared variable thing, it also means that people who run the same model on many different datasets will no longer need to recreate their Models and recompile their respective log-likelihood graphs to C/whatever else.

michaelosthege commented 3 years ago

Don't forget that, because we can do this shared variable thing, it also means that people who run the same model on many different datasets will no longer need to recreate their Models and recompile their respective log-likelihood graphs to C/whatever else.

Sure, that's great. Let's figure out how to make dims work with that.

  1. The values passed via dims are already sort of "symbolic". They are placeholders. Can we translate them into symbolic elements for size? (See the sketch after this list.)
  2. What about those dims that refer to implied dimensions? (That's actually very common. e.g. https://github.com/rtcovidlive/rtlive-global/blob/master/rtlive/model.py#L168-L176)
  3. What happens to coords when the value of a shared variable is changed? Can we pass new values for the coords via something like pm.Data().set_value(val, coords=...)?
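
One conceivable translation for point 1 (purely hypothetical; dim_lengths and dims_to_size are illustrative names, not PyMC API):

import aesara

# Hypothetical: each named dim maps to a shared integer length that can be
# updated when the coords change, which would make size effectively symbolic.
dim_lengths = {
    "town": aesara.shared(3, name="town_length"),
    "year": aesara.shared(2, name="year_length"),
}

def dims_to_size(batch_dims):
    return tuple(dim_lengths[d] for d in batch_dims)
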
OriolAbril commented 3 years ago

From a naive/conceptual point of view, I see dims fitting the size approach better. Currently dims takes the info from coords to get the length of each dimension, but I actually don't think this is ideal.

Whenever I have to do cross-validation, I often use shape to define the model, and then when converting to InferenceData I use dims but no coords. I really don't care about coordinate values, and having to add extra code to sort out coord values at model creation is work I don't need and won't use, whereas having the correct dims in place helps with automatic alignment and broadcasting when postprocessing with xarray.

It would be great to pair named dimensions with the size argument and then define the coordinate values at sampling/conversion time, which feels more natural. In this cross-validation/out-of-sample setting, the dimension itself generally is part of the model, and I think it would be useful to keep this info around like we do now with dims. The coordinate values, however, are not part of the model, and I think they would be better off tied not to a model but to data/sampling realizations.

twiecki commented 3 years ago

The benefit I see is the shape propagation, not necessarily that we can switch things out for prediction. With shape or dims I always have to specify the full shape of every tensor, while with size I just have to remember the dimension I want to batch over, which is much simpler. I think dims could be changed in the same way, so that I only specify the dimension I want to batch over. I changed the example below: we have two experiments/replications of a drug trial with different dose levels administered to multiple subjects.


import numpy as np
import pymc3 as pm

import aesara
import aesara.tensor as aet

n_doses = 3
n_subjects = 5

data = np.random.normal(size=(n_subjects, n_doses))
X_at = aesara.shared(data)

with pm.Model() as m:
    beta = pm.MvNormal(
        "beta",
        mu=aet.zeros((n_doses,)),
        cov=aet.eye(n_doses))

    Y = pm.MvNormal(
        "Y",
        mu=X_at.dot(beta),        # shape (n_subjects,)
        cov=aet.eye(n_subjects),
        dims=("experiments",))

Without this, the likelihood would have to look like:

    Y = pm.MvNormal(
        "Y",
        mu=X_at.dot(beta),
        cov=aet.eye(n_subjects),
        dims=("experiments", "doses", "subjects"))

Maybe we want to rename dims in the above case to plate_dim, batch_dim, or rep_dim to avoid confusion.

michaelosthege commented 3 years ago

I don't see how this reduces confusion. On the contrary, I find plate_dim/batch_dim/rep_dim, and yes, also size, more confusing than either shape or dims.

There are tons of cases where you need to know the actual shape or at least the order in which the dims appear:

  • Operations involving multiple n-dimensional tensors (*, +, dot, ...)
  • Plotting a plate-model (pm.model_to_graphviz)
  • Creating and working with traces/InferenceData

...just to name a few.

This whole shape propagation/broadcasting is nice if you need it and if you know how it works.

For the other 90% of situations, it's immensely important that model code is expressive and doesn't require you to evaluate in-line mathematical expressions such as X_at.dot(beta) in your head to figure out what the dims of a variable are.

Without this, the likelihood would have to look like:

   Y = pm.MvNormal(
       "Y",
       mu=X_at.dot(beta),
       cov=aet.eye(n_subjects),
       dims=("experiments", "doses", "subjects"))

I think this RV is way easier to work with. This way I won't be surprised that idata.posterior.Y.ndim == 3, and I can select idata.posterior.Y.sel(doses=5) and so on.

dims and coords are the most useful things we added in recent releases. Let's not vandalize that in favor of edge cases.

fonnesbeck commented 3 years ago

Having shape and size in addition to dims is incredibly confusing. I can see the argument for one or the other, but not both. If we move to size, then shape ought to be deprecated and removed.

brandonwillard commented 3 years ago

There are tons of cases where you need to know the actual shape or at least the order in which the dims appear:

It sounds like you're confusing shape with dimensions. In Theano/Aesara, the number and (operation-induced) order of dimensions cannot be symbolic, so when it comes to dimensions nothing changes. If one wants to provide labels for the dimensions, they can; the use of a size or shape parameter has no effect on this.

  • Operations involving multiple n-dimensional tensors (*, +, dot, ...)

These operations do not require knowledge of the objects' exact shapes, and this is evidenced by the example I gave above, which uses these exact operations (both implicitly and explicitly).
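
For example (a minimal sketch), these ops build and run without any concrete shapes at graph-construction time:

import numpy as np
import aesara
import aesara.tensor as at

X = at.matrix("X")        # symbolic shape (?, ?)
b = at.vector("b")        # symbolic shape (?,)
y = X.dot(b) * 2.0 + 1.0  # *, +, dot: no explicit shapes required

f = aesara.function([X, b], y)
print(f(np.ones((2, 3)), np.ones(3)))  # [7. 7.]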

  • Plotting a plate-model (pm.model_to_graphviz)

Anything that involves plotting is something that exists outside of the PPL, and—thus—outside of the scope/relevance of symbolic shape considerations. In other words, if one is at the point of plotting something, then they necessarily have a trace with concrete non-symbolic shapes, so this shape vs. size subject is no longer relevant.

  • Creating and working with traces/InferenceData

Again, if one has a trace, then all the shapes are already worked out and known, so this discussion isn't relevant.

This whole shape propagation/broadcasting is nice if you need it and if you know how it works.

The point of the change from shape to size is that a user doesn't need to know how any of that works, so we're not going to demand that users specify the entire shape. Really, it's the shape parameter that forces users to know how shape propagation and broadcasting works, because that's exactly what they need to do in order to come up with a valid shape value.

Worse yet, the shape propagation and broadcasting is always needed and used. The difference is that under size we use it to the user's advantage, and under shape we don't (while also forcing models to be unnecessarily shape-specific).

For the other 90% of situations, it's immensely important that model code is expressive and doesn't require you to evaluate in-line mathematical expressions such as X_at.dot(beta) in your head to figure out what the dims of a variable are.

What is this 90%? Why would one need to evaluate an expression to figure out the dimensions? If one can specify an exact shape—for the shape parameter—in the first place, wouldn't they have all the information they need about these dimensions? Are you implying that one would need to evaluate something in order to use a size parameter?

Also, there's nothing expressive about the use of a shape parameter. At best, it's simply more explicit, but it's also unnecessary, overly verbose, and objectively restrictive and error prone. In contrast to your statements above, the shape parameter has always been the source of many errors, because it forces users to know and perform the shape propagation and broadcasting themselves.

I think this RV is way easier to work with. This way I won't be surprised that idata.posterior.Y.ndim == 3, and I can select idata.posterior.Y.sel(doses=5) and so on.

dims and coords are the most useful things we added in recent releases. Let's not vandalize that in favor of edge cases.

Once again, these statements are conflating the DSL/PPL aspects of PyMC3 with the data it produces. idata is a trace containing samples from a model that was already compiled, so it has nothing to do with size vs. shape parameters.

If one wants to add dimension names to a random variable, that can be done at any point, and is not affected by manually vs. automatically computed shapes. As I stated before, the number of dimensions cannot be changed in Theano/Aesara; the only thing that is automatically computed is the length/size of each dimension. If one can specify a valid shape parameter, then they already know how many dimensions there are and what each dimension is.

At some point, arguing for this shape parameter is the same as saying "Users shouldn't specify log-likelihood graphs; they should provide all the log-likelihood values".

michaelosthege commented 3 years ago

We obviously have very different opinions about shape, size and dims.

I am personally not particularly attached to shape, but I believe that making the v3→v4 transition as smooth as possible should be our top priority. The faster users adopt v4, the better for the entire project. We can continue to nudge them towards different APIs throughout v4.x versions.

What is this 90%? Why would one need to evaluate an expression to figure out the dimensions? If one can specify an exact shape—for the shape parameter—in the first place, wouldn't they have all the information they need about these dimensions?

Knowing the dims of a model variable while writing the model is one thing, but days or weeks later, or when looking at someone else's model code, nobody wants to evaluate the chain of expressions in their head before they can tell the dims of a model variable!

  • Passing shape= is the old, dumb way of dealing with this problem.
  • Writing comments is the easy way, but we all know that code & comments diverge over time.
  • Adding SpecifyShape Ops is overly verbose and unnecessarily restricts the model.
  • Passing dims= is the more flexible and elegant way to do it.

The key point being: Verbosity is often a good thing! Verbosity should not be required, but it should seamlessly integrate with the API.

Now here's maybe a way out of this dilemma. What if we start using Ellipsis to stand in for the implied dimensions:

pm.MvNormal(mu=[1,2,3], cov=np.eye(3), shape=(2, 3))
# is the same as
pm.MvNormal(mu=[1,2,3], cov=np.eye(3), shape=(2, ...))
# is the same as
pm.MvNormal(mu=[1,2,3], cov=np.eye(3), size=2)

The same with dims:

coords = {
    "year": [2020, 2021],
    "drink": ["Water", "Orange Juice", "Apfelschorle"],
}
pm.MvNormal(mu=[1,2,3], cov=np.eye(3), dims=("year", "drink"))   # explicit, verbose and shape-flexible!
# is the same as
pm.MvNormal(mu=[1,2,3], cov=np.eye(3), dims=("year", ...))       # implicit and ndim-flexible

brandonwillard commented 3 years ago

Now here's maybe a way out of this dilemma. What if we start using Ellipsis to stand in for the implied dimensions:

If that helps, we could do it.

Remember, we can also strip those "base" dimensions from a user-specified shape, since we'll be able to tell how many dimensions are implied by the distribution parameters. In effect, we would be extracting the size parameter from a shape and ignoring the redundant/unused parts. Naturally, we would need to emit a warning, telling the user that we've discarded the rest.
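
A hypothetical sketch of that extraction (extract_size and ndim_supp are illustrative names; ndim_supp stands in for the number of dimensions implied by the distribution's parameters, e.g. 1 for MvNormal):

import warnings

def extract_size(shape, ndim_supp):
    """Reduce a user-specified shape to a size (batch) specification."""
    if shape and shape[-1] is Ellipsis:
        return tuple(shape[:-1])  # shape=(2, ...) -> size=(2,)
    batch, event = shape[:len(shape) - ndim_supp], shape[len(shape) - ndim_supp:]
    warnings.warn(f"Discarding implied dimensions {event}; using size={batch}.")
    return tuple(batch)

assert extract_size((2, Ellipsis), ndim_supp=1) == (2,)
assert extract_size((2, 3), ndim_supp=1) == (2,)  # warns about the discarded (3,)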

OriolAbril commented 3 years ago

The faster users adopt v4, the better for the entire project. We can continue to nudge them towards different APIs throughout v4.x versions.

I agree with the first point, but not so much with the second. We have an opportunity to start with a clean slate and break virtually anything, and I think we should make the most of this, for two reasons. First, it is less work for us: starting the new major version with multiple alternative ways of doing things, and having to handle deprecations from the very beginning, seems like extra and unnecessary work to me. Second, I think users would rather have things break once and then last for multiple versions than have to constantly pay attention to deprecations and API changes between minor versions.

The key point being: Verbosity is often a good thing! Verbosity should not be required, but it should seamlessly integrate with the API.

This would be amazing. I like the ellipsis option with dims; however, I don't think it would "keep" the dimensions from the variable that is actually defining them when broadcasting, so we'll have to take that into account when converting to ArviZ; otherwise ArviZ will see arrays with n dims and only m names for those dims and error out.

michaelosthege commented 3 years ago

We have an opportunity to start with a clean slate and break virtually anything, and I think we should make the most of this for two reasons.

You're right. What I meant is more along the lines of doing warn("This API will stop working in 4.1. Do this instead: ...") or even raising with the same message. For many of these (new) APIs I don't trust us to hit the nail on the head on the first try. With warn("...will stop working in 4.1...") we can still roll back our decision if we need to.

[...] "keep" the dimensions from the variable who is actually defining them when broadcasting [...]

Yes, with explicit dims=("year", "drink") that's easier. We'll need some logic to figure out the dim names from the RV inputs so we can autocomplete the .... I guess in model.add_random_variable we should replace the Ellipses with the implied dims and then assert var.ndim == len(dims).

One option (also looking at #4565) could be to add a dims property to the RandomVariable (or even deeper in Aesara?) that is either supplied upon instantiation, or inferred if accessed (kinda like infer_shape). Then we wouldn't have to deal with named dims in the PyMC3 model at all.

OriolAbril commented 3 years ago

This could (with the "could" being quite the supposition) be solved by using xarray as a backend; otherwise I fear we will end up rebuilding xarray from scratch on top of Aesara. cc @ColCarroll @canyon289, as they are listed as possible mentors on that project and may have better ideas on this end.

If Aesara were a dynamic computation library, it would be quite easy to get xarray to work with it: xarray doesn't really do any computation, it only handles dims and coords and aligns the underlying arrays. Everything else works as long as the underlying data structure implements __array_function__ and __array_ufunc__.

For static computation I don't think it will be as straightforward, but it could be possible anyway, à la having .compute(), like we do with dask-backed xarray objects, call .eval() in Aesara or something like that.

junpenglao commented 3 years ago

It seems we all agree that size is better and should ultimately deprecate shape. I think @brandonwillard's suggestion is the least disruptive one for users, and we can phase out shape gently:

Remember, we can also strip those "base" dimensions from a user-specified shape, since we'll be able to tell how many dimensions are implied by the distribution parameters. In effect, we would be extracting the size parameter from a shape and ignoring the redundant/unused parts. Naturally, we would need to emit a warning, telling the user that we've discarded the rest.

junpenglao commented 3 years ago

Re the size naming, I am extremely torn between "explicit is better than implicit" (we should name it something like batch_shape or plate_shape) and "simple is better than complex" (size is simple and consistent with NumPy).

Right now I'm 51% in favor of size.

twiecki commented 3 years ago

I don't think there is agreement here at all that size is better than shape. @aseyboldt, Michael and I prefer the explicit shape.

Personally, I like the Ellipsis approach best, where we can have optional shape propagation but the default stays explicit, backwards compatible, and compatible with dims.


brandonwillard commented 3 years ago

I hope it's clear to everyone that the Ellipsis approach is just a longer version of size (e.g. shape=(2, 3, Ellipsis) is equivalent to size=(2, 3)).

twiecki commented 3 years ago

Yes, that's what I like about it. In most cases I think users will find shape/dims easier; advanced users can use Ellipses. That way we maintain backwards compatibility but have a cool new feature, rather than having to explain to users a subtle difference in how we now handle shapes.


twiecki commented 3 years ago

This is what we agreed on at our last lab meeting:

We will implement shape and dims in a backwards compatible way in v4

OriolAbril commented 3 years ago

We will implement shape and dims in a backwards compatible way in v4

I thought that was dims only :thinking: I think shape is much more limiting than dims, and having both size and shape will be even more confusing than breaking some code and having users update to size or dims.

twiecki commented 3 years ago

Don't we translate dims to shape internally?
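
Presumably something like this (a minimal sketch of that translation, assuming coords are fixed at model definition):

coords = {"town": ["A", "B", "C"], "year": [2020, 2021]}

def dims_to_shape(dims, coords):
    # Each dim name resolves to the length of its coordinate values.
    return tuple(len(coords[d]) for d in dims)

assert dims_to_shape(("year", "town"), coords) == (2, 3)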

OriolAbril commented 3 years ago

I think I got lost during the meeting, nevermind. I am probably too xarray biased for my own good.

So to clarify:

  1. size will not be available to users, only shape, and to use size one should use shape=(2, 3, Ellipsis).
  2. dims will convert to shape then to size?
  3. will we discourage the use of shape without ellipsis?
  4. Will dims continue to rely on coords defined at model creation? Will changing such coords afterwards (even after having already sampled) revert back to the actual shapes of the variables in the model?
  5. Can someone suggest what they expect ArviZ to do when an ellipsis is present?

twiecki commented 3 years ago

I think I got lost during the meeting, nevermind. I am probably too xarray biased for my own good.

Here's how I see it.

So to clarify:

1. `size` will not be available to users, only `shape`, and to use size one should use `shape=(2, 3, Ellipsis)`.

I think that's the best option, just to remove confusion and too many options for doing the same thing, although we'd probably be artificially removing that capability. It's a good question.

2. `dims` will convert to `shape` then to `size`?

I think yes.

3. will we discourage the use of `shape` without ellipsis?

No. Ellipsis is an advanced feature for those who know about it and need it. We mostly think that explicit shape specification is better.

4. Will dims continue to rely on `coords` defined at model creation? Will changing such coords afterwards (even after having already sampled) revert back to the actual shapes of the variables in the model?

I assume that we currently look the shapes up during model definition, right? Ideally we can make that dynamic, but I'm not sure how easy that would be. Eventually we should move the whole names business into Aesara: https://github.com/pymc-devs/aesara/issues/352

5. Can someone suggest what they expect ArviZ to do when an ellipsis is present?

That could be tricky. Ideally we'd somehow infer this during model specification, but I'm not sure how easy that would be or how it can be solved :-/

michaelosthege commented 3 years ago

Closed by #4696.