ragged array assignment (not declaration)

alashworth / test-issue-import

0 stars 0 forks source link

ragged array assignment (not declaration) #65

Open alashworth opened 5 years ago

alashworth commented 5 years ago

Issue by bob-carpenter Saturday Mar 14, 2015 at 03:59 GMT Originally opened as https://github.com/stan-dev/stan/issues/1369

From Ben Goodrich on stan-dev:

OK. I invented a new rule (locally): You are allowed to assign anything to a matrix with 0 rows and 0 columns. So, this now works:

model {
  matrix[0,0] X[2];
  vector[4] ones;
  ones[1] <- 1;
  ones[2] <- 1;
  ones[3] <- 1;
  ones[4] <- 1;
  X[1] <- diag_matrix(ones);
  theta ~ normal(0,1);
}

But it would probably be better to support (outside of the data and parameters blocks)

matrix X[2]; 
// parse to vector<Eigen::Matrix<T__,Eigen::Dynamic,Eigen::Dynamic> > X(2);
// i.e. no fill value after the 2

rather than making users declare

matrix[0,0] X[2];

I think we're good in that case in the sense that this runs so I guess the column vector gets promoted:

#include <vector>
#include <Eigen/Dense>

int main() {
  std::vector<Eigen::Matrixd> > x(2);
  Eigen::MatrixXd m(2,2);
  m(0,0) = 3;
  m(1,0) = 2.5;
  m(0,1) = -1;
  m(1,1) = m(1,0) + m(0,1);
  x[0] = m;
  x[1] = m.col(1);
  return 0;
}

However, the comment in src/stan/math/prim/mat/fun/assign.hpp does not seem to match what the code does. I had to do trial and error to figure out that this was being called rather than the one above it that allegedly fires when the dimensions do not match.

/** 
 * Copy the right-hand side's value to the left-hand side 
 * variable. 
 * 
 * The <code>assign()</code> function is overloaded.  This 
 * instance will be called for arguments that are both 
 * <code>Eigen::Matrix</code> types and whose shapes match.  The 
 * shapes are specified in the row and column template parameters. 
 * 
 * @tparam LHS Type of left-hand side matrix elements. 
 * @tparam RHS Type of right-hand side matrix elements. 
 * @tparam R Row shape of both matrices. 
 * @tparam C Column shape of both mtarices. 
 * @param x Left-hand side matrix. 
 * @param y Right-hand side matrix. 
 * @throw std::invalid_argument if sizes do not match. 
 */ 
template <typename LHS, typename RHS, int R, int C> 
inline void 
assign(Eigen::Matrix<LHS,R,C>& x, 
       const Eigen::Matrix<RHS,R,C>& y) { 
  if(x.rows() == 0 && x.cols() == 0) { 
    x = y; 
    return ; 
  } 
  stan::math::check_matching_dims("assign", 
                                            "x", x, 
                                            "y", y); 
  for (int i = 0; i < x.size(); ++i) 
    assign(x(i),y(i)); 
}

alashworth commented 5 years ago

Comment by bgoodri Monday Mar 16, 2015 at 18:03 GMT

Okay, it seems that

This instance will be called for arguments that are both Eigen::Matrix types and whose shapes match.

is a correct code comment for what the compiler interprets as a "match"; i.e. Eigen::Matrix<T__,Eigen::Dynamic,Eigen::Dynamic> matches with another Eigen::Matrix<T__,Eigen::Dynamic,Eigen::Dynamic> even if their rows and / or columns are different numbers. I think my hack of allowing resizing of 0x0 matrices also needs type promotion, but other than that I guess we only need the declaration change and I can PR a bunch more matrix decomposition stuff.

alashworth commented 5 years ago

Comment by bob-carpenter Monday Mar 16, 2015 at 23:12 GMT

I'm still not convinced this is a good idea. I think it's going to be confusing to users. I'm torn between wanting to put in something useful versus holding off until we can implement a proper solution. This is the same way I feel about adding a single sparse matrix operation or very specific GLM functions.

Is the plan to throw out size matches altogether in Stan's assign and just allow Eigen-style assignment? Or to just allow that if the lvalue is size 0? I'd be more comfortable with the latter as it seems less error prone.

When you say "compiler", do you mean Stan's or C++'s? Eigen doesn't do size matching. Stan's parser does not look at size at all --- it just checks shape (e.g., matrix vs. vector, 1D array vs. scalar, etc.)

On Mar 17, 2015, at 5:03 AM, bgoodri notifications@github.com wrote:

Okay, it seems that

This instance will be called for arguments that are both Eigen::Matrix types and whose shapes match.

is a correct code comment for what the compiler interprets as a "match"; i.e. Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> matches with another Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> even if their rows and / or columns are different numbers. I think my hack of allowing resizing of 0x0 matrices also needs type promotion, but other than that I guess we only need the declaration change and I can PR a bunch more matrix decomposition stuff.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Monday Mar 16, 2015 at 23:28 GMT

I think having functions are able to return an array of matrices would be a good thing, and I am not seeing another way to accomplish that right now. Possibly we could make a push to allow simultaneous declaration and definitions, in which case things wouldn't really be "resized" but just given different assignments from the outset. But we would still need heterogenous sizes.

If we do resizing, it probably makes sense to only allow Eigen-style assignment if the lvalue has size 0 but it is I think more confusing to the user to have to write

matrix[0,0] Q_and_R[2];
Q_and_R <- QR(Sigma);

than to write

matrix Q_and_R[2];
Q_and_R <- QR(Sigma);

although

matrix Q_and_R[2] <- QR(Sigma);

would probably be better eventually. Although maybe when we go to C++, stanc could just use auto behind the scenes and then it could just be

Q_and_R <- QR(Sigma); // parses as auto Q_and_R = QR(Sigma);

The "compiler" thing that threw me was that the code comment was refering to Eigen's matching on Eigen::Dynamic whereas I was thinking in terms of algebra.

alashworth commented 5 years ago

Comment by bob-carpenter Monday Mar 16, 2015 at 23:44 GMT

On Mar 17, 2015, at 10:28 AM, bgoodri notifications@github.com wrote:

I think having functions are able to return an array of matrices would be a good thing, and I am not seeing another way to accomplish that right now. Possibly we could make a push to allow simultaneous declaration and definitions,

That's on my near-term to-do list, but as you point out, it won't solve the heterogeneous size problem.

in which case things wouldn't really be "resized" but just given different assignments from the outset. But we would still need heterogenous sizes.

Maybe we need an R list-like structure. The only problem is how to declare such a thing and still allow type inference.

If we do resizing, it probably makes sense to only allow Eigen-style assignment if the lvalue has size 0 but it is I think more confusing to the user to have to write

matrix[0,0] Q_and_R[2]; Q_R <- QR(Sigma);

than to write

matrix Q_and_R[2]; Q_and_R <- QR(Sigma);

Agreed. So that would involve a change in the language parser and code generator. A simple way to deal with it would be to take matrix as synonymous to "matrix[0,0]".

The user-defined function definitions also use bare types, so there's already doc on what they look like.

We have to think in terms of generality here, so a user could write

matrix a[3];

to get an array of matrix[0,0], each of which would then be assignable to a different sized matrix.

We then have to think through what this does to things like the num_elements and size functions.

although

matrix Q_and_R[2] <- QR(Sigma);

would probably be better eventually. Although maybe when we go to C++, stanc could just use auto behind the scenes and then it could just be

Q_and_R <- QR(Sigma); // parses as auto Q_and_R = QR(Sigma);

I don't like that idea because I want to preserve static typing, as in C++, and not move to Python or R-style dynamic typing.

The "compiler" thing that threw me was that the code comment was refering to Eigen's matching on Eigen::Dynamic whereas I was thinking in terms of algebra.

Ah, right. You can use fixed sizes > 1, but they get implemented as Dynamic at some small N, but tested statically for matching. I decided not to do that in Stan because I thought it'd hugely complicate argument passing to functions (built-ins, because I wasn't thinking user-defined at the time).

alashworth commented 5 years ago

Comment by bgoodri Monday Mar 16, 2015 at 23:51 GMT

The closest thing to an R list is a std::tuple but then we would have to wait for C++11. I think if someone does

matrix Q_and_R[2];
print(size(Q_and_R)); // should still be 2
print(num_elements(Q_and_R)); // should still be 0
Q_and_R <- QR(Sigma);
print(num_elements(Q_and_R)); // should now be equal to rows(Sigma) * cols(Sigma)^3

alashworth commented 5 years ago

Comment by bob-carpenter Tuesday Mar 17, 2015 at 00:02 GMT

Boost has already implemented tuples. They tend to anticipate the new C++ work, since many of their package authors are involved.

The problem I see is that if we have a list of size 100, we'd need a size 100 typelist, which would just crush a compiler.

Small tuples would be OK, though.

On Mar 17, 2015, at 10:51 AM, bgoodri notifications@github.com wrote:

The closest thing to an R list is a std::tuple but then we would have to wait for C++11. I think if someone does

matrix Q_and_R[2]; print(size(Q_and_R)); // should still be 2 print(num_elements(Q_and_R)); // should still be 0 Q_and_R <- QR(Sigma); print(num_elements(Q_and_R)); // should now be equal to rows(Sigma) * cols(Sigma)^3

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Tuesday Mar 17, 2015 at 00:14 GMT

I think tuples would be useful eventually (and hopefully the C++11 version is easier to compile), but they are more general than what is needed in this case where the heterogenous dynamic-sized matrices are "homogenous" from Eigen's perspective. If

matrix Q_and_R[2];

is equivalent to

matrix[0,0] Q_and_R[2];

it would work like user-defined functions and not disorient users much. I guess the danger is that someone acceidentally declares a 0x0 matrix and then assigns some other matrix to it like

matrix[K,K] foo; // where K = 0 due to a bug
foo <- bar; // where bar is not 0x0

But that might actually be the intended behavior and the user just meant for foo to me JxJ or something.

alashworth commented 5 years ago

Comment by bgoodri Wednesday Mar 18, 2015 at 19:13 GMT

What is the right Stan syntax to do promotion inside here:

  if(x.rows() == 0 && x.cols() == 0) { 
    x = y;  // something like x = promote<scalar_type(x)>(y);
    return ; 
  }

I need to implement this locally for a conference paper.

alashworth commented 5 years ago

Comment by bob-carpenter Wednesday Mar 18, 2015 at 23:23 GMT

Is this in a user-defined function in Stan? It should take care of whatever promotion you need automatically.

For instance, if you write this

real foo(real x, real y) { return x + y; }

and you pass in data (double) x and parameter (var) y, the result will automatically be a var. And if x and y are data and you assign to a local variable in the model or transformed parameters block, you'll also get the promotion for free.

On Mar 19, 2015, at 6:13 AM, bgoodri notifications@github.com wrote:

What is the right Stan syntax to do promotion inside here:

if(x.rows() == 0 && x.cols() == 0) { x = y; // something like x = promote<scalar_type(x)>(y); return ; }

I need to implement this locally for a conference paper.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Wednesday Mar 18, 2015 at 23:35 GMT

No, I meant in assign.hpp to allow .stan programs with std::vectors of Eigen matrices with different sizes:

/** 
 * Copy the right-hand side's value to the left-hand side 
 * variable. 
 * 
 * The <code>assign()</code> function is overloaded.  This 
 * instance will be called for arguments that are both 
 * <code>Eigen::Matrix</code> types and whose shapes match.  The 
 * shapes are specified in the row and column template parameters. 
 * 
 * @tparam LHS Type of left-hand side matrix elements. 
 * @tparam RHS Type of right-hand side matrix elements. 
 * @tparam R Row shape of both matrices. 
 * @tparam C Column shape of both mtarices. 
 * @param x Left-hand side matrix. 
 * @param y Right-hand side matrix. 
 * @throw std::invalid_argument if sizes do not match. 
 */ 
template <typename LHS, typename RHS, int R, int C> 
inline void 
assign(Eigen::Matrix<LHS,R,C>& x, 
       const Eigen::Matrix<RHS,R,C>& y) { 
  if(x.rows() == 0 && x.cols() == 0) { 
    x = y; 
    return ; 
  } 
  stan::math::check_matching_dims("assign", 
                                            "x", x, 
                                            "y", y); 
  for (int i = 0; i < x.size(); ++i) 
    assign(x(i),y(i)); 
}

alashworth commented 5 years ago

Comment by bob-carpenter Wednesday Mar 18, 2015 at 23:56 GMT

Before going ahead with this, I'd like some feedback from everyone else that they think it's a good idea to allow:

matrix[2,3] y; matrix[0,0] x; x <- y;

matrix[2,3] y; matrix x; x <- y;

I'm less happy with the former, but if we allow the latter, we're pretty much going to have to allow the former.

On Mar 19, 2015, at 10:35 AM, bgoodri notifications@github.com wrote:

No, I meant in assign.hpp to allow .stan programs with std::vectors of Eigen matrices with different sizes:

/\

Copy the right-hand side's value to the left-hand side

variable.

The assign() function is overloaded. This

instance will be called for arguments that are both

Eigen::Matrix types and whose shapes match. The

shapes are specified in the row and column template parameters.

@tparam LHS Type of left-hand side matrix elements.

@tparam RHS Type of right-hand side matrix elements.

@tparam R Row shape of both matrices.

@tparam C Column shape of both mtarices.

@param x Left-hand side matrix.

@param y Right-hand side matrix.

@throw std::invalid_argument if sizes do not match. */ template <typename LHS, typename RHS, int R, int C> inline void assign(Eigen::Matrix<LHS,R,C>& x, const Eigen::Matrix<RHS,R,C>& y) { if(x.rows() == 0 && x.cols() == 0) { x = y; return ; } stan::math::check_matching_dims("assign", "x", x, "y", y); for (int i = 0; i < x.size(); ++i) assign(x(i),y(i)); }

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bob-carpenter Thursday Mar 19, 2015 at 00:00 GMT

Do you need to do anything other than change check_matching_dims() to allow size zero matrices or vectors on the LHS to be reassigned?

This is all going to be a bit confusing because we'll allow arrays of vectors/matrices of different sizes, but not arrays of arrays of different sizes.

On Mar 19, 2015, at 10:35 AM, bgoodri notifications@github.com wrote:

No, I meant in assign.hpp to allow .stan programs with std::vectors of Eigen matrices with different sizes:

/\

Copy the right-hand side's value to the left-hand side

variable.

The assign() function is overloaded. This

instance will be called for arguments that are both

Eigen::Matrix types and whose shapes match. The

shapes are specified in the row and column template parameters.

@tparam LHS Type of left-hand side matrix elements.

@tparam RHS Type of right-hand side matrix elements.

@tparam R Row shape of both matrices.

@tparam C Column shape of both mtarices.

@param x Left-hand side matrix.

@param y Right-hand side matrix.

@throw std::invalid_argument if sizes do not match. */ template <typename LHS, typename RHS, int R, int C> inline void assign(Eigen::Matrix<LHS,R,C>& x, const Eigen::Matrix<RHS,R,C>& y) { if(x.rows() == 0 && x.cols() == 0) { x = y; return ; } stan::math::check_matching_dims("assign", "x", x, "y", y); for (int i = 0; i < x.size(); ++i) assign(x(i),y(i)); }

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Thursday Mar 19, 2015 at 00:08 GMT

Maybe. But check_matching_dims is called in several places besides assign.hpp:

goodrich@T540p:/opt/stan$ git grep -F -n "check_matching_dims" src/stan | cat src/stan/math/fwd/mat/fun/columns_dot_product.hpp:7:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/fwd/mat/fun/columns_dot_product.hpp:20: stan::math::check_matching_dims("columns_dot_product", src/stan/math/fwd/mat/fun/columns_dot_product.hpp:37: stan::math::check_matching_dims("columns_dot_product", src/stan/math/fwd/mat/fun/columns_dot_product.hpp:54: stan::math::check_matching_dims("columns_dot_product", src/stan/math/fwd/mat/fun/rows_dot_product.hpp:7:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/fwd/mat/fun/rows_dot_product.hpp:21: stan::math::check_matching_dims("rows_dot_product", src/stan/math/fwd/mat/fun/rows_dot_product.hpp:38: stan::math::check_matching_dims("rows_dot_product", src/stan/math/fwd/mat/fun/rows_dot_product.hpp:55: stan::math::check_matching_dims("rows_dot_product", src/stan/math/fwd/mat/fun/to_fvar.hpp:6:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/fwd/mat/fun/to_fvar.hpp:51: stan::math::check_matching_dims("to_fvar", src/stan/math/prim/mat.hpp:18:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/err/check_matching_dims.hpp:37: inline bool check_matching_dims(const char* function, src/stan/math/prim/mat/fun/add.hpp:6:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/fun/add.hpp:29: stan::math::check_matching_dims("add", src/stan/math/prim/mat/fun/assign.hpp:11:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/fun/assign.hpp:98: stan::math::check_matching_dims("assign", src/stan/math/prim/mat/fun/elt_divide.hpp:6:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/fun/elt_divide.hpp:26: stan::math::check_matching_dims("elt_divide", src/stan/math/prim/mat/fun/elt_multiply.hpp:6:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/fun/elt_multiply.hpp:27: stan::math::check_matching_dims("elt_multiply", src/stan/math/prim/mat/fun/subtract.hpp:6:#include <stan/math/prim/mat/err/check_matching_dims.hpp> src/stan/math/prim/mat/fun/subtract.hpp:29: stan::math::check_matching_dims("subtract",

and I didn't want to create problems if 0x0 matrices are passed to any of those.

On Wed, Mar 18, 2015 at 8:00 PM, Bob Carpenter notifications@github.com wrote:

Do you need to do anything other than change check_matching_dims() to allow size zero matrices or vectors on the LHS to be reassigned?

This is all going to be a bit confusing because we'll allow arrays of vectors/matrices of different sizes, but not arrays of arrays of different sizes.

Bob

On Mar 19, 2015, at 10:35 AM, bgoodri notifications@github.com wrote:

No, I meant in assign.hpp to allow .stan programs with std::vectors of Eigen matrices with different sizes:

/**

Copy the right-hand side's value to the left-hand side

variable. *

The assign() function is overloaded. This

instance will be called for arguments that are both

Eigen::Matrix types and whose shapes match. The

shapes are specified in the row and column template parameters. *

@tparam LHS Type of left-hand side matrix elements.

@tparam RHS Type of right-hand side matrix elements.

@tparam R Row shape of both matrices.

@tparam C Column shape of both mtarices.

@param x Left-hand side matrix.

@param y Right-hand side matrix.

@throw std::invalid_argument if sizes do not match. */ template <typename LHS, typename RHS, int R, int C> inline void assign(Eigen::Matrix<LHS,R,C>& x, const Eigen::Matrix<RHS,R,C>& y) { if(x.rows() == 0 && x.cols() == 0) { x = y; return ; } stan::math::check_matching_dims("assign", "x", x, "y", y); for (int i = 0; i < x.size(); ++i) assign(x(i),y(i)); }

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-83228438.

alashworth commented 5 years ago

Comment by bgoodri Thursday Mar 19, 2015 at 00:18 GMT

I agree that it seems hard to disallow the former.

But a matrix[0,0] is currently not useful for anything substantive; it just means that you don't have to change your code when you leave a variable out of a model. So, making matrix[0,0] tantamount to "unknown rows and columns" is a small step forward in the usefulness department, it just looks worse than an undimensioned matrix declaration so I doubt it would often get written intentionally. It may rarely get written unintentionally but that is a self-inflicted bug by the user that would usually get exposed later in the Stan code.

On Wed, Mar 18, 2015 at 7:56 PM, Bob Carpenter notifications@github.com wrote:

Before going ahead with this, I'd like some feedback from everyone else that they think it's a good idea to allow:

matrix[2,3] y; matrix[0,0] x; x <- y;

or

matrix[2,3] y; matrix x; x <- y;

I'm less happy with the former, but if we allow the latter, we're pretty much going to have to allow the former.

Bob

On Mar 19, 2015, at 10:35 AM, bgoodri notifications@github.com wrote:

No, I meant in assign.hpp to allow .stan programs with std::vectors of Eigen matrices with different sizes:

/**

Copy the right-hand side's value to the left-hand side

variable. *

The assign() function is overloaded. This

instance will be called for arguments that are both

Eigen::Matrix types and whose shapes match. The

shapes are specified in the row and column template parameters. *

@tparam LHS Type of left-hand side matrix elements.

@tparam RHS Type of right-hand side matrix elements.

@tparam R Row shape of both matrices.

@tparam C Column shape of both mtarices.

@param x Left-hand side matrix.

@param y Right-hand side matrix.

@throw std::invalid_argument if sizes do not match. */ template <typename LHS, typename RHS, int R, int C> inline void assign(Eigen::Matrix<LHS,R,C>& x, const Eigen::Matrix<RHS,R,C>& y) { if(x.rows() == 0 && x.cols() == 0) { x = y; return ; } stan::math::check_matching_dims("assign", "x", x, "y", y); for (int i = 0; i < x.size(); ++i) assign(x(i),y(i)); }

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-83227490.

alashworth commented 5 years ago

Comment by syclik Friday Mar 20, 2015 at 14:46 GMT

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

alashworth commented 5 years ago

Comment by bgoodri Friday Mar 20, 2015 at 15:25 GMT

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee notifications@github.com wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932.

alashworth commented 5 years ago

Comment by betanalpha Friday Mar 20, 2015 at 15:49 GMT

I agree that if a type is meant to be resizable we should make that explicit. I think most people assume that their matrix sizes are constant, and that’s a good thing.

On Mar 20, 2015, at 3:25 PM, bgoodri notifications@github.com wrote:

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee notifications@github.com wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Friday Mar 20, 2015 at 15:58 GMT

Do you mean, just for users to remind themselves that it is resizeable?

resizeable matrix X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

That's fine by me, but how would it be different on the C++ as

matrix[0,0] X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

On Fri, Mar 20, 2015 at 11:49 AM, Michael Betancourt < notifications@github.com> wrote:

I agree that if a type is meant to be resizable we should make that explicit. I think most people assume that their matrix sizes are constant, and that’s a good thing.

On Mar 20, 2015, at 3:25 PM, bgoodri notifications@github.com wrote:

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee notifications@github.com wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84053543.

alashworth commented 5 years ago

Comment by betanalpha Friday Mar 20, 2015 at 16:00 GMT

Yeah, I’m just thinking about the users — the underlying code can be exactly the same.

On Mar 20, 2015, at 3:58 PM, bgoodri notifications@github.com wrote:

Do you mean, just for users to remind themselves that it is resizeable?

resizeable matrix X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

That's fine by me, but how would it be different on the C++ as

matrix[0,0] X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

On Fri, Mar 20, 2015 at 11:49 AM, Michael Betancourt < notifications@github.com> wrote:

I agree that if a type is meant to be resizable we should make that explicit. I think most people assume that their matrix sizes are constant, and that’s a good thing.

On Mar 20, 2015, at 3:25 PM, bgoodri notifications@github.com wrote:

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee notifications@github.com wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84053543.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Friday Mar 20, 2015 at 16:06 GMT

If Bob wants to add a dynamic or resizeable keyword, go ahead. One concern is that if someone declares a resizeable matrix, they might think that they can resize it multiple times, when really it is more of a "deferred size" thing: it is temporarily 0x0 and gets changed to a fixed size that is not 0x0.

On Fri, Mar 20, 2015 at 12:00 PM, Michael Betancourt < notifications@github.com> wrote:

Yeah, I’m just thinking about the users — the underlying code can be exactly the same.

On Mar 20, 2015, at 3:58 PM, bgoodri notifications@github.com wrote:

Do you mean, just for users to remind themselves that it is resizeable?

resizeable matrix X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

That's fine by me, but how would it be different on the C++ as

matrix[0,0] X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

On Fri, Mar 20, 2015 at 11:49 AM, Michael Betancourt < notifications@github.com> wrote:

I agree that if a type is meant to be resizable we should make that explicit. I think most people assume that their matrix sizes are constant, and that’s a good thing.

On Mar 20, 2015, at 3:25 PM, bgoodri notifications@github.com wrote:

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee < notifications@github.com> wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub < https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932>.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84053543.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84055769.

alashworth commented 5 years ago

Comment by bob-carpenter Saturday Mar 21, 2015 at 01:29 GMT

Adding a no-op keyword is easy, but I don't think that's a good idea.

Right now, we allow size-zero vectors and matrices, they print just fine, and can be used as data to provide a no-op in appropriate situations (e.g., no data, or no predictions).

So Ben's right that if we hack in a solution by allowing matrix[0,0](and presumably vector[0] and row_vector[0] for consistency) to be assigned to anything, then

they would not be reassignable after the first non-zero assign, and
a matrix declared as matrix[0,0] would still be reassigable

Anything else is going to take more work, like adding a truly resizable matrix data type. I'm not even sure how to do that with the code generation we have other than creating a whole new type. And even then, I'm not sure.

I was thinking we'd have ragged arrays with declared array sizes. Something like

int Ms[K]; int Ns[K]; matrix[Ms,Ns] a[K];

where a[k] is of size Ms[k] x Ns[k]. That's consistent with how we do things now --- all sizes fixed at declaration time other than function arguments and returns, which are fixed on call and return rather than when declared.

On Mar 21, 2015, at 3:06 AM, bgoodri notifications@github.com wrote:

If Bob wants to add a dynamic or resizeable keyword, go ahead. One concern is that if someone declares a resizeable matrix, they might think that they can resize it multiple times, when really it is more of a "deferred size" thing: it is temporarily 0x0 and gets changed to a fixed size that is not 0x0.

On Fri, Mar 20, 2015 at 12:00 PM, Michael Betancourt < notifications@github.com> wrote:

Yeah, I’m just thinking about the users — the underlying code can be exactly the same.

On Mar 20, 2015, at 3:58 PM, bgoodri notifications@github.com wrote:

Do you mean, just for users to remind themselves that it is resizeable?

resizeable matrix X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

That's fine by me, but how would it be different on the C++ as

matrix[0,0] X; X <- rep_matrix(0, 2, 2); print(X[1,2]);

On Fri, Mar 20, 2015 at 11:49 AM, Michael Betancourt < notifications@github.com> wrote:

I agree that if a type is meant to be resizable we should make that explicit. I think most people assume that their matrix sizes are constant, and that’s a good thing.

On Mar 20, 2015, at 3:25 PM, bgoodri notifications@github.com wrote:

This wouldn't limit the ability to use a size zero matrix as a placeholder in the model. If you don't resize it, then it stays size zero.

I'm not sure what the code is almost ready to do with regards to error messages. I think what would happen is whatever would happen if someone declared a size zero matrix and then did something buggy:

matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; print(X[1,2]); // error

matrix X; // equivalent to matrix[0,0] X; X <- rep_matrix(0, 2, 2); // resized to 2x2 print(X[1,2]); // prints 0 correctly

I don't think adding the Stan keyword dynamic would hurt anything but how it help if the construct gets parsed to an Eigen::Matrix<T,Eigen::Dynamic,Eigen::Dynamic> in C++? Are you thinking of subclassing from Eigen::Matrix?

On Fri, Mar 20, 2015 at 10:46 AM, Daniel Lee < notifications@github.com> wrote:

Sorry for taking so long to catch up.

First off, I disagree with allowing matrix[0,0] stand for a resizable matrix. Hopefully Stan is now compatible with 0 x 0 matrices, 0-length vectors, and 0-length row vectors. Early on, I found myself wanting to draw from priors and would intentionally write models with size 0 to allow for this. Perhaps the behavior to use this functionality within the Stan language wouldn't be affected, but I think this might allow users (me) to write models that behaved differently than intended.

Regarding matrix without sizes representing a resizable matrix, it does make some sense, but I'm afraid it'll make the user experience a little worse. The problem is when people forget to size matrices, we'll lose the power to write better error messages due to the strict typing at that point. Yes, I know, we don't proactively stop now, but the code is almost ready to do so.

Rather than just saying no, my suggestion would be to create a new type, something called dynamic_matrix. (I was inspired by Eigen's naming... and this is just a suggestion, we can call it whatever makes sense.) I'm really trying not to get ahead of myself and thinking about implementation because the C++ types would be the same (Eigen::Matrix), but we would want to do different checking. Perhaps we would need to have another set of assign calls or something like that? Not sure.

In short: I don't like matrix[0,0] or just matrix, but would prefer dynamic_matrix.

— Reply to this email directly or view it on GitHub < https://github.com/stan-dev/stan/issues/1369#issuecomment-84034932>.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84053543.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-84055769.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by syclik Wednesday Mar 25, 2015 at 19:13 GMT

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data {
   int N;
   int M;
   matrix[N,N] X1;
   matrix[M,M] X2;
}
transformed data {
   matrix[M,M] L_X1;
   matrix[M,M] L_X2;
   L_X1 <- cholesky_decompose(X1);
   L_X2 <- cholesky_decompose(X2);
}
...

That'll work if M is 0. Seems odd and error-prone to me.

alashworth commented 5 years ago

Comment by bgoodri Wednesday Mar 25, 2015 at 21:59 GMT

If we implement true ragged arrays, that would be better because it would work in data and parameters as well. But we've been talking about doing that since 2011. And only Bob can do something like that with the language. Whereas it takes 4 lines of already-written code to assign to a 0x0 matrix to overcome a restriction we imposed on the language for a conference paper.

On Wed, Mar 25, 2015 at 3:13 PM, Daniel Lee notifications@github.com wrote:

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri https://github.com/bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but

it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data { int N; int M; matrix[N,N] X1; matrix[M,M] X2; } transformed data { matrix[M,M] L_X1; matrix[M,M] L_X2; L_X1 <- cholesky_decompose(X1); L_X2 <- cholesky_decompose(X2); } ...

That'll work if M is 0. Seems odd and error-prone to me.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86178384.

alashworth commented 5 years ago

Comment by bob-carpenter Wednesday Mar 25, 2015 at 23:05 GMT

This project has to be a compromise. I don't like this proposal because it messes up the clean conception of sizing that we have everywhere. But I agree that we've been wanting to do this forever. If I could ever stop treading water, I could get lots of these features in. Maybe I'll try not responding to the mailing list again and see if anyone picks up the slack. Daniel's been volunteering.

Don't forget doc and testing. It's may be 4 lines of code to change the assign function, but then there's testing and mods to the doc everywhere we talk about assignment.

By the way, is this only for matrices, or also for vectors and row vectors?

I think it'll actually require a fair amount of doc and testing. Can we do this reassignment in the transformed data block or just with local variables? If we can do it in the former, then what happens to the validation tests for sizes that run at the end of the blocks --- those will also have to change to match with tests to go along? I don't think we can do it in transformed parameters, because that could change the number of parameters on a per-iteration basis. So then all the discussion of matrix assignment in the doc needs to be qualified.

Maybe we can come up with an intermediate solution, such as having a matrix declared without any size restrictions, as in

matrix A;

And then only use this as a local variable. Or maybe in transformed data.

Then when using that on the LHS of an array, we'd not check sizes. That'd require a lot more code and it'd change explicitly the sort of implicit change that Ben's arguing for. That is, we'd now have a contrast between sized and unsized types, allowing assignment and presumably reassignment to unsized types. That's even more code and doc, but I think it's a much more coherent solution than modifying the behavior of size-0 matrices.

On Mar 26, 2015, at 8:59 AM, bgoodri notifications@github.com wrote:

If we implement true ragged arrays, that would be better because it would work in data and parameters as well. But we've been talking about doing that since 2011. And only Bob can do something like that with the language. Whereas it takes 4 lines of already-written code to assign to a 0x0 matrix to overcome a restriction we imposed on the language for a conference paper.

On Wed, Mar 25, 2015 at 3:13 PM, Daniel Lee notifications@github.com wrote:

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri https://github.com/bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but

it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data { int N; int M; matrix[N,N] X1; matrix[M,M] X2; } transformed data { matrix[M,M] L_X1; matrix[M,M] L_X2; L_X1 <- cholesky_decompose(X1); L_X2 <- cholesky_decompose(X2); } ...

That'll work if M is 0. Seems odd and error-prone to me.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86178384.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Wednesday Mar 25, 2015 at 23:27 GMT

If we are going to support genuine ragged arrays like

matrix[rows,cols] X[elements];

where rows and cols are vectors, then I think we should allow that anywhere.

If it is just the original hack where a 0x0 matrix or 0x1 vector can get reassigned to any genuine size, then it definitely can't be data or parameters and if it can only be model or generated quantities, then that would still be useful.

More generally, I am not a fan of limiting language features that lots of people would find useful in order to catch bugs in user code. If we can catch bugs like that essentially costlessly, then okay. But, such bugs will probably manifest themselves elsewhere and if not, too bad for that particular user, as opposed to bad for all users. So, in this case, are you saying that

transformed parameters { cov_matrix[K,K] X; X <- crossprod(Y); }

currently checks at the end of the block that X is, in fact, KxK. If so, I agree it would be hard to check if X were declared 0x0 and resized, but I would favor dropping that check. But we could still check that it is positive definite without regard to the size, right?

On Wed, Mar 25, 2015 at 7:05 PM, Bob Carpenter notifications@github.com wrote:

This project has to be a compromise. I don't like this proposal because it messes up the clean conception of sizing that we have everywhere. But I agree that we've been wanting to do this forever. If I could ever stop treading water, I could get lots of these features in. Maybe I'll try not responding to the mailing list again and see if anyone picks up the slack. Daniel's been volunteering.

Don't forget doc and testing. It's may be 4 lines of code to change the assign function, but then there's testing and mods to the doc everywhere we talk about assignment.

By the way, is this only for matrices, or also for vectors and row vectors?

I think it'll actually require a fair amount of doc and testing. Can we do this reassignment in the transformed data block or just with local variables? If we can do it in the former, then what happens to the validation tests for sizes that run at the end of the blocks --- those will also have to change to match with tests to go along? I don't think we can do it in transformed parameters, because that could change the number of parameters on a per-iteration basis. So then all the discussion of matrix assignment in the doc needs to be qualified.

Maybe we can come up with an intermediate solution, such as having a matrix declared without any size restrictions, as in

matrix A;

And then only use this as a local variable. Or maybe in transformed data.

Then when using that on the LHS of an array, we'd not check sizes. That'd require a lot more code and it'd change explicitly the sort of implicit change that Ben's arguing for. That is, we'd now have a contrast between sized and unsized types, allowing assignment and presumably reassignment to unsized types. That's even more code and doc, but I think it's a much more coherent solution than modifying the behavior of size-0 matrices.

Bob

On Mar 26, 2015, at 8:59 AM, bgoodri notifications@github.com wrote:

If we implement true ragged arrays, that would be better because it would work in data and parameters as well. But we've been talking about doing that since 2011. And only Bob can do something like that with the language. Whereas it takes 4 lines of already-written code to assign to a 0x0 matrix to overcome a restriction we imposed on the language for a conference paper.

On Wed, Mar 25, 2015 at 3:13 PM, Daniel Lee notifications@github.com wrote:

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri https://github.com/bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but

it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data { int N; int M; matrix[N,N] X1; matrix[M,M] X2; } transformed data { matrix[M,M] L_X1; matrix[M,M] L_X2; L_X1 <- cholesky_decompose(X1); L_X2 <- cholesky_decompose(X2); } ...

That'll work if M is 0. Seems odd and error-prone to me.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86178384.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86248040.

alashworth commented 5 years ago

Comment by bob-carpenter Thursday Mar 26, 2015 at 00:21 GMT

I agree about allowing declared ragged arrays anywhere. And I fully admit I don't have this on my to-do list for any time soon.

I strongly disagree about turning off size checks at the ends of blocks. And in general about favoring functionality over error checking.

Maybe we can discuss at the next Stan meeting?

On Mar 26, 2015, at 10:27 AM, bgoodri notifications@github.com wrote:

If we are going to support genuine ragged arrays like

matrix[rows,cols] X[elements];

where rows and cols are vectors, then I think we should allow that anywhere.

If it is just the original hack where a 0x0 matrix or 0x1 vector can get reassigned to any genuine size, then it definitely can't be data or parameters and if it can only be model or generated quantities, then that would still be useful.

More generally, I am not a fan of limiting language features that lots of people would find useful in order to catch bugs in user code. If we can catch bugs like that essentially costlessly, then okay. But, such bugs will probably manifest themselves elsewhere and if not, too bad for that particular user, as opposed to bad for all users. So, in this case, are you saying that

transformed parameters { cov_matrix[K,K] X; X <- crossprod(Y); }

currently checks at the end of the block that X is, in fact, KxK. If so, I agree it would be hard to check if X were declared 0x0 and resized, but I would favor dropping that check. But we could still check that it is positive definite without regard to the size, right?

On Wed, Mar 25, 2015 at 7:05 PM, Bob Carpenter notifications@github.com wrote:

This project has to be a compromise. I don't like this proposal because it messes up the clean conception of sizing that we have everywhere. But I agree that we've been wanting to do this forever. If I could ever stop treading water, I could get lots of these features in. Maybe I'll try not responding to the mailing list again and see if anyone picks up the slack. Daniel's been volunteering.

Don't forget doc and testing. It's may be 4 lines of code to change the assign function, but then there's testing and mods to the doc everywhere we talk about assignment.

By the way, is this only for matrices, or also for vectors and row vectors?

I think it'll actually require a fair amount of doc and testing. Can we do this reassignment in the transformed data block or just with local variables? If we can do it in the former, then what happens to the validation tests for sizes that run at the end of the blocks --- those will also have to change to match with tests to go along? I don't think we can do it in transformed parameters, because that could change the number of parameters on a per-iteration basis. So then all the discussion of matrix assignment in the doc needs to be qualified.

Maybe we can come up with an intermediate solution, such as having a matrix declared without any size restrictions, as in

matrix A;

And then only use this as a local variable. Or maybe in transformed data.

Then when using that on the LHS of an array, we'd not check sizes. That'd require a lot more code and it'd change explicitly the sort of implicit change that Ben's arguing for. That is, we'd now have a contrast between sized and unsized types, allowing assignment and presumably reassignment to unsized types. That's even more code and doc, but I think it's a much more coherent solution than modifying the behavior of size-0 matrices.

Bob

On Mar 26, 2015, at 8:59 AM, bgoodri notifications@github.com wrote:

If we implement true ragged arrays, that would be better because it would work in data and parameters as well. But we've been talking about doing that since 2011. And only Bob can do something like that with the language. Whereas it takes 4 lines of already-written code to assign to a 0x0 matrix to overcome a restriction we imposed on the language for a conference paper.

On Wed, Mar 25, 2015 at 3:13 PM, Daniel Lee notifications@github.com wrote:

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri https://github.com/bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but

it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data { int N; int M; matrix[N,N] X1; matrix[M,M] X2; } transformed data { matrix[M,M] L_X1; matrix[M,M] L_X2; L_X1 <- cholesky_decompose(X1); L_X2 <- cholesky_decompose(X2); } ...

That'll work if M is 0. Seems odd and error-prone to me.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86178384.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86248040.

— Reply to this email directly or view it on GitHub.

alashworth commented 5 years ago

Comment by bgoodri Thursday Mar 26, 2015 at 01:57 GMT

If there were some way to turn checks off instead of taking them out of the language completely, that would be okay but we haven't implemented that in years either. To take it somewhat off track, for huge K, something like

transformed parameters { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); }

is infeasible and may very well fail the positive definiteness check for numerical reasons. So, the "best" way to work around this in Stan is to duplicate code in the generated quantities block

model { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); ... } generated quantities { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); }

And if the "best" thing to do is to do something that is inconsistent with one of the central tenets of programming, then the design has a problem. This problem could be solved by disabling the positive definiteness check and the matrix resizing issue could be solved by disabling the size check. But if checks like these can't be disabled, then we are punishing people who write correct code and who want to take advantage of unimplemented features in order to catch mistakes of people who write incorrect code.

If someone defines a corr_matrix or cov_matrix in the transformed parameters block that is not positive definite by construction, then Stan is drawing from an invalid posterior distribution even if the lack of positive definiteness is caught because the priors on the parameters do not integrate to 1 when the positive definiteness check enforces a region of zero prior density. And there is nothing that we can do about the fact that it is an unsound posterior and yet we do validity checks on the objects that slow sound code down. It's fine if you want the checks to be on in a test run to make sure that what you intended to be a positive definite transformation is, in fact, positive definite. But it is bad to force such checks to be on always.

Ben

On Wed, Mar 25, 2015 at 8:21 PM, Bob Carpenter notifications@github.com wrote:

I agree about allowing declared ragged arrays anywhere. And I fully admit I don't have this on my to-do list for any time soon.

I strongly disagree about turning off size checks at the ends of blocks. And in general about favoring functionality over error checking.

Maybe we can discuss at the next Stan meeting?

On Mar 26, 2015, at 10:27 AM, bgoodri notifications@github.com wrote:

If we are going to support genuine ragged arrays like

matrix[rows,cols] X[elements];

where rows and cols are vectors, then I think we should allow that anywhere.

If it is just the original hack where a 0x0 matrix or 0x1 vector can get reassigned to any genuine size, then it definitely can't be data or parameters and if it can only be model or generated quantities, then that would still be useful.

More generally, I am not a fan of limiting language features that lots of people would find useful in order to catch bugs in user code. If we can catch bugs like that essentially costlessly, then okay. But, such bugs will probably manifest themselves elsewhere and if not, too bad for that particular user, as opposed to bad for all users. So, in this case, are you saying that

transformed parameters { cov_matrix[K,K] X; X <- crossprod(Y); }

currently checks at the end of the block that X is, in fact, KxK. If so, I agree it would be hard to check if X were declared 0x0 and resized, but I would favor dropping that check. But we could still check that it is positive definite without regard to the size, right?

On Wed, Mar 25, 2015 at 7:05 PM, Bob Carpenter <notifications@github.com

wrote:

This project has to be a compromise. I don't like this proposal because it messes up the clean conception of sizing that we have everywhere. But I agree that we've been wanting to do this forever. If I could ever stop treading water, I could get lots of these features in. Maybe I'll try not responding to the mailing list again and see if anyone picks up the slack. Daniel's been volunteering.

Don't forget doc and testing. It's may be 4 lines of code to change the assign function, but then there's testing and mods to the doc everywhere we talk about assignment.

By the way, is this only for matrices, or also for vectors and row vectors?

I think it'll actually require a fair amount of doc and testing. Can we do this reassignment in the transformed data block or just with local variables? If we can do it in the former, then what happens to the validation tests for sizes that run at the end of the blocks --- those will also have to change to match with tests to go along? I don't think we can do it in transformed parameters, because that could change the number of parameters on a per-iteration basis. So then all the discussion of matrix assignment in the doc needs to be qualified.

Maybe we can come up with an intermediate solution, such as having a matrix declared without any size restrictions, as in

matrix A;

And then only use this as a local variable. Or maybe in transformed data.

Then when using that on the LHS of an array, we'd not check sizes. That'd require a lot more code and it'd change explicitly the sort of implicit change that Ben's arguing for. That is, we'd now have a contrast between sized and unsized types, allowing assignment and presumably reassignment to unsized types. That's even more code and doc, but I think it's a much more coherent solution than modifying the behavior of size-0 matrices.

Bob

On Mar 26, 2015, at 8:59 AM, bgoodri notifications@github.com wrote:

If we implement true ragged arrays, that would be better because it would work in data and parameters as well. But we've been talking about doing that since 2011. And only Bob can do something like that with the language. Whereas it takes 4 lines of already-written code to assign to a 0x0 matrix to overcome a restriction we imposed on the language for a conference paper.

On Wed, Mar 25, 2015 at 3:13 PM, Daniel Lee < notifications@github.com> wrote:

I think Bob's suggestion is what makes the most sense. I'm guessing will take a long time to figure out, so, let's talk about what we can do in the short term.

@bgoodri https://github.com/bgoodri, I still think allowing matrix[0,0] to mean something other than a zero-row, zero-column matrix is a bit dangerous and could lead to model building errors that aren't clear. I also think having a naked matrix call is also just asking for trouble, but

it's better in my book.

I know, I didn't come up with a decent counter-example, but perhaps something like:

data { int N; int M; matrix[N,N] X1; matrix[M,M] X2; } transformed data { matrix[M,M] L_X1; matrix[M,M] L_X2; L_X1 <- cholesky_decompose(X1); L_X2 <- cholesky_decompose(X2); } ...

That'll work if M is 0. Seems odd and error-prone to me.

— Reply to this email directly or view it on GitHub < https://github.com/stan-dev/stan/issues/1369#issuecomment-86178384>.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86248040.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1369#issuecomment-86261497.

alashworth commented 5 years ago

Comment by bob-carpenter Thursday Mar 26, 2015 at 02:21 GMT

On Mar 26, 2015, at 12:57 PM, bgoodri notifications@github.com wrote:

If there were some way to turn checks off instead of taking them out of the language completely, that would be okay but we haven't implemented that in years either.

I'm OK with being able to disable all checks if someone wants to do it. Part of the refactoring Daniel and Rob undertook was supposed to address this or at least make it easier.

To take it somewhat off track, for huge K, something like

transformed parameters { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); }

is infeasible and may very well fail the positive definiteness check for numerical reasons. So, the "best" way to work around this in Stan is to duplicate code in the generated quantities block

model { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); ... }

You can't declare a corr_matrix in the model, but you could do this with matrix[K]. Local variables don't get checked for constraints, which is why you can't define constraints on them.

So if you define Sigma as a matrix[K,K] in the transformed parameters block, it won't do the checks either.

generated quantities { corr_matrix[K,K] Sigma; Sigma <- multiply_lower_tri_self_transpose(L); }

And if the "best" thing to do is to do something that is inconsistent with one of the central tenets of programming, then the design has a problem.

Agreed. This seems like a problem with our enforcing positive definiteness.

This problem could be solved by disabling the positive definiteness check and the matrix resizing issue could be solved by disabling the size check.

Where would it be OK to use a correlation matrix that doesn't pass positive-definiteness checks? Shouldn't it then just be declared as a matrix?

But if checks like these can't be disabled, then we are punishing people who write correct code and who want to take advantage of unimplemented features in order to catch mistakes of people who write incorrect code.

What I'm missing is how the code is correct if they're producing non-positive definite matrices and assigning them to correlation matrix variables.

Is the problem just that we need a different test?

If someone defines a corr_matrix or cov_matrix in the transformed parameters block that is not positive definite by construction, then Stan is drawing from an invalid posterior distribution even if the lack of positive definiteness is caught because the priors on the parameters do not integrate to 1 when the positive definiteness check enforces a region of zero prior density. And there is nothing that we can do about the fact that it is an unsound posterior and yet we do validity checks on the objects that slow sound code down. It's fine if you want the checks to be on in a test run to make sure that what you intended to be a positive definite transformation is, in fact, positive definite. But it is bad to force such checks to be on always.

I'm confused, because this seems to be arguing that we should enforce the checks to prevent drawing from an invalid posterior.

We could go further and just terminate the run if we get a "divergent" iteration in Michael sense and thus force users to step down the step sizes.

alashworth commented 5 years ago

Comment by syclik Thursday Mar 26, 2015 at 02:55 GMT

Ben, if you want us to turn off all checks, that isn't the hardest thing to plumb all the way through. The issue is that I think you underestimate how much Eigen and Boost likes to swallow errors. For example, the llt transform will take nans, infs, pretty much anything, and give back an "llt transform" that has no indicators that it's broken. I don't think they doc it as a precondition, but the transforms are obviously taking in bad matrices and then pushing out things that everything else thinks is good.

If you're ok with that, we can go ahead and find a way to disable all checks. It just makes me really, really nervous when we know that something, at runtime, will mask something bad. If you really want it in, create a new issue, give us some hints as to what you want. All checks off? A subset of checks off? matrix checks off? bounds? hopefully you get the point.

What I'm missing is how the code is correct if they're producing non-positive definite matrices and assigning them to correlation matrix variables.

Is the problem just that we need a different test?

I have the same question: what should we do?

alashworth commented 5 years ago

Comment by bgoodri Thursday Mar 26, 2015 at 02:56 GMT

On Wed, Mar 25, 2015 at 10:21 PM, Bob Carpenter notifications@github.com wrote:

And if the "best" thing to do is to do something that is inconsistent with one of the central tenets of programming, then the design has a problem.

Agreed. This seems like a problem with our enforcing positive definiteness.

This problem could be solved by disabling the positive definiteness check and the matrix resizing issue could be solved by disabling the size check.

Where would it be OK to use a correlation matrix that doesn't pass positive-definiteness checks? Shouldn't it then just be declared as a matrix?

It is a dilemma that is self-inflicted by the user who writes transformations that are not consistent with the constraints by construction.

If someone passes an indefinite symmetric matrix to a function like multi_normal, then any number multi_normal() outputs is not sound (hopefully it would throw an exception or return NaN).
If someone passes only the symmetric matrices that pass the positive definiteness check to a function like multi_normal, then multi_normal won't throw an exception or return NaN but the draws are not sound. Basically, the user should be truncating the relevant priors by dividing them by the non-constant 1 - Pr(indefinite) but there is no way to do that truncation in Stan.

So, the only way to get sound posterior draws is to only use transformations that are positive definite with probability one. And if do that, then you don't want to check for positive definiteness at runtime in production code both because it is slower and also because you can get positive definite matrices that are numerically indefinite.

For positive definiteness, there are ways of circumventing the check by just declaring it as a plain square matrix. But the larger point I am trying to make is that requiring everything to be checked for validity hurts people who write valid code in order to prevent people who write invalid code from hurting themselves. In the original case of size checking matrices, there is virtually no runtime cost but the cost is that we can't resize them and we can't implement C++ or user-defined functions that return multiple matrices of different sizes. So Sebastian can't do things like

transformed parameters { matrix my_stuff[P]; my_stuff <- make_stuff(lots, of, arguments); // for a user-defined make_stuff }

and export that single make_stuff function to R to use in posterior predictive simulations. But it is really the principle that everyone has to suffer in order to catch someone else's typos that bothers me.

Ben