next manual, 2.3.0 - Githubissues

bob-carpenter commented 10 years ago

Updates for next manual go here.

bob-carpenter commented 10 years ago

[x] Acknowledge Avraham Adler for doc updates.

randommm commented 10 years ago

In the manual we have (page 275 current develop branch):

int[] dims(T x)
Returns an integer array containing the dimensions of x; the type of the argument T can be any Stan type with up to 8 array dimensions.

int size(T[] x)
Returns the number of elements in the array x; the type of the array T can be anything type.

yet, in function_signatures.h we have:

// size() is polymorphic over arrays, so start i at 1
for (size_t i = 1; i < 8; ++i) {
  add("size",INT_T,expr_type(INT_T,i));
  add("size",INT_T,expr_type(DOUBLE_T,i));
  add("size",INT_T,expr_type(VECTOR_T,i));
  add("size",INT_T,expr_type(ROW_VECTOR_T,i));
  add("size",INT_T,expr_type(MATRIX_T,i));
}

// dims() is polymorphic by size
for (size_t i = 0; i < 8; ++i) {
  add("dims",expr_type(INT_T,1),expr_type(INT_T,i));
  add("dims",expr_type(INT_T,1),expr_type(DOUBLE_T,i));
  add("dims",expr_type(INT_T,1),expr_type(VECTOR_T,i));
  add("dims",expr_type(INT_T,1),expr_type(ROW_VECTOR_T,i));
  add("dims",expr_type(INT_T,1),expr_type(MATRIX_T,i));
}

So, the args are equal, with only the return type being different.

Do you mind if I open a branch for this issue to correct this to:

int[] dims(T[] x)
Returns an integer array containing the dimensions of x; the type of the argument T can be any Stan type with up to 8 array dimensions.

int size(T[] x)
Returns the number of elements in the array x; the type of the argument T can be any Stan type with up to 8 array dimensions.

bob-carpenter commented 10 years ago

Sounds good.

But rather than

“the type of the argument T”

could you say

“the type of the array elements T up to 7 dimensions”

The [] takes care of the first dimension.

Bob


randommm commented 10 years ago

How about this, kind of following R syntax for multiple args:

\begin{description}
\fitem{int[]}{dims}{\farg{T}[...] \farg{x}}{Returns an integer array containing the dimensions of \farg{x}; the type of the argument \farg{T} can be any Stan type with up to 8 array dimensions.}
%
\fitem{int}{size}{\farg{T}[...] \farg{x}}{Returns the number of elements in the array \farg{x}; the type of the argument \farg{T} can be any Stan type with up to 8 array dimensions.}
\end{description}

bob-carpenter commented 10 years ago

Neither of those is quite right. The two functions take very different kinds of arguments and this makes them look the same. Note the different loop initial values in function_signatures.h.

dims() takes any type of argument, not just arrays, whereas writing [...] suggests that it has to be an array.

size() takes only arrays, and returns their size.

So how about

int size(T[] x)
Return the number of elements in the array x.  x can be at most an 8-dimensional array.

and

int[] dims(T x)
Return the array of dimensions of the argument x in the order they are used for indexing.  
The argument can be any Stan type, with up to 8 array dimensions.

I'm just afraid the qualifications are going to be confusing. If users have more than 8 dimensions, they have bigger problems than size() and dims() not working.

I also just looked at the implementation and see that it's wrong in cases where arrays are of size zero. This is going to require some metaprogramming to crawl our way out of.

bob-carpenter commented 10 years ago

Actually, there's no way to do that calculation, even with metaprogramming. So, to state precisely what does happen: dims() terminates the list of dimensions at the first size-0 dimension. So a variable x declared as

real x[2,0,3];

will have dimensions {2, 0} as a return value. So the definition is even more involved.

int[] dims(T x)
Return the array of dimensions of the argument x in the order they are used for indexing.  
The argument can be any Stan type, with up to 8 array dimensions.  If any dimension of `x` is size 0, that will be the last dimension reported.

And now I think examples of declarations and what they return are called for, in the form of a table.

\begin{tabular}{lcc}
{\it Declaration}    & \code{size()} & \code{dims()}
\\
\code{int x} &  n/a & $( \, )$
\\
\code{real x} & n/a & $( \, )$
\\
\code{int x[3]} & 3 & $( 3 )$
\\
\code{real x[4,5]} & 4 & $(4, 5)$
\\
\code{vector[2] x} & n/a & $( 2 )$
\\
\code{row_vector[8] x[6,7]} & 6 & $(6, 7, 8)$
\\
\code{cov_matrix[7] x} & n/a & $(7, 7)$
\\
\code{matrix[11,12] x[9,10]} & 9 & $(9, 10, 11, 12)$
\end{tabular}

Feel free to add more examples if you don't think this is clear enough! If there is a version of dims() in the matrix section, it should be cross-referenced both ways to the examples.

bob-carpenter commented 10 years ago

I forgot to add the critical examples!

\code{int x[3,0]} & 3 & $(3,0)$
\\
\code{int x[0,3]} & 0 & $(0)$
\\
\code{vector[3] y[4,0,2]} & 4 & $(4,0)$

Let's edit this all down to just the recommendations in the end so I don't get lost making the next version of the manual.
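To make the size-0 rule concrete, here is a minimal C++ sketch of a recursive dimension walk over nested std::vector values. It is not the Stan implementation: collect_dims and this dims are hypothetical helpers, and nested std::vector only stands in for Stan arrays. The walk can look inside an array only through its first element, so it has to stop at the first empty array, which reproduces the {2, 0} result for real x[2,0,3].

```cpp
#include <vector>

// Base case: a scalar contributes no dimensions.
template <typename T>
void collect_dims(const T& /*x*/, std::vector<int>& /*out*/) {}

// Recursive case: record this array dimension, then descend into the first
// element if there is one. An empty array has no element to inspect, so the
// walk stops there.
template <typename T>
void collect_dims(const std::vector<T>& x, std::vector<int>& out) {
  out.push_back(static_cast<int>(x.size()));
  if (!x.empty()) {
    collect_dims(x[0], out);
  }
}

// Hypothetical stand-in for dims(): sizes in indexing order, cut off after
// the first size-0 dimension.
template <typename T>
std::vector<int> dims(const T& x) {
  std::vector<int> out;
  collect_dims(x, out);
  return out;
}
```

For a three-level nested vector built with 2 outer elements, each of them empty, dims returns {2, 0}: the trailing 3 cannot be recovered because no inner element exists to inspect.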

randommm commented 10 years ago

Wow, that was way more complex in the end. Feel free to delete my posts here (including this one) to debloat the screen.

bob-carpenter commented 10 years ago

[x] clarify that multiply_lower_tri_self_transpose applies to non-square matrices

betanalpha commented 10 years ago

[x] Remove unnecessary backslashes from Bash example of running parallel chains near the beginning of "Running a Stan Program"
[x] add note about putting it in a script rather than just entering on command line to avoid these issues

bob-carpenter commented 10 years ago

Linas Mockus on stan-users suggests:

[x] fix the loop bounds

After trying the neural network kernel, I figured out that there is a minor typo in the code:

for (i in 1:(N-1)) {
  for (j in i:N) {

should be replaced with:

for (i in 1:(N-1)) {
  for (j in (i+1):N) {
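A small C++ sketch of what the two bounds visit, assuming the intent of the loop is to touch each off-diagonal pair exactly once. The function upper_pairs and its include_diagonal flag are hypothetical, introduced only to compare the two versions with the same 1-based bounds as the Stan loops.

```cpp
#include <utility>
#include <vector>

// Enumerate the (i, j) pairs visited by the nested loops, using the same
// 1-based inclusive bounds as the Stan code. With include_diagonal = true
// the inner loop starts at i (the original code); with false it starts at
// i + 1 (the suggested fix).
std::vector<std::pair<int, int>> upper_pairs(int N, bool include_diagonal) {
  std::vector<std::pair<int, int>> pairs;
  for (int i = 1; i <= N - 1; ++i) {
    for (int j = (include_diagonal ? i : i + 1); j <= N; ++j) {
      pairs.emplace_back(i, j);
    }
  }
  return pairs;
}
```

For N = 4 the corrected bounds visit the 6 pairs with i < j, while the original bounds also visit (1,1), (2,2), and (3,3), for 9 in total.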

bob-carpenter commented 10 years ago

You need to add c:\Rtools\bin to your PATH environment variable and then open a new command window.

Ben says: We need to emphasize this in the installation instructions for Windows. Rtools has an option at the end to edit the path, but it is unchecked by default and requires administrative privileges to actually do it.

[x] emphasize this in a section with an example of how to test it's installed properly

[There's a pull request to deal with this now.]

bob-carpenter commented 10 years ago

Aki Vehtari pointed out a bug in our doc for epsilon(). The doc should follow numeric_limits<double>::epsilon() in C++, namely,

Machine epsilon (the difference between 1 and the least value greater than 1 that is representable).

[x] move this to issue #650 because the function's no longer in function_signatures.h so it's not just a doc issue
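For reference, the C++ definition quoted above can be checked directly with the standard library. This is only a sketch of the C++ side; the helper names machine_epsilon and least_value_above_one are made up for illustration and say nothing about how Stan exposes the value.

```cpp
#include <cmath>
#include <limits>

// Machine epsilon as defined by the C++ standard library.
double machine_epsilon() {
  return std::numeric_limits<double>::epsilon();
}

// The least representable double greater than 1.
double least_value_above_one() {
  return std::nextafter(1.0, 2.0);
}
```

The difference least_value_above_one() - 1.0 equals machine_epsilon() exactly, which is the wording the doc should follow.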

bob-carpenter commented 10 years ago

Thanks. We have very observant readers! You're the second person to point that out and I believe it's already been fixed on the new manual branch (which is why it's not on the to-do list).

Bob

On Mar 1, 2014, at 4:21 PM, HerraHuu notifications@github.com wrote:

Small typo on p. 345:

y ~ lkj_corr_log(eta); Increment log probability with lkj_corr_log_log(y,eta), dropping constant additive terms; Section 24.3 explains sampling statements.

Should be just lkj_corr(eta) and later on lkj_corr_log(y,eta)?


bob-carpenter commented 10 years ago

[x] Explain clearly in the manual that for beta and Dirichlet distributions, boundary variates won't work due to the form of the distribution. Specifically, no 1 ~ beta(a,b) or 0 ~ beta(a,b) or any v ~ dirichlet(alpha) with v having a 0 component.
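One way to see the problem is to evaluate the beta log density naively from its formula. The C++ sketch below does that; naive_beta_log is a hypothetical helper written for this note and is not how Stan computes the density, so treat the exact failure values as an illustration of the formula rather than of Stan's behavior.

```cpp
#include <cmath>

// Naive log Beta(y | a, b):
//   (a - 1) * log(y) + (b - 1) * log(1 - y) - log B(a, b)
// At y = 0 or y = 1 one of the log terms is -infinity, so the result is
// -infinity, or NaN when the matching exponent (a - 1 or b - 1) is zero.
double naive_beta_log(double y, double a, double b) {
  const double log_norm = std::lgamma(a) + std::lgamma(b) - std::lgamma(a + b);
  return (a - 1.0) * std::log(y) + (b - 1.0) * std::log1p(-y) - log_norm;
}
```

With a = b = 2, an interior value such as y = 0.5 gives log(1.5), while y = 0 and y = 1 both give negative infinity; with a = 1, y = 0 gives NaN from 0 times negative infinity.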

bob-carpenter commented 10 years ago

From Guido Biele on stan-users:

[x] Change size of vector beta from N to K in vectorization example
[x] thank Guido in acknowledgments

Example from page 80 of v2.2.0:

data {
 int<lower=0> N; // number of data items
 int<lower=0> K; // number of predictors
 matrix[N,K] x; // predictor matrix
 vector[N] y; // outcome vector
}
parameters {
 real alpha; // intercept
 vector[N] beta; // coefficients for predictors
 real<lower=0,upper=10> sigma; // error scale
}
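The reason beta needs size K rather than N is that the predictor matrix x is N by K, so the product x * beta needs one coefficient per column. A minimal C++ sketch of that shape requirement follows; linear_predictor and the Matrix alias are hypothetical names, and nested std::vector stands in for Stan's matrix type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // N rows, K columns

// Compute alpha + x * beta row by row. Each row of x has K entries, so beta
// must also have K entries; a length-N beta is rejected unless N == K.
std::vector<double> linear_predictor(const Matrix& x,
                                     const std::vector<double>& beta,
                                     double alpha) {
  std::vector<double> mu;
  mu.reserve(x.size());
  for (const auto& row : x) {
    if (row.size() != beta.size()) {
      throw std::invalid_argument("beta needs one entry per predictor (K)");
    }
    double sum = alpha;
    for (std::size_t k = 0; k < row.size(); ++k) {
      sum += row[k] * beta[k];
    }
    mu.push_back(sum);
  }
  return mu;
}
```

With N = 3 rows and K = 2 columns, a beta of length 2 yields 3 predicted values, while a beta of length 3 throws.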

bob-carpenter commented 10 years ago

[x] update comment on Omega to read "correlation" not "covariance" on page 95

bob-carpenter commented 10 years ago

[x] remove references to floor and ceiling used as indices, because they only return reals

bob-carpenter commented 10 years ago

From Andrew:

I was searching for the list of all the data types (to find out how to declare a covariance matrix) and I found it in section 22.4. I wonder if this is important enough to put it earlier, in a chapter right after the current chapter 8 and right before the current chapter 9?

[x] add discussion of list of data types into programming section

bob-carpenter commented 10 years ago

From Andrew:

In the manual there is a distinction between "positive continuous distributions" vs. "non-negative continuous distributions." But in Stan there is no difference, right? If we have a parameter defined on [0,infinity), we first take the log so it never gets to 0 anyway? Or would it make sense in that scenario to define the distribution symmetrically to be positive or negative, so that the sampler moves smoothly past 0?

Not quite; the Rayleigh really does allow 0 values. I want to distinguish these as we're doing.

[?] clarify this going forward

bob-carpenter commented 10 years ago

[x] update models for bernoulli_logit(alpha) vs. bernoulli(inv_logit(alpha))
[x] replace loops with vectorization

bob-carpenter commented 10 years ago

Also from Andrew:

[x] On p.341, we have “multi_norm_prec”. Is this a typo? I assume it should be “multi_normal_prec”?

HerraHuu commented 10 years ago

Small typos on p. 317:

y ~ multinomial(theta,N)
real multinomial_log(int[] y, vector theta, int N)

Should be:

y ~ multinomial(theta)
real multinomial_log(int[] y, vector theta)

[x] fix

bob-carpenter commented 10 years ago

From Andrew:

I was playing around with a simple linear regression with 5 data points and 3 predictors (so the posterior for the betas is t_2, which has long tails and infinite variance). After the default 2000 iterations, Stan didn't seem too far from convergence (R-hat was 1.0 for the betas, 1.1 for sigma, and 1.4 for lp__) but the estimates for the parameters (mean, se_mean, sd) were all over the map. At first I was freaked out, then I remembered that the posterior distribution has infinite variance so these aren't such great summaries!

The model is

data {
 int N;
 int K;
 vector[N] y;
 matrix[N,K] X;
}
parameters {
 vector[K] b;
 real<lower=0> sigma;
}
model {
 y ~ normal (X*b, sigma);
}

[ ] add section to chapter on problematic posteriors dealing with this case
[ ] in the manual when we introduce the default bin/print display, we add a sentence in a footnote explaining that the mean, se_mean, and sd can't be taken seriously when the posterior variance is infinite.
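For the footnote, the underlying fact is the variance of a Student-t distribution as a function of its degrees of freedom. The C++ sketch below encodes it; student_t_variance is a hypothetical helper, and reading the degrees of freedom in Andrew's example as N - K = 5 - 3 = 2 is an assumption that happens to match the t_2 he mentions.

```cpp
#include <limits>

// Variance of a standard Student-t distribution with nu degrees of freedom:
//   nu / (nu - 2)  for nu > 2,
//   infinite       for 1 < nu <= 2,
//   undefined      for nu <= 1 (returned as NaN).
double student_t_variance(double nu) {
  if (nu > 2.0) {
    return nu / (nu - 2.0);
  }
  if (nu > 1.0) {
    return std::numeric_limits<double>::infinity();
  }
  return std::numeric_limits<double>::quiet_NaN();
}
```

student_t_variance(2.0) is infinite, so the sd and se_mean columns have no finite target to converge to in this model, however long the chains run.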

bob-carpenter commented 10 years ago

[x] Fix arXiv citations to be in the usual N.M format rather than the current N(M) format

aadler commented 10 years ago

Let me take a stab at that; I think there are only four in the bib.

bob-carpenter commented 10 years ago

[x] Fix parameterization of bernoulli_logit to have the proper parameterization with inv_logit(alpha) being the chance of success, not exp(alpha).
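A quick numerical check of the intended parameterization, written as a C++ sketch. The functions below are illustrative stand-ins named after the Stan functions, not Stan's own code; the point is only that the logit-scale density with argument alpha agrees with the Bernoulli density with chance of success inv_logit(alpha).

```cpp
#include <cmath>

// Inverse logit: maps a log-odds value alpha to a probability in (0, 1).
double inv_logit(double alpha) {
  return 1.0 / (1.0 + std::exp(-alpha));
}

// log Bernoulli(y | theta), with theta the chance of success.
double bernoulli_log(int y, double theta) {
  return y == 1 ? std::log(theta) : std::log1p(-theta);
}

// log BernoulliLogit(y | alpha), computed directly on the log-odds scale:
//   y = 1: log(inv_logit(alpha))     = -log(1 + exp(-alpha))
//   y = 0: log(1 - inv_logit(alpha)) = -log(1 + exp(alpha))
double bernoulli_logit_log(int y, double alpha) {
  return y == 1 ? -std::log1p(std::exp(-alpha)) : -std::log1p(std::exp(alpha));
}
```

For any alpha, bernoulli_logit_log(y, alpha) and bernoulli_log(y, inv_logit(alpha)) agree up to rounding, and inv_logit(0) is 0.5.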

bob-carpenter commented 10 years ago

For all of the distributions

[ ] name the parameters (scale, shape, location, etc.)
[ ] add means and variances
[ ] add log scale formula
[ ] add log scale derivatives

bob-carpenter commented 10 years ago

Matt Wand sent us these comments via Andrew off list:

[x] On page 346, the expression for the Wishart density function is missing a minus sign in the exponent.
[x] On page 357, the power on the |W| may be incorrect. In Table A.1 on page 475 of the 1995 version of Gelman, Carlin, Stern & Rubin the exponent is -(nu + k + 1)/2.
[x] Thank Matt in the acknowledgements

betanalpha commented 10 years ago

[x] Bulk up the descriptions of the sampler parameters. For n_divergent,

"Stan uses a symplectic integrator to approximate the exact solution of the Hamiltonian dynamics and when the step size is too large relative to the curvature of the log posterior this approximation can diverge and threaten the validity of the sampler. n_divergent counts the number of iterations within a given sample that have diverged and any non-zero value suggests that the samples may be biased in which case the step size needs to be decreased. Note that, because sampling is immediately terminated once a divergence is encountered, n_divergent should be only 0 or 1. "

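To illustrate the step-size effect described in that text, here is a C++ sketch of a leapfrog (symplectic) integrator on a one-dimensional standard normal target, where H(q, p) = q^2/2 + p^2/2. The target and the helper names are assumptions made for the example, and tracking the largest energy error is a stand-in for, not a description of, whatever divergence test Stan applies.

```cpp
#include <algorithm>
#include <cmath>

// Hamiltonian for a standard normal target with unit mass.
double energy(double q, double p) {
  return 0.5 * (q * q + p * p);
}

// Run num_steps leapfrog steps from (q, p) = (1, 0) and return the largest
// absolute energy error seen along the trajectory.
double max_energy_error(double step_size, int num_steps) {
  double q = 1.0;
  double p = 0.0;
  const double initial_energy = energy(q, p);
  double worst = 0.0;
  for (int n = 0; n < num_steps; ++n) {
    p -= 0.5 * step_size * q;  // half step for momentum (gradient of q^2/2 is q)
    q += step_size * p;        // full step for position
    p -= 0.5 * step_size * q;  // second half step for momentum
    worst = std::max(worst, std::fabs(energy(q, p) - initial_energy));
  }
  return worst;
}
```

On this target the integrator is stable only for step sizes below 2: with a step size of 0.1 the energy error stays tiny over 50 steps, while with 2.5 it grows without bound, which is the behavior n_divergent is meant to flag.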
bob-carpenter commented 10 years ago

In the description of arguments, id is too vague now in the Stan manual (soon to be CmdStan manual). We need to describe how it advances the PRNG more explicitly. There's a bit of discussion up front, but none on the description of the actual argument.

[x] clarify use of id to advance PRNG in Stan manual as well as in new CmdStan manual

bob-carpenter commented 10 years ago

[x] note that && doesn't short circuit
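For readers coming from C++, the difference can be shown with a small sketch. C++'s built-in && skips its right operand when the left one is false, whereas an ordinary two-argument function evaluates both arguments before it runs. Probe and eager_and are hypothetical names; the sketch illustrates what "doesn't short circuit" means and makes no claim about how Stan implements the operator.

```cpp
// Counts how many times an operand is evaluated.
struct Probe {
  int calls = 0;
  bool eval(bool value) {
    ++calls;
    return value;
  }
};

// An ordinary function: both arguments are evaluated before the call.
bool eager_and(bool a, bool b) {
  return a && b;
}

// Built-in &&: the right operand is skipped because the left is false.
int evaluations_with_short_circuit() {
  Probe probe;
  bool result = probe.eval(false) && probe.eval(true);
  (void)result;
  return probe.calls;
}

// Function-style and: both operands are evaluated even though the first
// is false.
int evaluations_with_eager_and() {
  Probe probe;
  bool result = eager_and(probe.eval(false), probe.eval(true));
  (void)result;
  return probe.calls;
}
```

The practical consequence for the manual note is that a guard such as a bounds check on the left of && does not protect the expression on the right when both sides are always evaluated.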

mitzimorris commented 10 years ago

[x] In the Regression section, discuss whether or not to model the intercept independently from coefficient vector of betas.

bob-carpenter commented 10 years ago

[x] remove CmdStan doc
[x] cross-ref doc for PyStan, RStan, CmdStan

bob-carpenter commented 10 years ago

[x] noted that von Mises only normalizes if restricted to interval of 2*pi and that the interval should be chosen so the posterior is unimodal

bob-carpenter commented 10 years ago

[x] added Marco Inacio to list of dev team members in preface

bob-carpenter commented 10 years ago

[x] added note that copyright is owned separately by each contributor or their assignee

bob-carpenter commented 10 years ago

[x] remove "experimental" warning on optimizers

stan-dev / stan

next manual, 2.3.0 #567