next manual, 2.8.0++

bob-carpenter commented 8 years ago

This is where updates for the manual for the next release go if they are not related to a pull request (new features, bug fixes, etc.)

bob-carpenter commented 8 years ago

[x] Add Krzysztof Sakrejda to list of developers (functions, Stan Math, Stan, C++)
[x] make it clear which repos/modules each developer is contributing to

bob-carpenter commented 8 years ago

Julian King points out:

p(y | lambda) = Poisson(y | lambda) / (1 - PoissonCDF(0 | lambda))
so log p = log_Poisson(y | lambda) - log(1 - PoissonCDF(0 | lambda))
 = log_Poisson(y | lambda) - log(1 - exp(-lambda))
 = log_Poisson(y | lambda) - log1m_exp(-lambda)

[x] Update zero-inflated/hurdle section (p. 109 now) to note a further reduction
[x] cite Julian King for pointing this out (and add to acks)
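
A minimal numeric sketch of the reduction above, in Python rather than Stan so it can be run standalone. The function names are made up for illustration and are not Stan's API; `log1m_exp` here is a local reimplementation of log(1 - exp(a)).

```python
import math


def log1m_exp(a):
    """Return log(1 - exp(a)) for a < 0 (local helper, not Stan's function)."""
    if a >= 0:
        raise ValueError("a must be negative")
    # Two algebraically equal forms; pick the one with less round-off.
    if a > -math.log(2):
        return math.log(-math.expm1(a))
    return math.log1p(-math.exp(a))


def poisson_lpmf(y, lam):
    """Log Poisson mass: y * log(lam) - lam - log(y!)."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def truncated_poisson_lpmf(y, lam):
    """Log mass of a Poisson truncated to y >= 1, using the reduced form."""
    if y < 1:
        return -math.inf
    return poisson_lpmf(y, lam) - log1m_exp(-lam)


# Sanity check: the truncated masses should sum to (nearly) 1 over y >= 1.
total = sum(math.exp(truncated_poisson_lpmf(y, 2.5)) for y in range(1, 200))
print(total)
```

The reduced form avoids evaluating the CDF at all, since PoissonCDF(0 | lambda) is just exp(-lambda).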

bob-carpenter commented 8 years ago

[x] thank Ashley Ford for reporting a bug on test-headers
[x] thank Evelyn Mitchell for fixing doc

bob-carpenter commented 8 years ago

[x] pushed off until next release: https://github.com/stan-dev/stan/issues/1709#issuecomment-161174808

From Joachim Vandekerckhove on stan-users:

Sorry I'm a little late to this party, I had a long summer of travel. The reason we originally implemented the lower-bound hits as negative reaction times was to have a cheap way of encoding what is essentially a bivariate distribution (strictly positive RT and binary response). Because both the RT and the outcome are random variables, the current implementation (with only one bound) isn't entirely satisfactory if at any point we want to generate random numbers from the distribution.

I always liked the negative-RT solution even though it's a little counter-intuitive. I'd probably prefer it in Stan since it's how it works in JAGS... but the more conventional solution would be to make it actually bivariate, and let it operate on a [RT response] pair, where the first is a strictly positive real and the second is a boolean.
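
As a rough illustration of the two encodings Joachim describes, the sketch below converts between a signed reaction time and an explicit (RT, response) pair. The function names are hypothetical, and the convention that a negative value marks a lower-bound hit is taken from the comment above, not from any Stan or JAGS API.

```python
def encode_signed_rt(rt, upper_bound_hit):
    """Pack a (positive RT, boolean response) pair into one signed number."""
    if rt <= 0:
        raise ValueError("reaction time must be strictly positive")
    # Lower-bound hits are stored as negative reaction times.
    return rt if upper_bound_hit else -rt


def decode_signed_rt(signed_rt):
    """Recover the (RT, upper_bound_hit) pair from the signed encoding."""
    if signed_rt == 0:
        raise ValueError("zero does not encode a valid reaction time")
    return abs(signed_rt), signed_rt > 0


print(decode_signed_rt(encode_signed_rt(0.42, False)))
```

The signed form is compact but overloads the sign of the observation; the pair form keeps the two random variables separate.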

bob-carpenter commented 8 years ago

[x] fix broken URL links in manual: http://mc-stan.org/examples.html
- [x] link to http://mc-stan.org/documentation instead

Howard Zail points out on stan-users:

Most of the hyperlinks in the Stan Manual seem to be broken. In particular, I am looking for the http://mc-stan.org/examples.html page. This link is also broken on the mc-stan website.

bob-carpenter commented 8 years ago

[x] create issue in proper repo: https://github.com/stan-dev/example-models/issues/35

Organize examples from manual on stan-dev/example-models better

This is really going to have to be done on that repo, I think, because they're no longer part of the manual directory itself.

bob-carpenter commented 8 years ago

José Rojas Echenique reports:

Section 1.6 (p25) and 1.8 (p26) are both titled Variational Inference.

[x] They should be merged. Or 1.6 should be removed (1.8 is more detailed).

Also,

[x] thank José in the acknowledgments

( moved here from #1629 )

bob-carpenter commented 8 years ago

Ashley Ford reported in another issue: https://github.com/stan-dev/stan/issues/1637

My understanding now is that the actual integration time is

stepsize * round(int_time / stepsize)

usually slightly less than int_time

[x] add above to doc on algorithm parameters for HMC
[x] thank Ashley Ford
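
The formula above can be sketched in Python as follows. The function name is hypothetical, and the tie-breaking rule (round half up) is an assumption, since the comment only says "round"; this is not a copy of Stan's implementation.

```python
import math


def effective_integration_time(stepsize, int_time):
    """Return stepsize * round(int_time / stepsize).

    Assumption: ties round up; the original comment does not say which
    rounding rule applies.
    """
    n_steps = math.floor(int_time / stepsize + 0.5)
    return stepsize * n_steps


# The requested int_time of 1.0 is not a multiple of the step size 0.3,
# so the time actually integrated differs from the request.
print(effective_integration_time(0.3, 1.0))
```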

bob-carpenter commented 8 years ago

[x] merge the two tables in figure 24.2 with caption "The table shows the variable declaration types of Stan and their corresponding primitive..."

randommm commented 8 years ago

[x] Change my affiliation from "University of São Paulo" to "University of São Paulo/UFSCar"

maciekjswat commented 8 years ago

There is a typo in manual 2.8.0, section '48.6. Multivariate Student-t Distribution', in the PDF formula. It is: \Gamma \chi ((... It should be: \Gamma ((...

[x] fix

bob-carpenter commented 8 years ago

[x] thank Miguel de Val-Borro for a doc patch

bob-carpenter commented 8 years ago

[x] thank Bruno Jacobs for doc patch

bob-carpenter commented 8 years ago

From Andre Pfeuffer on stan-users:

Vectorized MA(Q) model in the manual should be model { eta ~ normal(0, sigma); }.

Also see: https://github.com/stan-dev/stan/issues/485

[x] I'm marking this one as reviewed --- I don't see how it relates to the actual example, which looks OK to me.

bob-carpenter commented 8 years ago

[x] change links to mc-stan.org/examples to /documentation or just remove them altogether

bob-carpenter commented 8 years ago

[x] fix section 25.2 to say that it is not allowed to increment the log density in the transformed parameters block (this restriction may be relaxed in the future---see #889).

bob-carpenter commented 8 years ago

[x] update ADVI notation to match rest of manual
- [x] normal to match rest of manual in parameterization and naming for density
- [x] remove semicolons in density notation and replace with vbar

bob-carpenter commented 8 years ago

[x] add section on sufficient stats to optimization chapter, mentioning that going from data to aggregated data will help in terms of vectorization
[x] thank John Hall for bringing this up on the users list
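
A small Python sketch of the sufficient-statistics idea, using Bernoulli data as an assumed example (the thread does not name a specific model). Function names and the sample data are made up for illustration.

```python
import math


def bernoulli_loglik(ys, theta):
    """Log likelihood evaluated one observation at a time."""
    return sum(math.log(theta) if y == 1 else math.log1p(-theta) for y in ys)


def aggregated_loglik(successes, trials, theta):
    """Same log likelihood from the sufficient statistics (count, total).

    The binomial coefficient is omitted because it is constant in theta.
    """
    return successes * math.log(theta) + (trials - successes) * math.log1p(-theta)


ys = [1, 0, 0, 1, 1, 0, 1, 1]  # illustrative data only
print(bernoulli_loglik(ys, 0.3), aggregated_loglik(sum(ys), len(ys), 0.3))
```

Both functions agree for any theta in (0, 1), but the aggregated version does a fixed amount of work regardless of the number of observations.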

bob-carpenter commented 8 years ago

From Ryan Batt on issue #1691:

Should y[2,4] = 0 instead be y[2,4]=1?

Page 136 of Manual v2.8.0, the very end of the caption for Figure 12.1. It's about the database representation of a sparse matrix.

[x] check and fix if necessary
[x] thank Ryan in the acknowledgements

bob-carpenter commented 8 years ago

From Ryan Batt on issue #1692:

At the top of page 138 of manual 2.8.0, the 6th element of z should be 12.9 (not 129).

[x] fix this

Also, I just realized this is Figure 12.2.

[x] Furthermore, in the caption for the figure, "With this coding, In this particular ...", the letter " i " should not be capitalized in "In".

bob-carpenter commented 8 years ago

[x] fix csr_to_dense_matrix argument types in function spec; argument w should be matrix

rBatt commented 8 years ago

Page 224 of the manual, section 21 (Reproducibility). The word "on" is repeated twice in the following:

It doesn’t matter if you use a stable release version of Stan or the version with a particular Git hash tag. The same goes for all of the interfaces, compilers, and so on on.

This is the first paragraph after the list.

[x] Delete duplicate "on"

botanize commented 8 years ago

Page 84:

The data is declared in the same way as the other time-series regressions. Here the are parameters for the mean output mu and error scale sigma, as well as regression coefficients phi for the autoregression and theta for the moving average component of the model.

It's unclear what the purpose of this paragraph is; it seems to point out the obvious. The second sentence is particularly unclear and at a minimum suffers from a typo ("the are").

[x] fix it

jonathan-g commented 8 years ago

p. 37, definition of cholesky_factor_corr is unclear: "length of each row is 1." Length so commonly refers to the number of elements that maybe it would be good, for clarity, to say instead that each row is a unit vector.

[x] add clarification
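
To make the "unit vector" reading concrete, here is a pure-Python sketch that factors a small correlation matrix and checks the Euclidean norm of each row of the factor. The `cholesky` helper and the example matrix are illustrative assumptions, not part of Stan.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


# Illustrative correlation matrix (unit diagonal, positive definite).
corr = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.3],
        [0.2, 0.3, 1.0]]

# Every row has 3 elements, yet every row has Euclidean norm 1.
row_norms = [math.sqrt(sum(x * x for x in row)) for row in cholesky(corr)]
print(row_norms)
```

Each squared row norm equals the matching diagonal entry of the correlation matrix, which is 1 by definition.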

rBatt commented 8 years ago

Section 5.8 "Hierarchical Logistic Regression", last line on page 56:

... an approach would no pooling assigns each level l its own coefficient ...

"would" should be changed to "with", I believe.

[x] change "would" to "with"

jonathan-g commented 8 years ago

In the ARMA(1,1) models (p. 84): MA and ARMA models are not identifiable if the roots of the characteristic polynomial for the MA part lie inside the unit circle, so it's necessary to add the constraint

real<lower = -1, upper = 1> theta;

When I run the model as it appears in the manual, without the constraint, using synthetic data from arma.sim, the simulation can sometimes find modes for (theta,phi) outside the [-1,1] interval, which creates a multiple mode problem in the posterior and also causes the NUTS treedepth to get very large (often > 10). Adding the constraint both improves the accuracy of the posterior and dramatically reduces the treedepth, which speeds up the simulation considerably (typically by much more than an order of magnitude).

Further, unless one thinks that the process is really non-stationary, it's worth adding an additional constraint

real<lower = -1, upper = 1> phi;

to ensure causality (stationarity).

[x] added as a new section with attribution to Jonathan Gilligan
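
A short Python sketch of why the unconstrained moving-average coefficient is not identifiable, using an MA(1) process as an assumed simplification of the ARMA(1,1) case discussed above. The function name is hypothetical.

```python
import math


def ma1_lag1_autocorrelation(theta):
    """Lag-1 autocorrelation of y[t] = mu + eps[t] + theta * eps[t-1]."""
    return theta / (1.0 + theta * theta)


# theta and 1 / theta imply the same autocorrelation, so data alone cannot
# tell them apart; restricting theta to (-1, 1) keeps one of the pair.
inside = ma1_lag1_autocorrelation(0.5)
outside = ma1_lag1_autocorrelation(2.0)
print(inside, outside, math.isclose(inside, outside))
```

With the error scale also free, the two parameterizations can match the variance as well, which is consistent with the multiple modes reported above.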

bob-carpenter commented 8 years ago

David Manheim on stan-users suggests:

[x] add warning to optimization section up front indicating model misspecification may be a culprit
[x] thank David

brendan-r commented 8 years ago

Very minor:

The parameter for segment on a row_vector is v in the specification, and rv in the description.

bob-carpenter commented 8 years ago

@brendan-R: thanks, I'm moving this to 2.9.0++ since 2.9.0 already got tagged.

stan-dev / stan

next manual, 2.8.0++ #1617