Closed bob-carpenter closed 7 years ago
HMM Viterbi example problem in examples.tex.
Originally reported in https://github.com/stan-dev/example-models/issues/31#issuecomment-309627744
programming.tex, line 185 of the reference manual v2.16.0: I believe it only makes sense if "kappa theta" becomes "kappa phi" instead, especially since alpha is a k-sized vector.
In the GP section, the sentence "In this case, an inverse gamma, inv_gamma_lpdf in Stan’s language, will work well as it has a sharp left tail that puts negligible mass on length-scales, but a generous right tail, allowing for large length-scales."
needs an adjective to describe length-scales in the first clause. Perhaps: "In this case, an inverse gamma, inv_gamma_lpdf in Stan’s language, will work well as it has a sharp left tail that puts negligible mass on infinitesimal length-scales, but a generous right tail, allowing for large length-scales."
OK, I'll fix that. I'll mention that gamma, inverse gamma and lognormal are all "zero avoiding" in the sense that the limit of the density at zero is zero and estimates near zero will be pushed away. Then I'll cross-reference the discussion in the regression chapter, which cites Andrew's and Vince's papers.
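The "zero avoiding" property can be read off the densities directly. The following is a sketch using the usual parameterizations (shape $\alpha$ and rate or scale $\beta$ for the gamma and inverse gamma, $\mu$ and $\sigma$ for the lognormal); note that for the gamma the limit is zero only when the shape exceeds one:

```latex
\begin{align*}
\text{InvGamma}(x \mid \alpha, \beta) &\propto x^{-(\alpha + 1)} e^{-\beta / x}
  \to 0 \quad \text{as } x \to 0^+, \\
\text{Gamma}(x \mid \alpha, \beta) &\propto x^{\alpha - 1} e^{-\beta x}
  \to 0 \quad \text{as } x \to 0^+ \text{ only if } \alpha > 1, \\
\text{LogNormal}(x \mid \mu, \sigma) &\propto x^{-1}
  \exp\!\left(-\frac{(\log x - \mu)^2}{2 \sigma^2}\right)
  \to 0 \quad \text{as } x \to 0^+.
\end{align*}
```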
Document int_step(), saying that it differs in behavior at 0 from step().
In the GP section on pages 247 and 255, the example code multiplies by 1/2. It seems like this would just round to 0 given integer division. Everywhere else in the manual uses multiplication by 0.5 instead.
Also in the GP section, I think that rho ~ gamma(4,4) should be rho ~ inv_gamma(4,4). The text refers to the benefits of the inverse gamma distribution, so the example code should use that as well.
I think there is a similar issue with the description of the generalized inverse gaussian. The manual says that the GIG has a Gaussian right tail, but actually it has an inverse Gaussian right tail.
Thanks, @aaronjg. If (1/2) is a subexpression, that will evaluate to 0; we just follow C++ evaluation because we literally translate it to the same expression, 1 / 2 in C++.
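A minimal sketch of the difference, assuming nothing beyond the integer-division behavior described above (the variable names are made up for illustration):

```stan
transformed data {
  real a = 1 / 2;    // both operands are int, so integer division gives 0
  real b = 0.5;      // real literal, as used elsewhere in the manual
  real c = 1.0 / 2;  // one real operand forces real division, giving 0.5
  print("a = ", a, ", b = ", b, ", c = ", c);
}
```

Writing 0.5 directly avoids relying on the reader to notice which operands are integers.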
I like the idea of defining the Bayesian posterior for R2, defined by @bgoodri in a response on StackOverflow: https://stackoverflow.com/questions/44759319/overall-predictive-power-e-g-r2-for-bayesian-linear-mixed-models
@bob-carpenter Thanks, I just submitted a pull request for the 1/2 issue. I didn't change the other inv_gamma/gamma thing because I'm not sure if the prose or the model formulation is correct (or if I'm just missing something here).
@aaronjg Thanks. If you want to just leave the comments, I make a pass every release to fix all the ones noted (or explain why they can't be fixed or won't be fixed until later). Our pull requests are pretty heavy with testing and review for small changes these days.
From @stemangiola on stan-dev/stan#2315 about p. 193 of 2.16 manual:
In mixture models
real log_theta[K] = log(theta); // cache log calculation
should be replaced by
vector[3] log_theta = log(theta); // cache log calculation
Otherwise gives error.
Turns out there's more to clean up. @lukasvermeer pointed out more issues at https://github.com/stan-dev/stan/issues/2315#issuecomment-312090660
ordered mu[K]; has K in the wrong place. @lukasvermeer suggested:
data {
  int<lower=1> K;          // number of mixture components
  int<lower=1> N;          // number of data points
  real y[N];               // observations
}
parameters {
  simplex[K] theta;        // mixing proportions
  ordered[K] mu;           // locations of mixture components
  vector<lower=0>[K] sigma;  // scales of mixture components
}
model {
  vector[K] log_theta = log(theta);  // cache log calculation
  sigma ~ lognormal(0, 2);
  mu ~ normal(0, 10);
  for (n in 1:N) {
    vector[K] lps = log_theta;
    for (k in 1:K) {
      lps[k] = lps[k] + normal_lpdf(y[n] | mu[k], sigma[k]);
    }
    target += log_sum_exp(lps);
  }
}
In section 4.1, it would help to add another real literal example indicating that scientific notation with a "+" is valid. Specifically, could an example like "1.23e+3" be added?
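For illustration, a small sketch of real literals in scientific notation, including the signed-exponent form asked about here (variable names are arbitrary):

```stan
transformed data {
  real a = 1.23e+3;  // explicit plus sign in the exponent
  real b = 1.23e3;   // same value with the sign omitted
  real c = 1.23e-3;  // negative exponent
  real d = 1.23E+3;  // upper-case E is also accepted
}
```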
1.23e+3 example literal
In section 24.1 version 2.16, a couple of lines in the softmax_id function have some typos:
alpha[num_elements(alphac)] = 0;
return softmax(alphac);
should be
alphac1[num_elements(alphac1)] = 0;
return softmax(alphac1);
Section 38, "Void Functions" - The main text references two functions, but only one is discussed later on. It looks like the section for increment_log_prob was removed, but the overview was not updated. I think 'reject' should also be in this section.
There are a few references to 'google groups' that should be updated to reflect the move to discourse.
Sec. 26.5 Matrices Parameters and Constants - it looks like there is a typo and 'idx[7,' should be 'idxs[7]'
idxs[7, 2]
new section on GPs has footnote referencing URL "mc-stan.org/documentation" which is 404.
also, don't understand first example in GP section - explain logic for assignments to row N of covariance matrix?
furthermore, footnote mentions that program implementing the marginal likelihood GP is in example models - but it isn't.
Thanks for all your team's great work on Stan! A couple of things for you:
[x] Section 3.4, "Positive, Ordered Vectors" section, PDF page 41, sentence missing words. I think it should be (missing words asterisked):
Like ordered vectors, after their declaration positive ordered vectors *may be* assigned
to other vectors and other vectors may be assigned to them.
[x] Section 10.4, last line on top of PDF page 170 has typo ("read" instead of "real"). Should be:
real<lower = -1, upper = 1> phi;
Thanks, @treysp, I'll fix those.
Explain the Ben RStanArm trick of
data {
  int<lower=0, upper=1> include_alpha;
  ...
parameters {
  vector[include_alpha ? N : 0] alpha;
It'll work with all types other than simplexes (have to verify that for correlation/covariance types).
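A self-contained sketch of how the flag might be used end to end. The outcome y, the scale sigma, and the priors are hypothetical additions for illustration only; the sizing expression is the trick itself:

```stan
data {
  int<lower=0> N;
  int<lower=0, upper=1> include_alpha;  // 1 keeps alpha, 0 drops it
  vector[N] y;                          // hypothetical outcome
}
parameters {
  vector[include_alpha ? N : 0] alpha;  // size zero when the flag is 0
  real<lower=0> sigma;                  // hypothetical scale
}
model {
  sigma ~ normal(0, 1);
  if (include_alpha) {
    alpha ~ normal(0, 1);
    y ~ normal(alpha, sigma);
  } else {
    y ~ normal(0, sigma);
  }
}
```

When include_alpha is 0, alpha has no elements, so it adds nothing to the parameter space and the same program covers both model variants.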
Example code in 'reparameterization' sections should use the new combined declaration and assignment syntax.
Add @bgoodri's definition of the bivariate normal CDF:
real binormal_cdf(real z1, real z2, real rho) {
  if (z1 != 0 || z2 != 0) {
    real denom = fabs(rho) < 1.0 ? sqrt((1 + rho) * (1 - rho)) : not_a_number();
    real a1 = (z2 / z1 - rho) / denom;
    real a2 = (z1 / z2 - rho) / denom;
    real product = z1 * z2;
    real delta = product < 0 || (product == 0 && (z1 + z2) < 0);
    return 0.5 * (Phi(z1) + Phi(z2) - delta) - owens_t(z1, a1) - owens_t(z2, a2);
  }
  return 0.25 + asin(rho) / (2 * pi());
}
Ben added: if rho = 1, then the bivariate CDF is min(Phi(z1), Phi(z2)), and if rho = -1, it is Phi(z1) + Phi(z2) - 1.
An alternative to declaring cov_matrix in a transformed data (not so costly) or transformed parameters block is to just use matrix and skip the cubic algorithm to validate.
Thanks for all the great work around Stan!
Just bumped into this today: Page 143, Multilevel 2PL Model:
And as a stretch goal, for K - 1 parameterizations, use append_row(..., 0) to construct the K-vector of linear predictors
data {
  vector[J] x[N];  // predictors for component membership
  ...
parameters {
  matrix[K - 1, J] beta;  // mixture regression coeffs
  ...
model {
  for (n in 1:N) {
    // beta * x[n] is a (K - 1)-vector, so append_row adds the final 0;
    // log_softmax keeps lp on the log scale for log_sum_exp below
    vector[K] lp = log_softmax(append_row(beta * x[n], 0));
    for (k in 1:K)
      lp[k] += normal_lpdf(eta[n] | mu[k], sigma);
    target += log_sum_exp(lp);
  }
  ...
A commenter with no link, named "Alex", pointed out on Gelman's blog (http://andrewgelman.com/2017/08/21/mixture-models-stan-can-use-log_mix/#comment-554501) that there's an extra right paren in
target += log_mix(lambda, normal_lpdf(...), normal_lpdf(...)));
[x] remove extra right paren in example
[x] assume the thumbs up came from the same Alex and thank Alex Perrone
I released 2.17.0 without this because it wasn't mentioned as holding up the release, but we can update the manual independently if you like.
Thanks. I kept thinking the release was imminent and I would be on vacation, then forgot that we hadn't done 2.17 yet.
It shouldn't hold up the release. After 2.17, we should just update the name of the issue to "next manual, 2.18".
I want to start moving the manual over to bookdown format so we can put it on the web to make it searchable. It's just too painful to search the pdf format. But then we'll have some issue of stability of where we put it if we want any Google juice to help direct people to the appropriate bits.
Update the append_array function doc to indicate that the max order is 7, not 8.
I think there's a typo on page 218 in Vers 2.16 (the Cormack-Jolly-Seber model). In the table, should the probability for profile 3 read \phi_2 p_3, instead of \phi_2 \phi_3? That seems to make sense, and corresponds to the model below as well.
src/docs/stan-reference/distributions.tex, line 121:
-\int_{-\infty}^y p(y \, | \, \theta) \ \mathrm{d}\theta.
+\int_{-\infty}^y p(y \, | \, \theta) \ \mathrm{d}y.
The manual is not clear as to where conditional statements are allowed: as the current text doesn't mention restrictions, I thought that conditionals could be used in the data section, which is not true.
@mcol No statements are allowed in the data section. Might you be thinking about the conditional operator (cond ? x : y)? That should be allowed as long as none of the expressions cond, x, or y involve anything other than data variables, which they couldn't in the data block anyway.
confirmed - this compiles:
data {
  int<lower=1> a;
  int<lower=1> b;
  int c[a > b ? a : b];
}
My point is that in reading the part on conditional statements (section 5.5) and most of the manual up to there, I haven't seen a clear definition as to where these can or cannot be used. Maybe this is a consequence of the fact that program blocks are introduced only later (chapter 6), and it would be enough to forward reference table 6.1 from the earlier sections.
excellent point and thanks for the feedback, it's most valuable. agreed that more overview/context would be useful.
Add clutter example to mixture chapter as an example of "denoising" (it's an example in Bishop's book, section 10.7.1)
data {
  real<lower = 0, upper = 1> theta;  // clutter ratio
  int<lower = 0> N;
  vector[N] y;
}
parameters {
  real mu;
}
model {
  for (n in 1:N)
    target += log_mix(theta,
                      normal_lpdf(y[n] | mu, 1),
                      normal_lpdf(y[n] | 0, 10));
}
theta <- 0.5
N <- 200
mu <- 4.3
y <- rep(0, N);
for (n in 1:N) {
  if (rbinom(1, 1, 0.5)) {
    y[n] <- rnorm(1, mu, 1)
  } else {
    y[n] <- rnorm(1, 0, 10)
  }
}
library(rstan)
fit <- stan("clutter.stan", data = list(theta=theta, N=N, y=y))
transformed data {
  vector[4] x = [ 1, 2, 3, 4 ]';
  vector[4] u = x;
  for (t in 2:4)
    u[t] = u[t - 1] * 3;
  x[2:4] = x[1:3] * 3;
  print("u = ", u);
  print("x = ", x);
}
which produces
u = [1,3,9,27]
x = [1,3,6,9]
The code in 14.1 (Regression with measurement error) on page 202 does not compile, and I think should be,
vector[N] x;
vector[N] y;
which works.
Include Ben's discussion of the "Lancaster" parameterization of multinomial in terms of Poissons:
http://discourse.mc-stan.org/t/large-poisson-model-with-individual-effects-is-too-slow/2112/2
If people don't have Lancaster's book, these reparameterizations are talked about in his papers at http://www.econ.brown.edu/Faculty/Tony_Lancaster/ . Both the "Incidental Parameters Problem since 1948" and the "Orthogonal Parameters and Panel Data".
[x] Fix real x[] -> real[] x in Section 41.1. and Index
or should these be reals?
[x] also fix elsewhere
[x] thank Jan Gleixner for stan-dev/stan#2423
[x] add space to Jan's fix to the BNF for print/reject
mining-disasters)
From a side comment on stan-dev/stanc3#1403:
numeric_literal with real_literal in BNF
numeric_literal definition
$\lambda_1 + q, \lambda_2 - q$ not $\lambda_1 + q, \lambda_1 - q$.
The Stan's Future section in the Preface (preface.tex lines 247-250) duplicates what is in the previous section, Stan 2, and can probably be removed.
As a minor formatting issue, in the Stan Interfaces section of the introduction (introduction.tex lines 69, 80, etc.), some interfaces are specified as \subsection (such as CmdStan, RStan, and PyStan) while others are \subsubsection (such as MatlabStan, Stan.jl, StataStan, and MathematicaStan). I'm not sure if this is a historic thing (the first being the original interfaces and the later being more recent interfaces that wrap CmdStan) or a typo but it's not clear.
From a conceptual standpoint, section 2.1 Character Encoding is somewhat underspecified. I am far from an expert but it was my understanding that it is impossible to infer the encoding from a character stream (see https://www.youtube.com/watch?v=ysh2B6ZgNXk for far too many scary details). So it should be valid to say that all Stan programs will be interpreted as being ISO-8859-1 (since 8-bit ASCII isn't a real thing and the file is being read in byte-by-byte) with only 7-bit ASCII characters being valid in the content of the Stan program and comments being ignored (but treated as 8-bit characters when looking for newlines in src/stan/io/read_line.hpp).
Thanks, @enbrown.
I'll remove the redundancy. I'm about to do a major re-org on the doc and some of the preface issues will go away. I'll try to make the interface description more specific.
Indeed, it's not generally possible to infer character encodings. Under the hood, we just use the standard I/O streams to read char (8-bit) values in C++.
Maybe this'll be a clearer way to say what's going on, because it's a bit non-standard:
That defines everything but the content of comments. So you can use ISO-8859-1 (aka Latin-1) or the other ISO-8859 variants, or you can use the UTF-8 encoding of Unicode. That's because they share the ASCII code points. You still won't be able to use anything other than the ASCII code points (bytes 0 to 127) for identifiers. Comments can thus contain any sequence of bytes you want other than newline in line comments and "*/" in block comments (those will end the comment sequence).
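As a rough sketch of what that description allows, assuming the file is saved as UTF-8 (the model itself is a throwaway example):

```stan
// Line comments may hold non-ASCII bytes, e.g. UTF-8 text: μ, σ², café
/* Block comments may too, as long as the closing
   delimiter does not appear inside: paramètre de position */
parameters {
  real mu;  // identifiers are limited to ASCII code points
}
model {
  mu ~ normal(0, 1);
}
```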
[x] change "sample" to "draw" in description of generated quantities
[x] thank Jonathan Sweeney for reporting
Originally reported here: http://discourse.mc-stan.org/t/specifying-the-number-of-samples-for-rng/2384/2
Summary:
This is the issue for suggesting fixes for the Stan manual. Please just add suggestions as comments rather than opening new issues.
Current Version:
v2.16.0