Open gil2rok opened 1 year ago
That's a good point and we should clarify.
- Start with parameters `theta` with density `p(theta | y)`, where `y` is data.
- Introduce a momentum variable `rho` of the same dimensionality as `theta` and give `rho` an independent `normal(0, Sigma)` distribution, where `Sigma` is either diagonal or dense.
- The joint density is then `p(theta, rho | y, Sigma) = p(theta | y) * normal(rho | 0, Sigma)`.
- A pair `(rho, theta)` drawn from `p(theta, rho)` is called a "draw". A sequence (or multi-set) of draws is called a "sample." The literature often overloads "sample" to also refer to a draw, but let's try to be precise and consistent in our doc.
- The target distribution is `p(theta)`. In general with (MC)MC methods, if we have a draw `(theta(m), rho(m)) ~ p(theta, rho)`, where `p(theta, rho) propto exp(-H(theta, rho))`, then `theta(m)` is a draw from `p(theta)`.
- So `(theta(m), rho(m))` is a draw from the joint position/momentum distribution, and `theta(m)` is a draw from the position distribution and hence a draw from the target distribution.

In general, the internal data structures are not relevant for doc and should not be referenced in the doc. That's a developer/programmer detail. What we want to document is the client-facing API. That allows us to fix an API, document it and put tests in place, then later refactor it without changing doc or tests.
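
To make the joint-versus-position distinction concrete, here is a minimal sketch. It assumes a standard normal as a stand-in for `p(theta | y)` so that exact draws are available; `log_target`, `log_momentum`, and `hamiltonian` are hypothetical names used only for illustration and are not part of the Bayes-Kit API.

```python
import numpy as np


def log_target(theta):
    # Assumed stand-in for log p(theta | y): standard normal, up to a constant.
    return -0.5 * theta @ theta


def log_momentum(rho, Sigma):
    # log normal(rho | 0, Sigma), up to an additive constant.
    return -0.5 * rho @ np.linalg.solve(Sigma, rho)


def hamiltonian(theta, rho, Sigma):
    # H(theta, rho) = -log p(theta | y) - log normal(rho | 0, Sigma), up to a constant,
    # so that p(theta, rho) propto exp(-H(theta, rho)).
    return -log_target(theta) - log_momentum(rho, Sigma)


rng = np.random.default_rng(0)
dim = 2
Sigma = np.diag([1.0, 4.0])  # a diagonal Sigma; a dense one works the same way

# Because the joint density factors, a joint draw is an independent pair.
theta = rng.standard_normal(dim)                      # position draw
rho = rng.multivariate_normal(np.zeros(dim), Sigma)   # momentum draw
joint_draw = (theta, rho)

# Dropping the momentum leaves a draw from the target distribution.
position_draw = joint_draw[0]
```

The point of the sketch is only that discarding `rho` from a joint draw leaves a draw of `theta` from the target; an actual HMC sampler would produce the joint draw by simulating Hamiltonian dynamics rather than by independent sampling.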
> draw from the joint position/momentum distribution
Depending on our desired level of pedantry, some call this joint distribution the canonical distribution.
The term "canonical distribution" is how physicists refer to it. The other relevant physics term is "phase space," which refers to the coupled `(theta, rho)` variables. I don't think using either would be helpful for our doc, but I'm also not opposed if you want to mention that's what physicists call these things.
Per my discussion with @WardBrian, in HMC a draw can refer to two distinct concepts:

1. `theta`, which denotes a draw from the target distribution.
2. `(theta, rho)`, which contains the position and momentum variables respectively and denotes a draw from the joint distribution.

In `Bayes-Kit`, the data type `DrawAndLogP` suggests definition 1 when it is defined here. However, comments from @bob-carpenter in my pull request here suggest definition 2. Additionally, Stan documentation uses the phrase "draw" in HMC to refer to definition 1 here.

My code for the `drghmc` sampler uses the `DrawAndLogP` data type, suggesting definition 1, but its documentation uses "draw" to refer to definition 2. I worry this may be confusing.

Lastly, if Bayes-Kit is intended as a pedagogical resource, it may be especially worthwhile to clarify this confusion. Would love to hear others' thoughts on this!
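
One way to keep the two definitions apart in code is to give the joint draw its own name. The sketch below is purely illustrative: `JointDraw` and `position` are hypothetical names, not existing Bayes-Kit types or functions, and nothing here is meant to describe how `DrawAndLogP` is actually defined.

```python
from typing import NamedTuple

import numpy as np


class JointDraw(NamedTuple):
    """Definition 2: a draw from the joint position/momentum distribution."""

    theta: np.ndarray  # position
    rho: np.ndarray    # momentum, same dimensionality as theta


def position(draw: JointDraw) -> np.ndarray:
    """Definition 1: the theta component, a draw from the target distribution."""
    return draw.theta


joint = JointDraw(theta=np.array([0.3, -1.2]), rho=np.array([0.5, 0.1]))
theta_draw = position(joint)
```

With names like these, documentation could say "joint draw" when it means `(theta, rho)` and reserve "draw" for `theta` alone.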