Here is a summary of the naming scheme we've discussed. One simple solution is to use the following:

```
varname_idx1_idx2
```

Here `varname` is the name of a variable (e.g. `x`, `m`, or `s`), and `idx1`, `idx2`, etc. are the indices for multi-dimensional variables. For scalar variables, the name simplifies to `varname`.
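As a rough illustration, a hypothetical helper (not part of Turing.jl) producing such names could look like this:

```julia
# A minimal sketch of the flattened naming rule above.
# `flat_name` is a hypothetical helper, not part of Turing.jl.
flat_name(varname::Symbol, idxs::Int...) =
    isempty(idxs) ? string(varname) : string(varname, "_", join(idxs, "_"))

flat_name(:m)        # "m"      -- scalar variable
flat_name(:x, 1, 2)  # "x_1_2"  -- element x[1,2]
```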
There is one potential drawback with this naming scheme. To illustrate it, let's start with an (arbitrary) example:
```julia
gaussdemo() = begin
  s ~ InverseGamma(2,3)
  m ~ Normal(0,sqrt(s))
  for i = 1:length(x)
    x[i] ~ Normal(m, sqrt(s))
    w ~ Normal(m, sqrt(s))  # definition 1 of var w
  end
  w ~ InverseGamma(2, 3)    # definition 2 of var w
  return(s, m, x)
end
```
In the above example, variable `w` is declared twice in different places. Because Julia is a dynamic language, this is fine; however, it makes it difficult for us to distinguish the two `w`'s. In order to implement the compositional sampler interface and report results to the user, we need a consistent naming scheme. There are two possible solutions:

1. Detect such variable redefinitions and report an error. This explicitly requires the user to avoid using the same name for multiple different random variables. However, using arrays is still fine.
2. Add a special notation (e.g. by calling `gensym`) to the naming scheme. More specifically, the variable name would look like `varname_uniqueid_idx1_idx2` (see the sketch below). The difficulty, however, is to make these notations meaningful, so that the user can tell which variable is which by looking at the names.
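A rough sketch of option 2, assuming a hypothetical `unique_name` helper (the actual tagging scheme was still undecided at this point):

```julia
# Hypothetical sketch of option 2: tag each syntactic occurrence of a
# variable with a unique id from `gensym`, then flatten as before.
unique_name(varname::Symbol, uid::Symbol, idxs::Int...) =
    join([string(varname), string(uid), string.(idxs)...], "_")

uid1, uid2 = gensym(:w), gensym(:w)  # one id per syntactic occurrence
unique_name(:w, uid1)                # e.g. "w_##w#363"
unique_name(:w, uid2)                # a distinct name for the second w
```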
EDIT (24 Jan 2017)

After some discussion, we've decided to pick option 2. The solution to clashed variable names is:

1. Rename variables internally (e.g. storing `w` under different names, say `w1` and `w2`, in `VarInfo`), following Wingate (2011).
2. When assigning variables to different samplers, the symbol `w` selects all variables with the same name `w` (e.g. `w1` and `w2` are both sampled by the same sampler), as in the sketch below.
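As a rough illustration of point 2 (the container and names here are illustrative, not the actual `VarInfo` internals):

```julia
# Internally, `w` is stored under distinct names (w1, w2, ...); a sampler
# assigned the symbol :w picks up all of them.
varinfo = Dict("w1" => 0.3, "w2" => -1.2, "m" => 0.0)
select(vi, sym::Symbol) = filter(p -> startswith(first(p), string(sym)), vi)
select(varinfo, :w)  # Dict("w2" => -1.2, "w1" => 0.3)
```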
EDIT (14 Mar 2017)

It's probably better to use the following updated internal variable naming scheme (extracted from issue #96):

- `x[1,2,3]` ==> `x[i]`, where `i` is the linear index corresponding to `[1,2,3]` (e.g. `sub2ind(dims, i, j, k...)` -> index; see the example after this list);
- `x[1][2][..]` ==> `x[i][j][..]` (the nesting could continue `N` times).

Unique internal names are still needed for cases like the following, where the same syntactic variable `x` is defined multiple times at different places:

```julia
for i = 1:N; x ~ Normal(0,1); end
f(x) = if x > 1; f(x-1); else; x ~ Normal(0,1); end
f() = begin x ~ Normal(0,1); x ~ Bernoulli(0.5) end
```
- [x] 1.1 Support the new naming scheme
- [x] 1.2 Remove `@assume`, `@observe` and `@predict` - just keep `~`
- [x] 2.1 Implement the composition interface, i.e. translate a call with multiple samplers

```julia
varInfo = sample(gaussdemo, data = Dict(:x=>[1 2]), varInfo, HMC(..., :m), PG(..., :s))
```

to

```julia
varInfo = sample(gaussdemo, data = Dict(:x=>[1 2]), varInfo, HMC(..., :m))
varInfo = sample(gaussdemo, data = Dict(:x=>[1 2]), varInfo, PG(..., :s))
```
Regarding the new naming scheme, would it be good if we just name them like below?

```julia
:(x[1])
:(x[1,1])
:((x[1])[1])
```

It looks more straightforward and can easily distinguish between an array of arrays and a matrix.
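Since these names are ordinary Julia expressions, the two indexing forms stay structurally distinct; a quick (non-Turing-specific) check:

```julia
# Matrix indexing is a single `ref` expression with two indices ...
ex1 = :(x[1,1])    # Expr(:ref, :x, 1, 1)
ex1.args           # Any[:x, 1, 1]
# ... while array-of-array indexing is a nested `ref` expression.
ex2 = :(x[1][1])   # Expr(:ref, :(x[1]), 1)
ex2.args           # Any[:(x[1]), 1]
```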
Update: this is done in https://github.com/yebai/Turing.jl/pull/70.
Regarding the new interface, I feel we still need the `@model` macro, because for the case below:

```julia
gaussdemo(data=Dict(:x=>Float64[1 2])) = begin
  s ~ InverseGamma(2,3)
  m ~ Normal(0,sqrt(s))
  for i = 1:length(x)
    x[i] ~ Normal(m, sqrt(s))
  end
  return(s, m, x)
end
```

we will need to add a statement at the beginning of the function saying

```julia
x = data[:x]
```

Otherwise there will be no local `x` to use.
Alternatively, we could ask the user to define data globally, but I think that's not what we want.
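A minimal sketch of what `@model` could generate for the first option (a hypothetical expansion, not the actual implementation):

```julia
# `@model` would rewrite the definition so the data dictionary is
# unpacked into locals before the model body runs:
gaussdemo(data = Dict(:x => Float64[1 2])) = begin
  x = data[:x]  # inserted by the macro; gives the body a local `x`
  # ... the original model body follows, using the local `x` ...
end
```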
> Regarding the new interface, I feel we still need the @model macro because for the case below
I agree. It makes sense to keep it.
> Regarding the new naming scheme, will it be good if we just name them like below? :(x[1]) :(x[1,1]) :((x[1])[1])

This makes sense. One minor question: can we use `:(x[1][1])` instead of `:((x[1])[1])`?
Yes, sure - we could use `x[1][1]`!
When we compose two samplers, e.g. `HMC(...)` and `PG(...)`, is it true that we do Gibbs sampling by 1 HMC step, 1 PG step, iteratively?
> When we compose two samplers, e.g. HMC(...) and PG(...), is it true that we do Gibbs sampling by 1 HMC step, 1 PG step, iteratively?

We can do that by default. It is also possible to add an optional parameter `iter` to allow the user to indicate the number of iterations inside a Gibbs scan:

```julia
res = sample(gaussdemo, data = Dict(:x=>[1 2]), HMC(..., iter = 1, :m), PG(..., iter = 1, :s))
```
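The intended control flow, as a self-contained sketch (the sampler type and `step!` below are stand-ins, not Turing.jl internals):

```julia
# One Gibbs scan takes `iter` steps of each component sampler in turn.
struct StubSampler; name::Symbol; iter::Int; end
step!(vi, s::StubSampler) = println("one $(s.name) step")

samplers = (StubSampler(:HMC, 1), StubSampler(:PG, 1))
for scan in 1:2, s in samplers, _ in 1:s.iter
    step!(nothing, s)  # 1 HMC step, then 1 PG step, per scan
end
```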
I have nearly finished the Gibbs sampling feature, with a few things still unsure. One of them is about using HMC and PG together: when an HMC step is rejected, should I reject the whole Gibbs step, or just the HMC part?

I guess it will be better for me to make a PR first so that we can deal with the existing issues together. I will make a PR this afternoon.
Some updates since PR #73. The new bits are:

- `Data`, which implicitly constructs a data dictionary and stores all data variables in this internal dictionary. Moreover, passing in data variables through global variables should be deprecated later.
- `Gibbs`, which allows us to compose different samplers within a Gibbs sampling algorithm.

```julia
@model gdemo(x) = begin
  s ~ InverseGamma(2,3)
  m ~ Normal(0,sqrt(s))
  for i = 1:length(x)
    x[i] ~ Normal(m, sqrt(s)) # Note: we need to fetch x from gdemo(; data = Data(...)).
  end
  return(s, m, x)
end

x = [2.0, 3.0]
sampler = Gibbs(HMC(..., :m), PG(..., :s))
chn = @sample(gdemo(x), sampler)
```
NOTE: sampling interface updated to use a macro (23 Feb 2017); see #82.
Regarding the introduction of the new `Data` type, I'm thinking of doing this job together with unifying the dictionary (used by `@predict`) and `VarInfo`, because basically what all three data structures do is relate a symbol (variable) to a value (uni/multi/mat/array). Does that make sense?
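For illustration, the common shape of all three structures could be captured by a single symbol-to-value type (the name and design here are hypothetical):

```julia
# All three structures map a symbol to a value of any shape
# (scalar, vector, matrix, or general array).
struct VarMap
    vals::Dict{Symbol,Any}
end
VarMap() = VarMap(Dict{Symbol,Any}())
Base.getindex(vm::VarMap, s::Symbol) = vm.vals[s]
Base.setindex!(vm::VarMap, v, s::Symbol) = (vm.vals[s] = v)

vm = VarMap()
vm[:m] = 0.5         # univariate value
vm[:x] = [2.0, 3.0]  # multivariate value
```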
Close in favour of #104
We have some undecided interface issues since the introduction of the Hamiltonian Monte Carlo (HMC) sampler. More specifically, we need an interface to tell the HMC sampler which variable it should handle. In addition, we also need a way to compose samplers (e.g. Gibbs sampling subsets of variables with different samplers). To solve these issues altogether, I propose the following new interface.
In summary: first, this design removes the macros `@assume`, `@observe` and `@predict`; instead, it introduces additional parameters to each sampler to distinguish different types of variables. Second, it introduces an interface to compose samplers (e.g. HMC and PG).