ValentinKaisermayer closed this issue 4 years ago
I can confirm this is the case (Julia 1.4.1 and Turing 0.13.0 on OS X).
Thanks! There are some other problems with the linear regression tutorial currently as well. Among other things, the standardization is incorrect (see https://github.com/TuringLang/TuringTutorials/issues/73), it contains a data leakage, and the model (or its documentation) is incorrect. I'm currently updating the tutorial, and hopefully that will resolve the problems that you observed as well.
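To make the data-leakage point concrete: leakage-free standardization computes the mean and standard deviation on the training split only and reuses them for the test split. A minimal plain-Julia sketch (independent of MLDataUtils; the helper names `fit_scale` and `standardize` and the toy data are made up for illustration):

```julia
using Statistics

# Compute column-wise mean and std on the training data only.
fit_scale(X) = (mean(X; dims=1), std(X; dims=1))

# Reuse the training statistics for any other split, so no test
# information leaks into the scaling.
standardize(X, μ, σ) = (X .- μ) ./ σ

train = [1.0 10.0; 2.0 20.0; 3.0 30.0]
test  = [2.0 25.0]

μ, σ = fit_scale(train)
train_std = standardize(train, μ, σ)   # each column: mean 0, std 1
test_std  = standardize(test, μ, σ)    # scaled with *train* statistics
```

This mirrors what the `rescale!(train)` / `rescale!(test, μ, σ)` pair in the updated tutorial does.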
Tested it with the updated tutorial code:
using Turing, Distributions
using DataFrames  # provides select!/select and Not
using RDatasets
using MCMCChains, Plots, StatsPlots
using MLDataUtils: shuffleobs, splitobs, rescale!
using Distances
using Random
Random.seed!(0)
Turing.turnprogress(false)
data = RDatasets.dataset("datasets", "mtcars")
select!(data, Not(:Model))
trainset, testset = splitobs(shuffleobs(data), 0.7)
target = :MPG
train = Matrix(select(trainset, Not(target)))
test = Matrix(select(testset, Not(target)))
train_target = trainset[:, target]
test_target = testset[:, target]
μ, σ = rescale!(train; obsdim = 1)
rescale!(test, μ, σ; obsdim = 1)
μtarget, σtarget = rescale!(train_target; obsdim = 1)
rescale!(test_target, μtarget, σtarget; obsdim = 1)
# Bayesian linear regression.
@model function linear_regression(x, y)
σ₂ ~ truncated(Normal(0, 100), 0, Inf) # Set variance prior.
intercept ~ Normal(0, sqrt(3)) # Set intercept prior.
nfeatures = size(x, 2)
coefficients ~ MvNormal(nfeatures, sqrt(10)) # Set the priors on our coefficients.
mu = intercept .+ x * coefficients # Calculate all the mu terms.
y ~ MvNormal(mu, sqrt(σ₂))
end
model = linear_regression(train, train_target)
chain = sample(model, NUTS(0.65), 3_000)
plot(chain)
describe(chain)
Looks good but I get two warnings.
[ Info: [Turing]: progress logging is disabled globally
┌ Info: Found initial step size
└ ϵ = 1.6
┌ Warning: The current proposal will be rejected due to numerical error(s).
│ isfinite.((θ, r, ℓπ, ℓκ)) = (true, false, false, false)
└ @ AdvancedHMC
┌ Warning: The current proposal will be rejected due to numerical error(s).
│ isfinite.((θ, r, ℓπ, ℓκ)) = (true, false, false, false)
└ @ AdvancedHMC
2-element Array{ChainDataFrame,1}
Summary Statistics
parameters mean std naive_se mcse ess r_hat
──────────────── ─────── ────── ──────── ────── ──────── ──────
coefficients[1] -0.0413 0.5648 0.0126 0.0389 265.1907 1.0010
coefficients[2] 0.2770 0.6994 0.0156 0.0401 375.2777 1.0067
coefficients[3] -0.4116 0.3850 0.0086 0.0160 695.3990 1.0032
coefficients[4] 0.1805 0.2948 0.0066 0.0126 479.9290 1.0010
coefficients[5] -0.2669 0.7168 0.0160 0.0316 373.0291 1.0009
coefficients[6] 0.0256 0.3461 0.0077 0.0119 571.0954 1.0028
coefficients[7] 0.0277 0.3899 0.0087 0.0174 637.1596 1.0007
coefficients[8] 0.1535 0.3050 0.0068 0.0117 579.1998 1.0032
coefficients[9] 0.1223 0.2839 0.0063 0.0105 587.6752 0.9995
coefficients[10] -0.2839 0.3975 0.0089 0.0195 360.9612 1.0019
intercept 0.0058 0.1179 0.0026 0.0044 580.0222 0.9995
σ₂ 0.3017 0.1955 0.0044 0.0132 227.2322 1.0005
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
──────────────── ─────── ─────── ─────── ─────── ──────
coefficients[1] -1.0991 -0.4265 -0.0199 0.3244 1.1093
coefficients[2] -1.1369 -0.1523 0.2854 0.7154 1.6488
coefficients[3] -1.1957 -0.6272 -0.3986 -0.1800 0.3587
coefficients[4] -0.3896 -0.0155 0.1663 0.3593 0.7818
coefficients[5] -1.6858 -0.6835 -0.2683 0.1378 1.1995
coefficients[6] -0.6865 -0.1672 0.0325 0.2214 0.7251
coefficients[7] -0.7644 -0.1976 0.0090 0.2835 0.8185
coefficients[8] -0.4980 -0.0194 0.1451 0.3428 0.7685
coefficients[9] -0.4643 -0.0294 0.1237 0.2807 0.7218
coefficients[10] -1.0898 -0.5091 -0.2846 -0.0413 0.5163
intercept -0.2240 -0.0671 0.0083 0.0746 0.2364
σ₂ 0.1043 0.1860 0.2525 0.3530 0.8490
I'm pretty sure you can ignore the errors; they just tell you that two proposals were rejected since the gradient and some other values that I don't remember (maybe a more descriptive warning would be better? I saw https://github.com/TuringLang/AdvancedHMC.jl/issues/110, but IMO the variable names θ, r, ℓπ, and ℓκ are a bit cryptic) were non-finite (e.g., NaN or Inf). I would assume that might be related to the truncated normal distribution in this example, but I guess @xukai92 knows a lot better what could possibly go wrong here.
They can be safely ignored. During the initial phase of sampling, large step sizes are tried, which might lead to numerical errors; these can be ignored as long as there are no consistent errors during sampling.
Hi @devmotion,
I just tried to run the tutorial and it stops at the line μ, σ = rescale!(train; obsdim = 1) with ERROR: MethodError: no method matching /(::String, ::Int64)
The rescale! function cannot cope with strings. Hence I changed the code as follows:
strExclude = :Model
train = Matrix(select(trainset, Not([target, strExclude])))
test = Matrix(select(testset, Not([target, strExclude])))
The model then runs and prints the coefficients. However, for the tutorial it would be great to pass the names of the covariates to the coefficients. I'm pretty sure that is possible with Turing.jl, but at the moment I don't know how to do that. Could you please elaborate on the code?
Thanks in advance
I just tried to run the tutorial and it stops at line
I assume you forgot to run the line select!(data, Not(:Model)), which removes the Model column from the data set. The fix should not be needed (and the code ran fine when the tutorial was generated). Can you check if you forgot to run this line before creating the training and test data sets?
However, for the tutorial it would be great to pass the names of the covariates to the coefficients.
Hmm, I'm not sure if I understand you correctly. Would you like to name the coefficients in a way that relates them to the covariates in the model? Or would you mainly like to use the covariate names in the trace plots instead of coefficients[1] etc.? Turing does not work with data of type DataFrame but only with arrays (and hopefully at some point whatever samples are generated by the distributions you specify). Therefore we convert the data frame to a matrix in the beginning. However, this conversion removes all headers, and hence one has to relate the columns of the matrix and the samples of coefficients to the covariates manually. It is possible, e.g., to change the parameter names of the sampled chain by
newchain = let names=names(data, Not(target))
set_names(chain, Dict("coefficients[$i]" => name for (i, name) in enumerate(names)))
end
with MCMCChains 3.0.* or
newchain = let names=names(data, Not(target))
replacenames(chain, ("coefficients[$i]" => name for (i, name) in enumerate(names))...)
end
for MCMCChains 4.0.*.
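For reference, the renaming map itself is plain Julia independent of MCMCChains; with hypothetical covariate names standing in for names(data, Not(target)), the comprehension builds entries like these:

```julia
# Hypothetical covariate names standing in for names(data, Not(target)).
covariates = ["Cyl", "Disp", "HP"]

# Map each chain parameter name "coefficients[i]" to the i-th covariate.
renames = Dict("coefficients[$i]" => name for (i, name) in enumerate(covariates))
```

The resulting Dict (or the equivalent splatted pairs for replacenames) is what relates the matrix columns back to the original data frame headers.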
Hi @devmotion, thanks for your fast response. In fact, I somehow missed the select! line. Now it runs as expected.
In terms of getting meaningful names for the coefficients, I tried the proposed solution for MCMCChains 3.0.* and get this error:
julia> newchain = let names=names(data, Not(target))
           set_names(chain, Dict("coefficients[$i]" => name for (i, name) in enumerate(names)))
       end
ERROR: MethodError: no method matching names(::DataFrame, ::InvertedIndex{Symbol})
names(data, Not(target)) seems not to work. Then I tried this:
julia> newchain = let names=names(select(data, Not(target)))
           set_names(chain, Dict("coefficients[$i]" => name for (i, name) in enumerate(names)))
       end
ERROR: MethodError: no method matching Chains(::Array{Float64,3}, ::Array{Any,1}, ::Dict{Any,Any}; info=NamedTuple(), evidence=missing, sorted=true)
While names(select(data, Not(target))) produces a 10-element array, it is not processed by the loop. How do I get it to work? Should I install MCMCChains 4.0.* instead?
@devmotion Update: I found the solution
newchain = let names=String.(names(select(data, Not(target))))
    set_names(chain, Dict("coefficients[$i]" => name for (i, name) in enumerate(names)))
end
Cheers
Great that it works! I'm sorry, I didn't test the suggested renaming and just assumed that names(data, Not(target)) would work based on the DataFrames documentation.
I'll close this issue since it seems your issue is resolved.
I copied the code but did not get the correct results:
Julia: 1.4.2, Turing: 0.13.0