TuringLang / TuringGLM.jl

Bayesian Generalized Linear models using `@formula` syntax.
https://turinglang.org/TuringGLM.jl/dev
MIT License

use intercept traits w/ `apply_schema` to handle implicit intercept #39

Closed kleinschmidt closed 1 year ago

kleinschmidt commented 2 years ago

StatsModels provides the notion of "intercept traits" (https://github.com/JuliaStats/StatsModels.jl/blob/master/src/traits.jl) to control the automatic intercept handling that normally happens in GLMs. This package is a great example of a model type with an "implicit intercept" (or rather a "dropped intercept"): everything is centered (AFAICT), so the intercept is always zero and shouldn't be included.

I think what you'd want to do is define a model type (if there isn't one already) and add methods of the appropriate traits for that type.
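A minimal sketch of what that could look like, not taken from TuringGLM itself: `CenteredModel` is a hypothetical placeholder type, the toy data is made up, and the sketch assumes that the intercept logic in `apply_schema` only applies to subtypes of `StatisticalModel`.

```julia
using StatsModels

# Hypothetical marker type standing in for a TuringGLM model (not part of the package).
# Assumption: it has to subtype StatisticalModel for the intercept traits to be consulted.
struct CenteredModel <: StatsModels.StatisticalModel end

# Declare that this model type never wants an intercept column.
StatsModels.drop_intercept(::Type{CenteredModel}) = true

# Toy data (made up): x is a categorical predictor with 3 levels.
data = (y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], x = ["a", "b", "c", "a", "b", "c"])

f = @formula(y ~ x)

# Passing the model type as the third argument lets apply_schema use the trait.
f_applied = apply_schema(f, schema(f, data), CenteredModel)
_, X = modelcols(f_applied, data)

# No intercept column is added, and x is still contrast-coded as if one were present.
println(size(X))
```

With this trait set, a formula that asks for an intercept explicitly (`y ~ 1 + x`) should be rejected by `apply_schema` rather than silently accepted.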

storopoli commented 2 years ago

I am trying to have the least middleware possible between `@formula` and Turing.jl. What would be the added benefits?

kleinschmidt commented 2 years ago

There's some tricky business with how the model matrix is constructed given categorical variables and interactions/intercepts. If you have something like `y ~ 1 + x` where `x` has, say, 3 unique levels, usually two columns will be generated for `x` and one for the intercept. If you do `y ~ x`, an intercept is usually considered to be present implicitly, so you get the same thing. But for `y ~ 0 + x`, the intercept is suppressed and `x` is "promoted" to full rank, so there will be three columns generated for `x` (full dummy coding).

Same thing happens with interactions and main effects.

Anyway, it's rare that this comes up in user code, but it does happen sometimes.
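The three cases above can be checked with a short sketch. The toy data is made up, and the abstract `StatisticalModel` type is passed as the context on the assumption that this is what switches on the implicit-intercept and full-rank logic.

```julia
using StatsModels

# Toy data (made up): x is a categorical predictor with 3 levels.
data = (y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], x = ["a", "b", "c", "a", "b", "c"])

formulas = (@formula(y ~ 1 + x), @formula(y ~ x), @formula(y ~ 0 + x))

# Build the model matrix for each formula and record its number of columns.
ncols = map(formulas) do f
    f_applied = apply_schema(f, schema(f, data), StatsModels.StatisticalModel)
    _, X = modelcols(f_applied, data)
    size(X, 2)
end

# Intercept + 2 contrast columns, the same again, then 3 full dummy columns.
for (f, n) in zip(formulas, ncols)
    println(f, " => ", n, " columns")
end
```

All three model matrices end up with the same width, but in the last one the third column comes from `x` itself rather than from an intercept.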

phipsgabler commented 2 years ago

In principle, you have been able to dispatch on DPPL models for a couple of months now (the purpose of which was exactly to be able to define traits):

```julia
julia> @model function m1()
           d ~ filldist(DiscreteUniform(1, 3), 3)
           return d
       end
m1 (generic function with 2 methods)

julia> trait(::Model{typeof(m1)}) = true
trait (generic function with 1 method)

julia> trait(m1())
true
```

You'd only need to move the closures out of `turing_model` to get access to the function.

storopoli commented 1 year ago

We currently don't implement or extend the `StatisticalModel` type from StatsModels.jl, and I think the traits are meant to be applied to those types. I am closing this; we might revisit it in the future if we need to use `StatisticalModel`.