ReactiveBayes / RxInfer.jl

Julia package for automated Bayesian inference on a factor graph with reactive message passing

Inference over inputs #326

Closed caxelrud closed 2 months ago

caxelrud commented 2 months ago

Hi, I am looking for a way to do inference over the inputs of a model. For example, in the linear regression case, I would like to infer x by assigning it missing and supplying values of y. I see that the Bike Rental example does something similar by relating the inputs to the states of a filter, but I am looking for a simpler way.

@model function linear_regression(x, y)
    a ~ Normal(mean = 0.0, variance = 1.0)
    b ~ Normal(mean = 0.0, variance = 100.0)    
    y .~ Normal(mean = a .* x .+ b, variance = 1.0)
end
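
For context, the standard direction for this model (x and y both observed, inferring a and b) closely follows the linear regression example in the RxInfer documentation; a minimal sketch, where x_data, y_data, and the initialization values are illustrative placeholders:

using RxInfer

# Illustrative data (placeholder values)
x_data = float.(1:50)
y_data = 0.5 .* x_data .+ 25.0 .+ randn(50)

# The loop through the shared a and b makes the graph cyclic,
# so messages for b are initialized and inference is iterated
init = @initialization begin
    μ(b) = NormalMeanVariance(0.0, 100.0)
end

results = infer(
    model          = linear_regression(),
    data           = (x = x_data, y = y_data),
    initialization = init,
    iterations     = 20,
)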
Nimrais commented 2 months ago

Hi, that's an interesting question.

I've written a model where x is not observed; instead, there is only a prior over it. I'm running the model in full mean-field mode, so I need to initialize all of the marginals for my variables.

Can you extend this for your use case? In my results, the posterior over x is in results_unknown_x.posteriors[:x].

using Distributions, RxInfer
using StableRNGs

function generate_data(a, b, v, nr_samples; rng=StableRNG(1234))
    x = float.(collect(1:nr_samples))
    y = a .* x .+ b .+ randn(rng, nr_samples) .* sqrt(v)
    return x, y
end;

@model function linear_regression(y)
    local x
    a ~ Normal(mean = 0.0, variance = 1.0)
    b ~ Normal(mean = 0.0, variance = 100.0)
    γ ~ GammaShapeRate(1, 1000)
    for i in 1:length(y)
        # Prior over the unknown input instead of an observation
        x[i] ~ Normal(mean = 0.0, variance = 1.0)
        # softdot represents a * x[i] with precision γ
        _y[i] ~ softdot(a, x[i], γ)
        y[i] ~ Normal(mean = _y[i] + b, variance = 1.0)
    end
end

_, y_data_un = generate_data(0.5, 25.0, 10.0, 250)  # keep only y; x is treated as unknown

# Mean-field inference needs an initial marginal for every variable
imarginals = @initialization begin
    q(b) = NormalMeanVariance(0.0, 100.0)
    q(x) = NormalMeanVariance(0.0, 1.0)
    q(a) = NormalMeanVariance(0.0, 1.0)
    q(γ) = GammaShapeRate(1, 1000)
    q(_y) = NormalMeanVariance(0.0, 100.0)
end

results_unknown_x = infer(
    model           = linear_regression(), 
    data            = (y = y_data_un,), 
    returnvars      = (a = KeepLast(), x = KeepLast()),
    initialization  = imarginals,
    constraints     = MeanField(),
    iterations      = 20
)
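
To inspect the recovered inputs, one can take element-wise means of those marginals; a minimal sketch, assuming the infer call above has run (x_means is an illustrative name):

x_posteriors = results_unknown_x.posteriors[:x]  # one marginal per data point
x_means = mean.(x_posteriors)                    # point estimates for the unknown x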
caxelrud commented 2 months ago

Hi. If you look into Bayesian network software, such as Bayes Server, you will find that it always estimates the distribution of the "inputs" conditioned on the other nodes. The use case I have been applying is described in the Bayes Server material on Anomaly Detection. It uses the log-likelihood of the system as the system's "measure of normal" and uses it to discover the probable root-cause "input" by replacing the predicted inputs (one at a time, two at a time, ...). It is well described in the link.
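A plain-Julia sketch of that search, where rank_root_causes, loglik, observed, and baseline are all hypothetical names invented for illustration (loglik is assumed to score a full input vector under the fitted model):

# Replace each observed input with its predicted ("normal") value and
# rank inputs by how much log-likelihood is recovered.
function rank_root_causes(loglik, observed::Vector{Float64}, baseline::Vector{Float64})
    scores = map(eachindex(observed)) do i
        patched = copy(observed)
        patched[i] = baseline[i]  # substitute the predicted value for input i
        (input = i, recovered = loglik(patched) - loglik(observed))
    end
    return sort(scores; by = s -> -s.recovered)  # largest recovery first
end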

albertpod commented 2 months ago

I will transfer this issue into a discussion.