biaslab / ForneyLab.jl

Julia package for automatically generating Bayesian inference algorithms through message passing on Forney-style factor graphs.
MIT License

@ffg macro #131

Open · ivan-bocharov opened 3 years ago

ivan-bocharov commented 3 years ago

@ffg macro

This PR implements the @ffg macro for defining Forney-style factor graphs.

Example:

using ForneyLab

@ffg function gaussian_add(prior)
    x ~ GaussianMeanVariance(prior.m, prior.v)
    y = x + 1.0 ∥ [id=:y]
    placeholder(y, :y)
end

prior = (m=0.0, v=1.0)
graph = gaussian_add(prior)

The @ffg macro rewrites the function body, replacing statements that result in variable construction with ForneyLab-specific code.
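
Roughly speaking, a rewritten body resembles what the existing @RV macro produces today. The sketch below is illustrative only (the actual generated code may differ):

function gaussian_add(prior)
    graph = FactorGraph()                       # the macro wraps the body in a fresh graph
    x = Variable(id=:x)                         # from: x ~ GaussianMeanVariance(prior.m, prior.v)
    GaussianMeanVariance(x, prior.m, prior.v)   # factor node attached to x
    y = x + 1.0                                 # Variable arithmetic builds the addition node; the ∥ [id=:y] option sets the id
    placeholder(y, :y)
    return graph                                # the generated function returns the graph
end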

Model specification language description

The model specification macro supports the following constructions:

  1. Tilde operator
x ~ GaussianMeanVariance(0.0, 1.0)

This construction behaves the same way as in the current version of ForneyLab.

  2. Assignment operator
x = a + b

This expression always creates a new variable. By default, it uses autogenerated variable ids (variable_1, variable_2, etc.). If you want to override the variable's id, use the options specificator (the "parallel to" Unicode symbol, ∥) in the following way:

x = a + b ∥ [id=:x]

This behaviour allows execution of arbitrary Julia code inside the @ffg macro.

You can use the LaTeX command \parallel to type the options specificator in Julia mode.

  3. Arrow-notation assignment operator
x ← a + b

This expression always creates a new variable. By default, it uses the extracted variable id (x for the example above). If you want to override the variable's id, use the same options specificator (∥) as in the previous construction:

x ← a + b ∥ [id=:x]

You can use the LaTeX command \leftarrow to type the arrow in Julia mode.

  4. The rest of the expressions are interpreted as standard Julia code (see the sketch below).
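
For instance, the constructions can be mixed freely with ordinary control flow. A hedged sketch (the model, variable names, and ids are made up for illustration):

@ffg function random_walk(n)
    x_prev ~ GaussianMeanVariance(0.0, 1.0)         # tilde construction: prior on the initial state
    for t = 1:n                                     # plain Julia loop, left untouched by the macro
        x = x_prev + 1.0 ∥ [id=:x_*t]               # assignment construction with an explicit, indexed id
        y ~ GaussianMeanVariance(x, 200.0)          # observation model
        placeholder(y, :y, index=t)                 # data placeholder for time step t
        x_prev = x                                  # ordinary Julia re-binding; creates no model variable
    end
end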

Inference algorithm and posterior factorization construction

Since defined variables no longer belong to the global scope, the InferenceAlgorithm and PosteriorFactorization constructors now accept variable ids as inputs. For example:

# Define a message passing procedure over the variables :x_t and :r
algo = messagePassingAlgorithm([:x_t, :r], free_energy=true)
# Define a factorization for the posterior
q = PosteriorFactorization(:x_t, :τ, ids=[:x, :τ])
# Define a variational message passing procedure
algorithm = messagePassingAlgorithm([:x_t, :τ], q, free_energy=true)
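
Downstream usage presumably stays as in current ForneyLab; a minimal hedged sketch, assuming algorithmSourceCode and the generated step functions are unchanged by this PR:

# Hedged sketch of compiling and running the generated algorithm
source_code = algorithmSourceCode(algorithm, free_energy=true)  # emit Julia source for the schedule
eval(Meta.parse(source_code))                                   # defines the generated update functions
marginals = Dict{Symbol, ProbabilityDistribution}()             # VMP would need proper marginal initialization
data = Dict(:y => 0.5)                                          # hypothetical placeholder data
stepX!(data, marginals)                                         # one generated update per posterior factor; names follow the factor ids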

Demo

The nonlinear Kalman filter demo has been adapted to showcase the application of the new model specification language.

Known caveats

ivan-bocharov commented 3 years ago

Alternative proposal for options specificator syntax (by @bvdmitri):

 y = x + 1 where { id = :y }
albertpod commented 3 years ago

Nice! The specification of the graph is more relaxed. However, I don't like the proposed specificator syntax.

bertdv commented 3 years ago

I am absolutely not a code expert, but here are some thoughts that I have when reading this proposal.

  1. I think the demo code is cleaner than before, and that is the main point of the change, so in general I think this is a good change!

  2. The point of this model specification language (MSL) is readability. With that in mind, I prefer

    y = x + 1 where { id = :y }

    over

    x ← a + b ∥ [id=:x]

    The former does not need an explanation of what \parallel means. I don't know why the brackets { and } or [ and ] are used, but where is clearer to me than \parallel.

  3. It's not clear to me what the difference is between assignment with = and assignment with \leftarrow.

ivan-bocharov commented 3 years ago

Thanks for your comment, @bertdv.

I agree that the where syntax reads more easily than the \parallel one. I had my reservations because it is a construction that has a very particular meaning in the host language, but I have changed my mind after some discussions with Dmitri. It seems like a good fit for what we want to do with that construction as the language evolves.

As for the arrow assignment: originally I wanted a construction that works the same way as an assignment under the @RV macro (overriding the variable id). It is not really necessary, and we can drop it with no consequences.

ivan-bocharov commented 3 years ago

This branch now supports the where-syntax for defining options in your model definition. I haven't yet implemented handling of multiple where blocks, but I've covered the most relevant use cases.

ThijsvdLaar commented 3 years ago

Hi @ivan-bocharov, looks cool. I like how this enables the user to pass hyperparameters to the model constructor. I tried to rewrite the Kalman smoother demo with the @ffg macro:

@ffg function stateSmoother(n_samples)
    # Prior statistics
    m_x_0 = placeholder(:m_x_0)
    v_x_0 = placeholder(:v_x_0)

    # State prior
    x_0 ~ GaussianMeanVariance(m_x_0, v_x_0)

    # Transition and observation model
    x = Vector{Variable}(undef, n_samples)
    y = Vector{Variable}(undef, n_samples)

    x_t_min = x_0
    for t = 1:n_samples
        n_t ~ GaussianMeanVariance(0.0, 200.0) # observation noise
        x[t] = x_t_min + 1.0
        y[t] = x[t] + n_t

        # Data placeholder
        placeholder(y[t], :y, index=t)

        # Reset state for next step
        x_t_min = x[t]
    end
end

However, the x[t] are not automatically named as was the case previously. If they were automatically named, an "aggregate" id could be passed to the algorithm constructor as :x_*(1:n_samples), which automatically expands to [:x_1, :x_2, ...].

ivan-bocharov commented 3 years ago

Hi @ThijsvdLaar, thanks a lot for the feedback. The truth is that it is very hard (I think impossible, if you consider all possible cases) to figure out at parse time whether an assignment statement results in the construction of a new variable. So, if you want meaningful ids for variables created with assignment statements, you should use the where construction.
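
To see why (a hypothetical pair of statements, not taken from the PR): the two assignments below are syntactically identical, yet only the first constructs a model variable, and that depends on runtime types rather than on syntax:

@ffg function ambiguous(n)
    a ~ GaussianMeanVariance(0.0, 1.0)
    x = a + 1.0      # a is a Variable, so this builds a new model variable
    s = n + 1.0      # n is a plain number, so this is ordinary Julia arithmetic
end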

In the past week I have realized, however, that we can let the user return anything they want from the function. The graph will always be returned first, and the rest will be appended after it. So, the demo could look something like:

@ffg function stateSmoother(n_samples)
    # Prior statistics
    m_x_0 = placeholder(:m_x_0)
    v_x_0 = placeholder(:v_x_0)

    # State prior
    x_0 ~ GaussianMeanVariance(m_x_0, v_x_0)

    # Transition and observation model
    x = Vector{Variable}(undef, n_samples)
    y = Vector{Variable}(undef, n_samples)

    x_t_min = x_0
    for t = 1:n_samples
        n_t ~ GaussianMeanVariance(0.0, 200.0) # observation noise
        x[t] = x_t_min + 1.0 where { id = :x_*t }
        y[t] = x[t] + n_t

        # Data placeholder
        placeholder(y[t], :y, index=t)

        # Reset state for next step
        x_t_min = x[t]
    end
    return x
end

...

(graph, x) = stateSmoother(10)
messagePassingAlgorithm(x, ...)
ThijsvdLaar commented 3 years ago

I see, nice; simply returning x::Vector{Variable} then looks even simpler.

To me, the arrow notation still has added value if it allows us to circumvent the where statement; i.e., x[t] ← x_t_min + 1.0, as a shortcut to x[t] = x_t_min + 1.0 where { id = :x_*t }, just looks cleaner to me.