ReactiveBayes / RxInfer.jl

Julia package for automated Bayesian inference on a factor graph with reactive message passing
MIT License

Allow contraction of composite nodes #287

Open wouterwln opened 2 months ago

wouterwln commented 2 months ago

Should be very easy (parameterizing the backend) since all machinery is already there in GraphPPL, but we should expose (and document) the API to users

blolt commented 1 month ago

Hello, could you expand somewhat on what this task entails? I'd like to tackle it or a portion of it as a first issue, but I will likely need some hand-holding if that is okay.

I anticipate difficulty will arise in (1) grasping the problem domain and (2) familiarizing myself with the code base. But overcoming both of those is the entire reason I would like to pursue this task. Let me know what I can do!

blolt commented 1 month ago

I have done some reading based on some keyword searches in this post. My assumption is that we are talking about graph contraction here. It looks like union-find would be sufficient for solving this, which is outlined here. Let me know if that understanding is correct. If so, I am fairly sure I've implemented this before in an algorithms class.

I'll be checking my understanding in GraphPPL in the meantime.

wouterwln commented 1 month ago

Hi @blolt , thanks for picking this up! This issue basically says that we want to be able to do the following: Let's say I build a custom submodel, called gcv:

@model function gcv(κ, ω, z, x, y)
    log_σ := κ * z + ω
    y ~ NormalMeanVariance(x, exp(log_σ))
end

and we want to embed this in some larger model, say a Hierarchical Gaussian Filter. Without going into too much detail, this model can be written by stacking a couple of gcv modules:

@model function hgf(y)

    # Specify priors

    ξ ~ Gamma(1, 1)
    ω_1 ~ NormalMeanVariance(0, 1)
    ω_2 ~ NormalMeanVariance(0, 1)
    κ_1 ~ NormalMeanVariance(0, 1)
    κ_2 ~ NormalMeanVariance(0, 1)
    x_1[1] ~ NormalMeanVariance(0, 1)
    x_2[1] ~ NormalMeanVariance(0, 1)
    x_3[1] ~ NormalMeanVariance(0, 1)
    y[1] ~ NormalMeanVariance(x_1[1], 1)

    # Specify generative model

    for i in 2:(length(y) + 1)
        x_3[i] ~ NormalMeanVariance(μ = x_3[i - 1], τ = ξ)
        x_2[i] ~ gcv(x = x_2[i - 1], z = x_3[i], ω = ω_2, κ = κ_2)
        x_1[i] ~ gcv(x  = x_1[i - 1], z = x_2[i], ω = ω_1, κ = κ_1)
        y[i] ~ NormalMeanVariance(x_1[i], 1)
    end
end

Now, if we materialize this entire model, there are some nasty messages inside the gcv nodes that we don't really know how to resolve. However, here it is shown that we can "close the box" around this gcv component, and if we see it as a single node, we do have nice closed form messages around this contracted node. In fact, this is exactly what the GCV node in RxInfer does now. Now when building GraphPPL, we made sure to include a feature like this. It works as follows:

During the construction of a node/submodel (essentially, when resolving a ~ statement), we check if the function you invoke is either an atomic node (so we materialize the node) or if it's a submodel and we have to apply the internal logic of the submodel. This check is done with the NodeType(backend, fform) function. This function returns either Composite() or Atomic(), and creation logic will resolve accordingly. Now, the reason backend is in there is because GraphPPL doesn't want to depend on a specific inference engine, and we can assign custom behaviour to the backend in RxInfer, namely, the RxInferBackend. What I want to do is the following:
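In isolation, the trait mechanism described above can be sketched with dummy types. `DummyBackend` and `my_submodel` are hypothetical names for illustration; `Atomic` and `Composite` mirror GraphPPL's traits:

```julia
# Minimal sketch of trait-based node resolution (DummyBackend and
# my_submodel are made up; GraphPPL's real API has the same shape)
struct Atomic end
struct Composite end

struct DummyBackend end

my_submodel() = nothing  # stands in for a user-defined submodel

# Default: an unknown function is treated as a composite submodel
NodeType(::DummyBackend, fform::F) where {F} = Composite()

# Opt a specific function in as an atomic node via dispatch
NodeType(::DummyBackend, ::typeof(exp)) = Atomic()

NodeType(DummyBackend(), exp)          # Atomic()
NodeType(DummyBackend(), my_submodel)  # Composite()
```

Because the second argument is dispatched on `typeof(fform)`, the compiler resolves the branch statically; there is no runtime lookup.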

Note that the user would need to write their own custom @rules for the composite node they want to contract, and that we should write an extensive documentation page elaborating on this behaviour.
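For a rough idea only (the exact node declaration, edge names, and rule signature would depend on how the contracted node ends up being exposed, so everything below is hypothetical), such a rule might follow the usual ReactiveMP `@rule` shape:

```julia
# Hypothetical sketch of one rule for a contracted gcv node, assuming it
# exposes edges :y, :x, :z, :κ and :ω. The actual closed-form updates
# would come from the "close the box" derivation referenced above.
@rule gcv(:y, Marginalisation) (q_x::Any, q_z::Any, q_κ::Any, q_ω::Any) = begin
    m_log_σ = mean(q_κ) * mean(q_z) + mean(q_ω)
    return NormalMeanVariance(mean(q_x), exp(m_log_σ))
end
```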

In short, this is a perfect way to dive into some nitty gritty details of RxInfer and the model construction in particular. We've made sure that all the mechanics are there to build this, but there is no exposed API available to the user yet. I hope this explanation points you in the right direction. If you have any questions, feel free to ask

blolt commented 1 month ago

Hey @wouterwln ! Looks like my intuition was way off! Thanks for the links and the incredibly thorough explanation. I looked through some of the relevant code this morning and will make some time to work through this more later this evening (EST). Thanks again!

blolt commented 1 month ago

Hey @wouterwln! I have had some more time to look at RxInfer and GraphPPL, as well as learning more about Julia. I have some questions and ideas about the proposed approach for this issue that I am hoping to discuss with you. Mostly, I just have questions about the use of parametric types in RxInfer.

> Parameterize the RxInferBackend with some extra parameter, contract or something like that, that will be either True() or False() from Static.jl, such that the type becomes RxInferBackend{T}.

I noticed that parameterization like this seems to be common throughout the ReactiveBayes packages, but I am having difficulty understanding why (for example, 18 parametric types on the RxInferenceEngine struct seems like overkill). I believe the backend we are looking to parameterize is defined here, but please let me know if this is wrong. Here is my understanding of how the implementation you recommend would work.

# new parameterized struct
struct ReactiveMPGraphPPLBackend{T}
    contract::T
end

# multiple dispatch implementation of NodeType provided by RxInfer user
function GraphPPL.NodeType(::ReactiveMPGraphPPLBackend{True}, ::typeof(gcv))
    # custom behaviour defined by user
    return GraphPPL.Atomic()
end

# ... infer changes (haven't gotten this far. The flow of data between infer and the graphPPL backend is not fully clear to me, but it seems to go through the model macro for each time we call infer via ~).

Does this match your suggestion? This is an odd usage of a "generic" (parametric) type that I have not seen before: it seems like we are only interested in using the type parameter as a boolean flag. Why not simply use a boolean value within the struct, like so?

struct ReactiveMPGraphPPLBackend
    contract::Bool

    ReactiveMPGraphPPLBackend() = new(true)
end

Parameterizing this struct has the downside of breaking 55 existing unit tests, which expect an unparameterized interface, whereas adding a boolean maintains the interface and allows all existing unit tests to pass. In general, it seems like good practice to be explicit about the types we are using where we can be, too.

I would appreciate better understanding why we see parameterization + multiple dispatch as the best approach here. Also, I hope this does not come across as pedantic. I understand that these are mostly implementation details that will be covered in review, but I think they get at some of the philosophy behind how this package is written, so I would appreciate understanding them from that perspective.

wouterwln commented 1 month ago

Hey @blolt , this is because of type stability issues. Let's look at the case that we put a contract bool inside the ReactiveMPGraphPPLBackend:

function GraphPPL.NodeType(backend::ReactiveMPGraphPPLBackend, ::typeof(gcv))
    if backend.contract
        return GraphPPL.Atomic()
    else
        return GraphPPL.Composite()
    end
end

Now this function is type unstable, because we cannot statically determine what the return type of this function will be. Type instability is quite a strange thing to get your head around when you first start using Julia (I know for one that I had some issues understanding it), but it boils down to this:

The Julia compiler, before executing any of your code, tries to statically determine the return type of any function. Note that we do not have access to the internal variables here, only the type of the parameters. This is done because if Julia knows all types of all inputs, the compiler can compile away the multiple dispatch (if all types are known, there should only be 1 candidate implementation we can run, and we should run that one), instead of having to search for a candidate function during runtime.

If we put contract inside the ReactiveMPGraphPPLBackend, the return type of NodeType is either GraphPPL.Atomic or GraphPPL.Composite, but the point is: without a specific instance of ReactiveMPGraphPPLBackend, we don't really know what the return type is. That means that, during model construction, we cannot call the right implementation of GraphPPL.make_node! (which has different behaviour for Atomic and Composite nodes) directly, but first have to check at runtime whether the node we want to create is Atomic or Composite, and perform dynamic multiple dispatch to call the right code. You can try this out in a dummy environment using only GraphPPL and a dummy custom backend (https://reactivebayes.github.io/GraphPPL.jl/stable/custom_backend/), although it is quite some work just to confirm that this would be slower. I'm willing to bet model construction would be roughly 10 times slower because of the type instability.

Now, this type parametrization is not a ReactiveBayes specific design pattern, but a common Julia practice. What we do is, when the type of one of the fields of a struct influences the behaviour of the object, we parameterize the type. The difference is this: When we have the following:

struct ReactiveMPGraphPPLBackend{T}
    contract::T
end

Whenever we create an instance of this type, the value of contract is actually statically known, since it is either True() or False(). This trick of having certain "values" known at compile time is exploited further in Static.jl, and is a common Julia idiom. Usually, you want to make as much information as possible known at compile time (as this reduces type instabilities), since, in my experience, type instabilities are usually the biggest culprit when tackling performance issues.
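Concretely, the pattern looks like this (a minimal sketch with made-up names, not RxInfer code):

```julia
using Static  # provides the True/False singleton types and static()

struct Backend{T}
    contract::T
end

# The flag is encoded in the backend's *type*, so the compiler can
# infer the return value without ever inspecting a runtime field
node_kind(::Backend{True})  = :atomic
node_kind(::Backend{False}) = :composite

b = Backend(static(true))  # b isa Backend{True}
node_kind(b)               # :atomic, resolved at compile time
```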

More info on type stability: https://www.juliabloggers.com/writing-type-stable-julia-code/

Package I use to track type instabilities: https://github.com/timholy/ProfileView.jl

Hope this helps!

blolt commented 1 month ago

Hey @wouterwln! I really appreciate your thorough answers here. The need for type stability here makes sense to me.

I have spent a good amount of time reading through the RxInfer and GraphPPL code now, but I remain blocked on understanding how to allow the user to toggle backend functionality through the infer function. Here are the changes / new methods I have made.

struct ReactiveMPGraphPPLBackend{T}
    shouldCreateCompositeNodes::T

    ReactiveMPGraphPPLBackend{T}() where {T} = new{T}(nothing)
    ReactiveMPGraphPPLBackend{T}(val::Bool) where {T} = new{T}(val)
end

macro model(model_specification)
    return esc(GraphPPL.model_macro_interior(ReactiveMPGraphPPLBackend{true}, model_specification))
end

function GraphPPL.instantiate(::Type{ReactiveMPGraphPPLBackend{True}})
    return ReactiveMPGraphPPLBackend{True}(true)
end

It seems like this should allow the user to implement something like,

GraphPPL.NodeType(::ReactiveMPGraphPPLBackend{True}, ::typeof(gcv)) = GraphPPL.Atomic()

Of course, it doesn't allow the user to define any behavior when the type is ReactiveMPGraphPPLBackend{False}. Unfortunately, I'm not sure how we could dynamically change the parameterization of the backend from the infer function, since it seems to be defined by the model macro. The only approaches I have thought of are somehow allowing the model_specification to parameterize the backend, or perhaps supplying a plugin that parameterizes the backend via the model generator we pass to infer (if that is possible). Please let me know what you think the right approach here might be.

Thank you!

wouterwln commented 4 weeks ago

Hi @blolt, I see now that the change I propose is not as straightforward as I made it seem. I've looked into it a bit more, and let me explain:

I don't think we should change the @model macro, since it should still work for any instance of ReactiveMPGraphPPLBackend. The main change I would propose would be here. Currently we call GraphPPL.with_plugins, and I propose we add a call to GraphPPL.with_backend (defined here) to attach a specific backend during model construction. If we add a keyword argument to the infer function that toggles between true and false, we can replace this line with _model = GraphPPL.with_backend(GraphPPL.with_plugins(model, modelplugins), ReactiveMPGraphPPLBackend(static(contract))).

Now, a really sharp reader will say: "But Wouter, this creation of the backend will be type unstable, since we still don't know the return type of ReactiveMPGraphPPLBackend(static(contract)) at compile time." However, instead of having a type-unstable call every time we create a node, we have one type-unstable call at the start of the procedure, which should not affect performance nearly as much as a type instability down the call chain. For the rest, the implementation should stay the same.
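Sketched out, the plumbing might look like the following. The keyword name contract and the surrounding scaffolding (including collect_plugins) are illustrative placeholders, not the actual RxInfer source:

```julia
# Hypothetical sketch of the proposed change inside `infer`
function infer(; model, data, contract::Bool = false, kwargs...)
    modelplugins = collect_plugins(kwargs)  # placeholder for the existing plugin logic
    # One type-unstable call up front; everything downstream then sees a
    # concrete ReactiveMPGraphPPLBackend{True} or ReactiveMPGraphPPLBackend{False}
    _model = GraphPPL.with_backend(
        GraphPPL.with_plugins(model, modelplugins),
        ReactiveMPGraphPPLBackend(static(contract)),
    )
    # ... rest of the existing inference procedure unchanged
end
```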

By the way, I think the following should be enough:

struct ReactiveMPGraphPPLBackend{T}
    shouldCreateCompositeNodes::T
end

to realize this change. @bvdmitri what do you think? I think with some smart dispatching we can make this entire procedure type stable, but it might put more of a burden on the user (for example, having to supply Static objects themselves instead of booleans).

bvdmitri commented 4 weeks ago

> but I remain blocked on understanding how to allow the user to toggle backend functionality through the infer function

> Unfortunately, I'm not sure how we could dynamically change the parameterization of the backend from the infer function, since it seems to be defined by the model macro.

It is relatively simple:

import RxInfer: ReactiveMPGraphPPLBackend
import GraphPPL: with_backend

using RxInfer, Random

n = 500  # Number of coin flips
p = 0.75 # Bias of a coin

distribution = Bernoulli(p) 
dataset      = rand(distribution, n)

@model function coin_model(y, a, b) 
    θ  ~ Beta(a, b)
    y .~ Bernoulli(θ) 
end

result = infer(
    model = with_backend(coin_model(a = 2.0, b = 7.0), ReactiveMPGraphPPLBackend()), # pass some extra args to the backend
    data  = (y = dataset, )
)

or if you prefer piped operation

result = infer(
   model = coin_model(a = 2.0, b = 7.0) |> (m) -> with_backend(m, ReactiveMPGraphPPLBackend()),
   data  = (y = dataset, )
)

Regarding type stability, writing ReactiveMPGraphPPLBackend(shouldContract = true) and using regular booleans shouldn’t pose significant issues. Julia has robust constant value propagation, so converting a regular boolean to a static boolean (True/False) inside the constructor is typically handled efficiently. E.g.:

julia> struct Hello{T} end

julia> foo(x) = x ? True() : False()

julia> bar(x) = Hello{foo(x)}()
bar (generic function with 1 method)

julia> bar(true)
Hello{static(true)}()

julia> using BenchmarkTools

julia> @btime bar(true)
  0.791 ns (0 allocations: 0 bytes)
Hello{static(true)}()

Even if type-unstable, it will only be so during creation time, and will have a concrete type when it reaches the infer function. This is a minor multiple dispatch issue and should not be a major concern.

The actual constructor of course should be parametrised with True/False in its type-signature in order to allow this:

GraphPPL.NodeType(::ReactiveMPGraphPPLBackend{True}, ::typeof(gcv)) = GraphPPL.Atomic()

As a side note, you could implement it like this:

allows_contraction(::ReactiveMPGraphPPLBackend{True}) = True()
allows_contraction(::ReactiveMPGraphPPLBackend{False}) = False()

GraphPPL.NodeType(backend::ReactiveMPGraphPPLBackend, node) = GraphPPL.NodeType(backend, allows_contraction(backend), node)

GraphPPL.NodeType(::ReactiveMPGraphPPLBackend, ::True, ::typeof(gcv)) = GraphPPL.Atomic()

This way, changing the order of type parameters (e.g. if someone in the future were to add an extra parameter, as in ReactiveMPGraphPPLBackend{T, True}) would only require adjusting the allows_contraction function, instead of all NodeType methods.
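To illustrate the payoff, suppose a future refactor changed the parameter layout; only the trait accessor would need updating (FutureBackend is a hypothetical stand-in):

```julia
using Static  # True/False singleton types

# Hypothetical future layout with an extra leading parameter:
struct FutureBackend{T, C}
    other::T
    shouldCreateCompositeNodes::C
end

# Only the trait accessor has to know the new parameter order...
allows_contraction(::FutureBackend{T, True}) where {T}  = True()
allows_contraction(::FutureBackend{T, False}) where {T} = False()

# ...while every rule written against the trait, in the style of
#   GraphPPL.NodeType(backend, allows_contraction(backend), node)
# keeps working untouched, since it dispatches on the trait value,
# not on the backend's parameter positions.
```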