Tractables / ProbabilisticCircuits.jl

Probabilistic Circuits from the Juice library
https://tractables.github.io/ProbabilisticCircuits.jl/dev
Apache License 2.0

Continuous input distributions #128

Open trappmartin opened 1 year ago

trappmartin commented 1 year ago

Hi, @ams939 and I just realized that there are no continuous input nodes. We would like to add code for Gaussian distributions. Is there anything we should keep in mind when doing so?

Thanks, Martin

trappmartin commented 1 year ago

@khosravipasha Do you have any suggestions on how best to code up the GPU-related bits?

khosravipasha commented 1 year ago

Hi, adding a Gaussian should be relatively straightforward. We mainly need to keep track of the sufficient statistics needed during training. We call the flow function to accumulate those sufficient statistics, and we also need to multiply them by the flow value, which is basically the probability of reaching that input node given input x. The formula would be similar to Equation 7 from the Einsum Networks paper (the difference is that we have a special case for missing values during training, so we also need to track the "missing flow").

image
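As a rough illustration of the kind of update this leads to (written out here as an assumption, not copied from the paper or from the library): let $f_n$ denote the flow reaching the input node for sample $x_n$, and let $O$ be the set of samples where the value is observed. Both symbols are notation introduced only for this sketch. For a Gaussian whose sigma is held fixed, the flow-weighted estimate of the mean would then be:

$$
\hat{\mu} = \frac{\sum_{n \in O} f_n \, x_n}{\sum_{n \in O} f_n}
$$

The flow for samples outside $O$ (missing values) would be accumulated separately as $\sum_{n \notin O} f_n$.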

I guess for a Gaussian with fixed sigma, we only need to track the sum of node_flow and the sum of node_flow * value (and if there might be missing data during training, we also want to add up the flow for missing values). See the example for the binomial here; it should be very similar for the Gaussian:

https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/27cb093439c8db5b6e59f75567800ff92d4fffa6/src/nodes/binomial_dist.jl#L70-L79
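To make the bookkeeping concrete, here is a minimal CPU-only sketch in Python of the three accumulators described above. It is not the library's API: `GaussianStats`, `accumulate`, and `updated_mean` are hypothetical names, and the choice to estimate the mean from the observed flow only (leaving the missing flow as a separate counter) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GaussianStats:
    """Hypothetical accumulator for a Gaussian input node with fixed sigma."""
    flow_sum: float = 0.0        # sum of node_flow over observed values
    flow_value_sum: float = 0.0  # sum of node_flow * value over observed values
    missing_flow: float = 0.0    # sum of node_flow over missing values

    def accumulate(self, value: Optional[float], node_flow: float) -> None:
        # A missing value (None) only contributes to the missing-flow counter.
        if value is None:
            self.missing_flow += node_flow
        else:
            self.flow_sum += node_flow
            self.flow_value_sum += node_flow * value

    def updated_mean(self, old_mean: float) -> float:
        # Assumption: the new mean is the flow-weighted average of the
        # observed values; keep the old mean if no flow was observed.
        if self.flow_sum == 0.0:
            return old_mean
        return self.flow_value_sum / self.flow_sum


# Example: two observed values and one missing value, each with its flow.
stats = GaussianStats()
for value, node_flow in [(1.0, 0.5), (3.0, 0.5), (None, 0.25)]:
    stats.accumulate(value, node_flow)

new_mean = stats.updated_mean(old_mean=0.0)
```

On a GPU the three additions would have to be atomic, since many samples update the same node concurrently; this sketch leaves that out.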

For GPU learning we need the following functions implemented:

Let me know if you run into any other issues; I will be happy to help.