TuringLang / AbstractPPL.jl

Common types and interfaces for probabilistic programming
MIT License

Feature requests and use cases #3

Closed mohamed82008 closed 1 year ago

mohamed82008 commented 5 years ago

Here are some features and use cases that @cscherrer brought up on Slack.

yebai commented 5 years ago

Thanks, @cscherrer. These are very nice features to have in a PPL like Turing/Soss. I think Julia provides an excellent platform for implementing these features, which is also why we picked Julia as the working language for Turing. Some features are already on Turing's TODO list (e.g. composability https://github.com/TuringLang/Turing.jl/issues/86, easy switch between prior, posterior, likelihood https://github.com/TuringLang/Turing.jl/issues/634). Other features might require changes on the compiler side (e.g. inlining data), or adding new inference algorithm support (e.g. variational inference, Kalman filtering), or utility functions (e.g. posterior predictive checks). It would be great to collaborate so that we can build a powerful PPL library in Julia.

mohamed82008 commented 5 years ago

About inlining data, I am not sure this is always a good idea. Inlining means that the code size scales up with the amount of data, so the compilation time also scales up the same way. For small datasets, it may give a decent speedup but for large-ish datasets, I am skeptical of the benefits.

Either way, I can think of 2 ways to (ideally) get this inlining behaviour in Julia without any changes on Turing's side:

  1. Use a Tuple for the data, and trust Julia's constant propagation (this may only apply if the whole thing is wrapped in a function; see the sketch at the end of this comment).
  2. Encode the data in the type system. For example:
    # Store the data itself as a type parameter so the compiler knows it;
    # getindex simply forwards into the inlined value.
    struct InlinedData{D} end
    Base.getindex(::InlinedData{D}, i...) where D = D[i...]

Then we can define and sample from the model using:

using Turing

@model gdemo(x) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x[1] ~ Normal(m, sqrt(s))
    x[2] ~ Normal(m, sqrt(s))
    return s, m
end

# Encode the observations in the type domain, then sample as usual.
data = (1.5, 2.0)
inlined_data = InlinedData{data}()
model = gdemo(inlined_data)
sample(model, HMC(1000, 0.1, 5))

While the above code should work right now in Turing, I am not sure if it performs as expected or not. I can look into it later.
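For completeness, option 1 might look something like the following (a hypothetical wrapper function, reusing the gdemo model from above; whether constant propagation actually kicks in here is untested):

# Hypothetical sketch for option 1: pass the data as a Tuple literal and wrap
# model construction and sampling in a function, so Julia's constant propagation
# has a chance to specialize the generated code on the data.
function fit_gdemo()
    data = (1.5, 2.0)   # Tuple, not a Vector
    model = gdemo(data)
    sample(model, HMC(1000, 0.1, 5))
end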

mohamed82008 commented 5 years ago

Another feature brought up in https://discourse.julialang.org/t/probabilistic-programming-repositories/19125/6 is custom transformations. IIUC, this should be possible with a callback, so we could give users the option of specifying the transformation function they want to use, which they can define themselves. Going further, the callback could accept the current state of the model, e.g. the already-sampled parameters and the data, as inputs, so an advanced user can condition the choice of the transformation on the data or on other random variables. @tpapp

tpapp commented 5 years ago

Can you give a mock example with the callback syntax? FWIW, I would just allow a function.

mohamed82008 commented 5 years ago

Yes, it will be a function. So something like:

@model logpdf_with_trans model_generator(x) = begin
   ....
end

where logpdf_with_trans is a function that does:

function logpdf_with_trans(dist, x, data)
    ....
end

dist is the distribution, x is the random variable, and data will be a named tuple of the input data. This is the rough idea. There may also be better syntax for this. You may want to do something special for a specific distribution and default to the standard Turing.logpdf_with_trans.
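For instance, a minimal sketch of such a callback (hypothetical names; this assumes the three-argument logpdf_with_trans(dist, x, transform::Bool) form for the default behaviour) could dispatch on a specific distribution type and fall back to the standard function otherwise:

using Turing, Distributions

# Hypothetical user-defined callback: special handling for one distribution type,
# falling back to the standard Turing.logpdf_with_trans for everything else.
my_logpdf_with_trans(dist::LogNormal, x, data) = logpdf(dist, x)  # e.g. skip the transform here
my_logpdf_with_trans(dist, x, data) = Turing.logpdf_with_trans(dist, x, true)

This could then be attached to a model via the @model my_logpdf_with_trans model_generator(x) = begin ... end syntax sketched above.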

trappmartin commented 5 years ago

If I understand correctly, the custom transformations feature request is actually a request for Bijectors.jl? Maybe we could provide a nice interface in Bijectors.jl that is flexible enough to allow users to define custom transformations? I'm not convinced this has to be part of Turing.

mohamed82008 commented 5 years ago

That's another option, I guess, and any data dependence can be sorted out outside the model definition. An MWE would help us make the best call.

tpapp commented 5 years ago

Just to clarify: with custom transformations, in my experience the costly and nontrivial part of the computation is the log Jacobian adjustment. You can use AD, but it is easy to run into type stability issues unless you are careful when configuring ForwardDiff, and even then it can be very slow.

The best route I have found for complex models is to have the user code both the transformation and the log abs det of the Jacobian, and provide a function to verify this using ForwardDiff.
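A minimal sketch of that pattern (illustrative names, not an existing API), with ForwardDiff used only to verify the hand-coded log Jacobian:

using ForwardDiff, LinearAlgebra

# Hand-coded transformation from unconstrained y to positive x, with its
# hand-coded log abs det Jacobian: log |det d(exp.(y))/dy| = sum(y).
transform(y) = exp.(y)
logjac(y) = sum(y)

# Check the hand-coded log Jacobian against AD at a given point.
function check_logjac(transform, logjac, y)
    J = ForwardDiff.jacobian(transform, y)
    isapprox(logjac(y), logabsdet(J)[1]; atol = 1e-8)
end

check_logjac(transform, logjac, randn(3))  # should return true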

trappmartin commented 5 years ago

@tpapp Have you had a look at https://github.com/TuringLang/Bijectors.jl? I think it does what you are concerned about, but it currently doesn't provide an interface for easily adding custom transformations. @yebai and @willtebbutt have more experience on this.

mohamed82008 commented 5 years ago

Moving away from AD and making that the default is an option, but it may require a significant redesign of Turing's and Bijectors' internals.

tpapp commented 5 years ago

@trappmartin: yes, I have, it seems to contain manually coded Jacobians.

Also, for some fairly complicated transformations, intermediate results can also be useful for inference, so it is difficult to separate transformation and likelihood.

mohamed82008 commented 5 years ago

> Also, for some fairly complicated transformations, intermediate results can also be useful for inference, so it is difficult to separate transformation and likelihood.

Could you give an example?

datnamer commented 5 years ago

Have you all seen the design of Edward2? https://arxiv.org/pdf/1811.02091.pdf

@cscherrer

denisshepelin commented 5 years ago

Hi, it would be very interesting to me to see how deterministic equations can be embedded into a model and what the requirements (Jacobian adjustment?) for that are. An example of a similar issue can be found here: https://docs.pymc.io/Advanced_usage_of_Theano_in_PyMC3.html?highlight=root. There the authors describe how to write a function that computes the root of another function and use it in a probabilistic model. Thank you!

mohamed82008 commented 5 years ago

Hi @denisshepelin, I think any generic enough pure Julia function should already be supported in Turing. The usual caveats apply, though, e.g. make sure the function is (sub-)differentiable in the mathematical sense if you want to use a Hamiltonian sampler. Please try out your use case and, if it doesn't work, report it in an issue or on the Julia Slack #turing channel.
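For example, a minimal sketch (a hypothetical model, using the same @model syntax and sampler call as earlier in this thread) embedding a deterministic pure-Julia computation in a model:

using Turing

# Hypothetical deterministic step: the positive root of x^2 - k = 0.
positive_root(k) = sqrt(k)

@model rootdemo(y) = begin
    k ~ Gamma(2.0, 2.0)          # prior on the parameter of the equation
    x = positive_root(k)         # deterministic, differentiable in k
    y ~ Normal(x, 0.1)           # observation noise around the root
end

sample(rootdemo(1.3), HMC(1000, 0.1, 5))

If the deterministic step were an iterative root finder instead of a closed form, the same pattern should apply as long as the result is differentiable with respect to the parameters.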

femtomc commented 3 years ago

Tracking this thread too :) Will add some comments later. I'm assuming it's being revived?

cscherrer commented 3 years ago

I'd forgotten we had talked about inlining. I guess you may all have seen it by now, but I presented some progress on that idea here: https://informativeprior.com/blog/2021/01-25-symbolic-simplification/