liesel-devs / liesel

A probabilistic programming framework
https://liesel-project.org
MIT License

Evaluate what parts of the Bayesian Workflow to cover #56

Closed. jobrachem closed this issue 1 year ago

jobrachem commented 1 year ago

Arose from

Bayesian Workflow paper: https://arxiv.org/abs/2011.01808

GianmarcoCallegher commented 1 year ago

Can we close this issue?

GianmarcoCallegher commented 1 year ago

On 07.06, we will discuss this issue again

GianmarcoCallegher commented 1 year ago

The evaluation of the Bayesian Workflow took place on 15.06. We went through the list of aspects that could become features of Liesel, commented on them, and gave our opinions on their priority for the next months/semesters. Here they are:

What aspects of the Bayesian workflow could become features of Liesel?

  1. Prior and posterior predictive simulation (Section 2.4 and 6.1)
    • There are some theoretical concerns, but the functions will undoubtedly be useful in many situations
    • Dr. Hannes has done previous work on this, which will be added to main sooner or later
  2. Pathfinder, better initial values (Section 3.1)
    • A VI approach to find initial values; alternatively, some other optimization algorithm (+ jitter) could be implemented
    • The consensus is that some kind of optimization algorithm for Liesel models would be helpful and could be based on optax or jaxopt
  3. Faster, approximate inference algorithms to speed up model building (e.g. VI, Section 3.3)
    • Closely related to the implementation of a simple optimization algorithm, but the long-term goal could be something like a Goose equivalent for VI
    • The long-term goal is probably not realistic for the four of us within the next couple of months/semesters
    • Sebastian's master's thesis is related, so let's wait for his results
  4. Run MCMC until $\hat{R} < 1 + \varepsilon$ (Section 3.2)
    • Could be implemented as a callback in Goose that checks "some" criterion, decides whether to stop the sampling, and communicates the reason
    • No one depends on this critically, but it would be really nice to have and could be a real time-saver/convenience
  5. Early stopping/failure criteria in Goose (Section 3.4)
    • Could be implemented like (4), but here the checks are model-specific or kernel-specific rather than chain-specific
  6. Stacking to reweight poorly mixing chains (Section 5.5)
    • Addressing the question: How can we learn the most from a given MCMC fit even though the chains/mixing are not perfect?
    • Should be relatively easy, maybe a student could work on it in the Statistical Practical or the Advanced Bayes seminar
  7. Automatic marginalization of discrete parameters (Section 5.8)
    • How often do we encounter discrete parameters? At least sometimes: Spike & Slab, Tensor Product Interactions, Species Occupancy
    • What is the cost of the alternatives, i.e. manual marginalization, and how often is automatic marginalization even possible?
    • Not very high up on our priorities
  8. Cross-validation (Section 6.2)
    • Is this a task for Liesel?
    • The design could be based on something like the DataLoader from PyTorch
    • Could be implemented as a wrapper for the DataLoader in a Liesel node/variable
    • Gianmarco has a related implementation that he uses for VI
  9. Working with multiple models, model stacking and averaging (Section 8.2)
    • None of us has ever worked with model averaging etc., so it's probably not a priority for us
    • But philosophically it seems nice
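To make point (1) a bit more concrete: here is a minimal prior predictive simulation for a toy normal model in plain NumPy. The model and function name are made up for illustration; a Liesel version would presumably draw the samples by traversing the model graph instead.

```python
import numpy as np

def prior_predictive(rng, n_draws, n_obs):
    """Draw datasets from the prior predictive distribution of a toy
    model: beta ~ N(0, 10), sigma ~ Exp(1), y_i ~ N(beta, sigma).
    (Illustrative stand-in, not the Liesel API.)"""
    beta = rng.normal(0.0, 10.0, size=n_draws)    # prior draws for the mean
    sigma = rng.exponential(1.0, size=n_draws)    # prior draws for the scale
    # one simulated dataset of n_obs observations per prior draw
    y = rng.normal(beta[:, None], sigma[:, None], size=(n_draws, n_obs))
    return y

rng = np.random.default_rng(0)
y_sim = prior_predictive(rng, n_draws=1000, n_obs=50)  # shape (1000, 50)
```

Posterior predictive simulation would look the same, except that `beta` and `sigma` come from MCMC samples rather than the prior.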
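For point (2), an actual implementation would likely build on optax or jaxopt as discussed above; the sketch below is just a hand-rolled gradient-descent stand-in on a toy target, to show the shape of an "optimize, then start MCMC there" helper.

```python
import numpy as np

def find_initial_values(grad, x0, lr=0.05, steps=500):
    """Plain gradient descent on the negative log density to find a
    reasonable MCMC starting point. A real implementation would use
    an optax/jaxopt optimizer (possibly plus jitter across chains)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * grad(x)  # step downhill on the negative log density
    return x

# Toy target: standard normal, so the mode (and MAP) is at 0
grad_neg_log_prob = lambda x: x
x_init = find_initial_values(grad_neg_log_prob, x0=np.array([5.0, -3.0]))
```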
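And a rough sketch of the stopping criterion from point (4): a split-R-hat computation (per Gelman et al.) plus a check against the 1 + ε threshold. A Goose callback would wrap something like this; the function names here are invented for illustration.

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat for one parameter; chains has shape (n_chains, n_samples).
    Each chain is split in half so poor within-chain mixing is also detected."""
    n_chains, n_samples = chains.shape
    half = n_samples // 2
    splits = chains[:, : 2 * half].reshape(2 * n_chains, half)
    chain_means = splits.mean(axis=1)
    w = splits.var(axis=1, ddof=1).mean()   # within-chain variance
    b = half * chain_means.var(ddof=1)      # between-chain variance
    var_plus = (half - 1) / half * w + b / half
    return np.sqrt(var_plus / w)

def converged(chains, eps=0.01):
    """Stopping rule: keep sampling until R-hat < 1 + eps."""
    return split_rhat(chains) < 1 + eps

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 4000))          # well-mixed chains: R-hat near 1
stuck = mixed + np.arange(4)[:, None]       # chains stuck at different levels
```

On top of this, the callback would only need to decide when to check (e.g. every k iterations) and how to report why sampling stopped.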

Thank you @hriebl