Update example notebooks

bartvanerp commented 1 year ago

Following up on #94 and our own observations about user-friendliness of the documentation, we need to improve the current examples, such that they do not only include code, but more a story line. We should update the examples such that they include the following points:

problem statement (narrative)
model specification (incl. factor graph)
inference (simulations)
results Note: if the model includes a specific/advanced node, such as AR, Mixture, etc, then the nodes should be clearly explained and documented with clear references to the node specification and rules in the code base.

For this issue we require the notebooks to be updated to a high level, such that we do not have to update them again in a couple months.

[ ] Coin toss model (Beta-Bernoulli) (@Sepideh-Adamiat )
[ ] Bayesian Linear Regression
[ ] Active Inference Mountain car
[ ] Chance-Constrained Active Inference
[ ] Assessing People's Skills
[ ] Gaussian Linear Dynamical System
[ ] Ensemble Learning of a Hidden Markov Model
[ ] Autoregressive Model (@albertpod )
[ ] Hierarchical Gaussian Filter
[ ] Bayesian ARMA model (@albertpod )
[ ] Infinite Data Stream(@Chengfeng-Jia)
[ ] System Identification Problem
[ ] Univariate Gaussian Mixture Model
[ ] Multivariate Gaussian Mixture Model
[ ] Gamma Mixture Model (@albertpod )
[ ] Universal Mixtures (@bartvanerp @albertpod)
[ ] Global Parameter Optimisation
[ ] Invertible neural networks: a tutorial (@bartvanerp )
[ ] Conjugate-Computational Variational Message Passing (CVI)
[ ] Solve GP regression by SDE (@HoangMHNguyen )
[ ] Nonlinear Smoothing: Noisy Pendulum (@Sepideh-Adamiat )
[ ] Nonlinear Smoothing: Rabbit Population
[ ] Nonlinear Virus Spread
[ ] Nonlinear Sensor Fusion (@bartvanerp)
[ ] Kalman filter with LSTM network driven dynamic(@Chengfeng-Jia)
[ ] Handling Missing Data (@bvdmitri )
[ ] Custom Nonlinear Node (@bvdmitri)
[ ] Probit Model (EP) (@HoangMHNguyen )
[ ] RTS vs BIFM Smoothing (@bartvanerp)
[ ] Advanced Tutorial (@bvdmitri )

bertdv commented 1 year ago

A suggestion on these examples.

I see 30 examples above. We have so many examples that refactoring the examples requires effort from everybody. The fact that we need to get all involved is a big warning sign to me. Once @wouterwln is ready with GraphPPL updates, are we going to ask everybody to update the examples again? And are we going to keep adding interesting examples? I am all for examples, but do we really need 30 examples to make clear how this stuff works? In principle, the recipe is simply (1) model specification, then (2) do inference. In my opinion, If we really need 30 examples to explain how RxInfer works, then something is wrong with RxInfer and we need to make RxInfer more user-friendly, rather than adding a very elaborate set of examples. I am just afraid that our code base gets too large for maintenance.

Here's my suggestion:

How about if we select the 10 most instructive examples that contain (>90% of) interesting patterns in model and inference specification and include these examples in the base code of RxInfer. These 10 examples should be very well documented and explained, why you do things in a particular way, and why an alternative way would not have worked. They need to be maintained meticulously, as they are key examples to teach developers how to work with RxInfer.

Then we can have a second set of 20 (or more) examples that do not belong to the base code of RxInfer. They can be interesting by themselves and we should keep them running. But I would put them in a separate package, eg, RxInferExampleZoo, that is optional for download. In general, when developing a toolbox, it is more important to keep our code base adaptable, than to keep it comprehensive.

Your feedback welcome!

albertpod commented 1 year ago

@bertdv, thanks for your feedback on the issue.

I like the RxInferZoo.jl idea, however, creating a separate package would require even more effort than just adding a narrative to already existing examples. We don't need to add more examples to the package for now. However, we need to either (1) remove or (2) add support the existing demos with a little bit of text. IMHO, for PhD students from our lab, it shouldn't require too much time and effort. Again, I want to remind everyone that committing to any of those bullets is not mandatory.

Please have a look at the demo on EP. This is an important demo but has no explanations or whatsoever.

As for @wouterwln changes in GraphPPL.jl, once it's ready, I expect the model specification to go under refactoring for each existing example (removing datavars and randomvars), but it won't touch the narrative or inference.

Also, please see a related issue #97, which can be viewed as the first step towards RxInferZoo.jl.

bartvanerp commented 1 year ago

I definitely agree with the point that quality is more important than quantity. For users the examples that we provide should be clear and should help to a thorough understanding of the toolbox. As we have noted in the past trough various accounts of feedback, transitioning to the modelling approach that RxInfer assumes, can be quite challenging. This problem can be alleviated by 1) universal inference with just a plug and play functionality 2) better guidance to users on how to tackle problems (paraphrasing #94). The first problem is more research based and takes longer to get working whereas the second point can be implemented already.

Yes, the notebooks should contain at its core the model specification and inference section, however, it should be clear to the reader what problem the notebook actually tries to solve. Without a proper problem statement the model specification and inference are obsolete. Now all notebooks phrase the problem statement very implicitly or not at all, therefore it will not be clear for outsiders what we are actually trying to achieve in a notebook. As @albertpod mentioned, just open 5 notebooks and you will see that likely none of them provide a clear storyline. The code currently provides the story, however, for new users the code does not (yet) make sense.

The amount of examples should indeed be reduced, definitely. There are currently about 14 node-specific notebooks, which should be transferred either to a separated part of the docs or somewhere else. With #97 we will look at the notebooks that are currently there and distill a list of basic examples that should serve as a crash-course of RxInfer

With the restructuring of GraphPPL I think the changes will be minimal and can be incorporated relatively quickly. This issue focusses mainly on increasing the quality of the notebooks.

albertpod commented 1 year ago

Related #97

albertpod commented 1 year ago

Close in favor of https://github.com/biaslab/RxInfer.jl/issues/97

ReactiveBayes / RxInfer.jl

Update example notebooks #98