ppdebreuck / modnet

MODNet: a framework for machine learning materials properties
MIT License

How to sample from the posterior distribution using the Bayesian module? #73

Open sgbaird opened 2 years ago

sgbaird commented 2 years ago

I didn't notice anything obvious for this within modnet/models/bayesian.py.

ppdebreuck commented 2 years ago

The Bayesian module is somewhat deprecated, as the results didn't turn out as well as hoped, and the EnsembleMODNetModel should be preferred.

Nevertheless, sampling can be done using model.predict(x) (a single outcome, which should be repeated to form a Monte-Carlo experiment) or model(x), which gives a tensorflow_probability.distributions.Distribution. Here model is the Keras model (not the MODNet one). Currently, the BayesianMODNetModel takes 1000 samples and computes the mean and std.

https://github.com/ppdebreuck/modnet/blob/6a5b139f3771201ca1e70e6da1531d66b1dd265c/modnet/models/bayesian.py#L316-L320
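As a rough illustration of the Monte-Carlo approach described above, here is a minimal sketch in plain Python. It is not MODNet code: `stochastic_predict` is a hypothetical stand-in for one stochastic forward pass such as model.predict(x), and its toy noise model is an assumption made only so the example runs on its own.

```python
import random
import statistics


def stochastic_predict(x, rng):
    """Hypothetical stand-in for one stochastic forward pass, e.g. model.predict(x)."""
    # Toy "posterior" (assumption): the prediction is 2*x plus Gaussian noise, std 0.5.
    return 2.0 * x + rng.gauss(0.0, 0.5)


def mc_predict(predict_fn, x, n_samples=1000, seed=0):
    """Repeat a stochastic prediction and summarise the draws by mean and std."""
    rng = random.Random(seed)
    draws = [predict_fn(x, rng) for _ in range(n_samples)]
    return statistics.fmean(draws), statistics.stdev(draws)


mean, std = mc_predict(stochastic_predict, x=3.0)
print(f"mean={mean:.3f}, std={std:.3f}")
```

With a real Bayesian Keras model, `predict_fn` would wrap the model call instead of the toy function. The default of 1000 draws mirrors the sample count mentioned above.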

Anyway, if you have ideas for improvements, or successful applications of this module, I would be happy to discuss!

sgbaird commented 2 years ago

Ah, gotcha. Thank you! That's too bad about the results. I've been looking into converting (my fork of) CrabNet into a Bayesian transformer network via e.g. BayesFormers, but it seems like it could be pretty involved, so I was pleased to see MODNet had some Bayesian facilities already in place.

Have you thought about trying bayesian.py, ensemble.py, and FitGenetic under a single routine (i.e. "ensemble of genetically optimized Bayesian models")? In this case, sampling from the posterior might just look like taking the average of a single posterior sample from each model within the ensemble.
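To make the idea concrete, here is a minimal sketch in plain Python of what "one posterior sample per ensemble member, then average" could look like. Everything in it is hypothetical: `make_member` builds toy stochastic models standing in for individually trained Bayesian models, and none of the names correspond to the MODNet API.

```python
import random
import statistics


def make_member(bias, noise):
    """Build a hypothetical stochastic model: each call returns one posterior draw."""
    def member(x, rng):
        # Toy model (assumption): 2*x plus a member-specific bias and Gaussian noise.
        return 2.0 * x + bias + rng.gauss(0.0, noise)
    return member


def ensemble_posterior_sample(members, x, rng):
    """Take one draw from every member and average them into one ensemble sample."""
    return statistics.fmean(m(x, rng) for m in members)


rng = random.Random(0)
members = [make_member(bias, 0.5) for bias in (-0.2, 0.0, 0.2)]

# Repeating the ensemble draw gives an empirical distribution to summarise.
samples = [ensemble_posterior_sample(members, 3.0, rng) for _ in range(1000)]
ens_mean = statistics.fmean(samples)
ens_std = statistics.stdev(samples)
print(f"ensemble mean={ens_mean:.3f}, std={ens_std:.3f}")
```

One caveat with this design: averaging independent draws shrinks the spread of the ensemble sample relative to any single member, so pooling the members' draws instead of averaging them might be worth comparing.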

ppdebreuck commented 2 years ago

No, I haven't spent much time on the Bayesian model, to be honest. It might be something to explore in the future...