stefanradev93 / BayesFlow

A Python library for amortized Bayesian workflows using generative neural networks.
https://bayesflow.org/
MIT License

should I continue to try with more ACBs? #105

Closed Shuwan-Wang closed 11 months ago

Shuwan-Wang commented 11 months ago

Hi Stefan et al.,

I am training a model with 4 ACBs, and after training, the loss no longer seems to be going down. But in validation, the posterior samples don't get close to the true parameters.

So I wonder if I should try more ACBs or something else. Do you have any suggestions?

Thank you so much! Best, Shuwan

stefanradev93 commented 11 months ago

Hi Shuwan,

I take it the installation worked and I can close the other issue?

It is hard to give generic advice without knowing more about your setup, but if you think the inference network is underperforming, you can always use more coupling layers or switch to spline flows (coupling_design='spline').
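For instance, a minimal configuration sketch (assuming the BayesFlow 1.x API, where `InvertibleNetwork` takes `num_coupling_layers` and `coupling_design`; the numbers below are placeholders, not recommendations):

```python
import bayesflow as bf

# Sketch of a deeper, more expressive inference network.
# Parameter names assume the BayesFlow 1.x API; check your installed version.
inference_net = bf.networks.InvertibleNetwork(
    num_params=4,               # number of model parameters to infer
    num_coupling_layers=8,      # more coupling layers than the default
    coupling_design="spline",   # spline couplings instead of affine ones
)
```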

Shuwan-Wang commented 11 months ago

Hi, yes, it turned out to be a compatibility issue with TensorFlow, but it is solved now!

Thank you so much!! Best, Shuwan

Shuwan-Wang commented 11 months ago

Hi Stefan et al.,

I have a big concern when implementing BayesFlow. In the paper there is an example (the Ricker model) where all parameters have positive uniform priors, and the posterior draws for all parameters remain positive and within the prior range. But when I try just a simple regression where sigma^2 has a positive uniform prior, my posterior draws for sigma^2 are not all positive. How can I make my posterior draws stay within the assumed range? Does this depend on the NN structure? Can you give me some advice?

Thank you so much! Best, Shuwan

stefanradev93 commented 11 months ago

This simply means that the network is still not good enough. Since this is a common problem in practice, there are two straightforward ways to deal with out-of-bounds samples:

  1. Reject the samples, as they have zero density under the prior (a small rejection sketch follows below).
  2. Transform bounded parameters into unbounded ones, e.g. learn the log variance instead of the variance. This is common practice with many methods.
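As a rough illustration of option 1, the sketch below filters posterior draws against hypothetical prior bounds with plain NumPy; the bounds and shapes are placeholders, not taken from this thread:

```python
import numpy as np

def reject_out_of_bounds(posterior_draws, lower, upper):
    """Keep only draws inside the prior box, i.e. with nonzero prior density.

    posterior_draws: array of shape (num_draws, num_params)
    lower, upper:    per-parameter prior bounds (placeholder values below)
    """
    inside = np.all((posterior_draws >= lower) & (posterior_draws <= upper), axis=1)
    return posterior_draws[inside]

# Hypothetical bounds for a 4-parameter model with positive uniform priors
lower = np.zeros(4)
upper = np.array([10.0, 10.0, 10.0, 5.0])
# kept = reject_out_of_bounds(samples, lower, upper)
```
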
Shuwan-Wang commented 11 months ago

So if the network is good enough to transform the latent variables Z to follow N(0,1), the amortized posterior draws will fall within the proper range, right? About the first option: did you discard/reject posterior samples in the examples shown in the paper?

Thank you so much! Best, Shuwan

stefanradev93 commented 11 months ago

Yes, exactly. The models in the paper were well trained. In the OutbreakFlow paper, I did reject out-of-bounds samples, as that model was more challenging. In practice, we always suggest the second option: transforming the bounded parameters (e.g. variance) into an unbounded space (e.g. log variance).
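A minimal sketch of that second option for a simple regression: sample and simulate on the log-variance scale so the inference network only ever sees an unbounded parameter, then exponentiate the posterior draws afterwards. All names, bounds, and distributions here are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng()

def prior():
    """Work in an unbounded space: log variance instead of variance."""
    beta = rng.normal(0.0, 1.0, size=2)            # regression coefficients
    log_sigma2 = np.log(rng.uniform(0.01, 5.0))    # bounded draw mapped to log space
    return np.append(beta, log_sigma2)

def simulator(theta, n_obs=50):
    """Simple linear regression; the network only ever sees log_sigma2."""
    beta, log_sigma2 = theta[:2], theta[-1]
    sigma = np.sqrt(np.exp(log_sigma2))
    x = rng.normal(size=(n_obs, 2))
    y = x @ beta + rng.normal(0.0, sigma, size=n_obs)
    return np.column_stack([x, y])

# After drawing from the amortized posterior, map back to the original scale:
# sigma2_draws = np.exp(posterior_draws[:, -1])    # guaranteed positive
```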

Shuwan-Wang commented 11 months ago

Hi Stefan,

Thank you so much! That is really helpful! In my practice, I examine the latent variables Z to see if they are well trained yet. [Attached plots: GP_3ACB_Z1, GP_3ACB_Z2] The plots compare them to N(0,1); clearly, one variable was closer, but the other one seems overtrained, right?

I used 3 ACBs for the NN; do you think it's overtrained? Should I go down to 2 ACBs? Do you have any suggestions?

Thank you so much! Best, Shuwan

stefanradev93 commented 11 months ago

If you are concerned with overfitting, you could inspect the train / validation losses for any discrepancy, as in more traditional deep learning approaches. As long as the networks are not overfitting, they cannot be "overtrained"; they most likely need more expressiveness to shape the latent space.
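
As an illustration, here is a small helper for comparing per-epoch training and validation losses (the loss sequences are assumed to come from whatever history object your BayesFlow trainer returns; the exact format varies by version):

```python
import matplotlib.pyplot as plt

def plot_train_val_losses(train_losses, val_losses):
    """Compare per-epoch training and validation loss to check for overfitting.

    Both arguments are plain sequences of floats, extracted from whatever
    history your trainer returns (format varies across BayesFlow versions).
    """
    plt.plot(train_losses, label="training loss")
    plt.plot(val_losses, label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()

# A widening gap between the two curves suggests overfitting; curves that
# plateau together at a high loss suggest the inference network needs more
# expressiveness (e.g. more coupling layers or spline couplings).
```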