ICB-DCM / pyABC

distributed, likelihood-free inference
https://pyabc.rtfd.io
BSD 3-Clause "New" or "Revised" License

AssertionError: The total population weight is zero #560

Closed by sgrzegorz 2 years ago

sgrzegorz commented 2 years ago

I am using a system of differential equations from https://pubmed.ncbi.nlm.nih.gov/22761472/ to predict cancer development. I implemented the model from the article together with its parameters (weights), then integrated it with odeint to generate observation data.

Now I "forget" the weights and try to recover them from the observation data using the pyabc library. I get "AssertionError: The total population weight is zero" (full error msg), and I get it no matter how I modify the distance function, the model, or the prior parameters.

Increasing population_size does not help. A similar issue was reported at https://github.com/ICB-DCM/pyABC/issues/534, but I did not find a solution to my problem there.

Code: https://github.com/sgrzegorz/ribba

ModelTraining3.py: pyabc code
CancerModelClass.py: model of differential equations
demo.py: generates artificial data and saves it to sztucznyDemo.csv

yannikschaelte commented 2 years ago

I have replicated the problem using synthetic data, observation = model(prior.rvs()). It appears that in your case the ratio prior/transition, which gives the importance sampling weights, becomes very small: the prior density is constant at 2.38e-6, while the transition kernel density reaches roughly 1e0 to 1e4 because the transition concentrates in high dimensions.
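To make the scale mismatch concrete, here is a minimal sketch of how the unnormalized weight prior/transition shrinks as a Gaussian kernel narrows. The prior density 2.38e-6 is the value quoted above; the dimension, the kernel widths, and the helper names are hypothetical and chosen only for illustration. This is not pyABC's own weight computation.

```python
import math

PRIOR_DENSITY = 2.38e-6  # constant prior density quoted in the comment above


def peak_density(sigma: float, dim: int) -> float:
    """Density at the mean of an isotropic Gaussian with std `sigma` in `dim` dimensions."""
    return (2.0 * math.pi * sigma**2) ** (-dim / 2.0)


def importance_weight(sigma: float, dim: int) -> float:
    """Unnormalized weight prior / transition for a particle at the kernel mean."""
    return PRIOR_DENSITY / peak_density(sigma, dim)


# Hypothetical dimension and kernel widths, for illustration only.
DIM = 6
weights = {sigma: importance_weight(sigma, DIM) for sigma in (1.0, 0.1, 0.01)}
for sigma, w in weights.items():
    print(f"sigma={sigma:g}: weight={w:.3e}")
```

Each tenfold narrowing of the kernel multiplies its peak density by 10**DIM, so the weights fall by the same factor and move quickly towards values that are hard to tell apart from zero.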

We will very soon modify this piece of code to only give a warning in that case, and throw an error only if the total weight is really zero. One could conceptually bring priors and transition densities onto the same scale, which would however require further modifications.
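The behaviour described here (warn when the total weight is tiny, fail only when it is exactly zero) could look roughly like the sketch below. The function name `normalize_weights` and the threshold `warn_below` are assumptions made for this example; this is not the code that went into pyABC.

```python
import warnings


def normalize_weights(weights, warn_below=1e-100):
    """Normalize weights to sum to one (hypothetical helper, not pyABC code).

    Raises only if the total is exactly zero; warns if it is merely tiny.
    """
    total = sum(weights)
    if total == 0:
        raise AssertionError("The total population weight is zero")
    if total < warn_below:
        warnings.warn(f"The total population weight {total:.3e} is very small")
    return [w / total for w in weights]


print(normalize_weights([1.0, 3.0]))
```

Tiny but nonzero totals still normalize to a valid distribution, which is why a warning is enough in that case.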

I think that in your case the problem is that the posterior has essentially converged already, leading to very narrow diagonals in the 2d plots of (non-identifiable) parameters, which the multivariate normal transition kernel picks up as effectively a lower-dimensional distribution with zero variance in some directions. This means that you can stop your analysis before reaching this point, after which no further insights can be gained (or extract from the database whatever result you had before things crashed).

However, unless you are sure you have no noise in your data, you may in general want to account for measurement noise in your (so far deterministic) model, see e.g. our paper https://academic.oup.com/bioinformatics/article/36/Supplement_1/i551/5870512. If you do so, the zero population weight problem should no longer occur anyway.
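The zero-variance direction mentioned above can be illustrated with a small sketch. It assumes two hypothetical parameters a and b of which only the sum a + b = 2 is identifiable, so all accepted particles lie on a line. The sample covariance of such particles is singular, which is what makes a Gaussian kernel fitted to them degenerate.

```python
def covariance_2d(xs, ys):
    """Return (var_x, var_y, cov_xy) using the unbiased sample estimator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, var_y, cov_xy


# Hypothetical particles: only a + b = 2 is constrained by the data.
a = [0.0, 0.5, 1.0, 1.5, 2.0]
b = [2.0 - x for x in a]

var_a, var_b, cov_ab = covariance_2d(a, b)
det = var_a * var_b - cov_ab**2  # determinant of the 2x2 covariance matrix
print(f"var_a={var_a}, var_b={var_b}, cov_ab={cov_ab}, det={det}")
```

Each parameter on its own still has a clearly positive variance, but the determinant vanishes, so the fitted density is concentrated on a line and its value on that line grows without bound as the fit tightens.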

yannikschaelte commented 2 years ago

Changes have been implemented in develop branch, via #563

sgrzegorz commented 2 years ago

Thank you for your quick response and explanations.

1) Quick fix. For the time being (on my machine) I have changed raise AssertionError("The total population weight is zero") to print("The total population weight is zero"). When this message appears I just ignore it and wait for the simulation to end. Then:

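import pyabc

# `history` is the History object returned by the pyabc run
posterior = pyabc.MultivariateNormalTransition()
posterior.fit(*history.get_distribution(m=0))  # fit to the last weighted population
t_params = posterior.rvs()  # draw one parameter sample
print(t_params)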

The ground truth plots and the plots of the ODE solution with the parameters (t_params) estimated by pyabc look the same.

2) The correct way would be to add noise to the ground truth, e.g.:

import numpy as np

mu, sigma = 0, 0.001
noise = np.random.normal(mu, sigma, len(df.P))
df["P"] = df["P"] + noise  # assumed intent: perturb the observed P values

Afterwards I should use pyabc with a transition kernel, and then the zero population weight problem should disappear.
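As a rough illustration of what accounting for measurement noise means, the sketch below evaluates the log-likelihood of observed data given a deterministic simulation under independent normal noise. The independent normal noise model is an assumption made for this example, sigma = 0.001 is taken from the snippet above, and the function name and data values are hypothetical. It does not show pyabc's own noise handling.

```python
import math


def normal_log_likelihood(observed, simulated, sigma):
    """Log-density of `observed` given `simulated` under i.i.d. N(0, sigma^2) noise."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    n = len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return -0.5 * n * math.log(2.0 * math.pi * sigma**2) - squared_error / (2.0 * sigma**2)


SIGMA = 0.001  # noise standard deviation from the snippet above

# Hypothetical data: one noisy observation series and two candidate simulations.
observed = [1.000, 1.502, 1.999]
good_fit = [1.0, 1.5, 2.0]
poor_fit = [1.0, 1.6, 2.1]

print(normal_log_likelihood(observed, good_fit, SIGMA))
print(normal_log_likelihood(observed, poor_fit, SIGMA))
```

With noise in the model, a simulation no longer has to reproduce the data exactly to receive a nonzero weight, so candidate parameters are ranked smoothly by how well they fit.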

yannikschaelte commented 2 years ago

Sounds about right! Note that if you have a deterministic model, you may also want to explore likelihood-based alternatives to ABC, such as pymc3, emcee, or pypesto. Closing for the moment; feel free to re-open if issues persist.