acerbilab / pyvbmc

PyVBMC: Variational Bayesian Monte Carlo algorithm for posterior and model inference in Python
https://acerbilab.github.io/pyvbmc/
BSD 3-Clause "New" or "Revised" License
114 stars 6 forks

pymc integration #73

Open aloctavodia opened 2 years ago

aloctavodia commented 2 years ago

Hi!

Here I have some code showing how to use PyMC to define a model and then perform inference with PyVBMC. The core functionality is mostly in the initialize_target function, which takes a PyMC model and returns a compiled Aesara function representing the joint log density of the model (Aesara is the "computational backend" of PyMC). I have included an option to return the log prior and log likelihood as separate objects, as you mentioned you may want to use that at some point.
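To illustrate the idea in miniature (this is not the PR code, and the model and function names are made up for the example): a target function can expose either the joint log density or the log prior and log likelihood separately, here written in plain NumPy/SciPy instead of a compiled Aesara graph:

```python
import numpy as np
from scipy import stats

# Toy model: y ~ Normal(mu, 1) with prior mu ~ Normal(0, 5).
y = np.array([0.5, 1.2, -0.3, 0.8])

def make_target(separate=False):
    """Return a callable log density, loosely analogous to compiling
    a PyMC model's joint log probability."""
    def log_prior(mu):
        return stats.norm.logpdf(mu, loc=0.0, scale=5.0)

    def log_likelihood(mu):
        return stats.norm.logpdf(y, loc=mu, scale=1.0).sum()

    if separate:
        return lambda mu: (log_prior(mu), log_likelihood(mu))
    return lambda mu: log_prior(mu) + log_likelihood(mu)

joint = make_target()  # joint log density, callable on a parameter value
log_prior_val, log_like_val = make_target(separate=True)(0.0)
```

The separate-terms variant is what would let PyVBMC treat the prior and likelihood differently later on, if that ever becomes useful.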

Then I have included a couple of functions to automatically compute initialization points and boundaries. You probably have better ideas on how to do this; it would be nice to have an automatic, robust initialization method. Currently I am only running one instance of VBMC, but as you mention in the documentation, ideally users should run separate "chains" starting from different points. PyMC has some logic to generate such "jittered" initial points, so maybe we can use that here too.
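The jittering idea could look roughly like this (a hedged sketch; the jitter radius and uniform rule here are illustrative choices, not PyMC's actual logic):

```python
import numpy as np

def jittered_starts(x0, n_runs, radius=0.5, seed=0):
    """Generate one starting point per VBMC run by adding uniform
    jitter around a base point x0 (rows = runs, columns = parameters)."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    return x0 + rng.uniform(-radius, radius, size=(n_runs, x0.size))

# Four jittered starting points for a 2-parameter model.
starts = jittered_starts([0.0, 1.0], n_runs=4)
```

Each row would seed one independent VBMC run, which also gives a natural way to compare runs for convergence.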

As you may know, PyMC, Stan, and others transform bounded variables to unbounded space, perform inference there, and then transform the parameters back to their original space. Could that be useful here, so you don't need to deal with hard boundaries? Also, if a model has two variables, a Normal and a HalfNormal, is it possible to pass [-np.inf, np.inf] for the Normal and [0, some_number] for the HalfNormal? From the documentation I understood that this was possible (and that [0, np.inf] was not allowed), but when I tried such a model I got an error. (Maybe the input was not in the correct form; I did not try too hard to fix it or read the code, haha.)
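For reference, the usual trick for a positive-only parameter like a HalfNormal scale is a log transform plus a log-Jacobian correction, roughly as below (a generic sketch of the technique, not PyMC's or PyVBMC's internal code):

```python
import numpy as np
from scipy import integrate, stats

def logpdf_unconstrained(z, scale=1.0):
    """Log density of x ~ HalfNormal(scale) evaluated at z = log(x),
    with the log-Jacobian term log|dx/dz| = z added in."""
    x = np.exp(z)
    return stats.halfnorm.logpdf(x, scale=scale) + z

# The transformed density is a proper density on the whole real line,
# so an inference method never has to handle the hard boundary at 0.
total, _ = integrate.quad(lambda z: np.exp(logpdf_unconstrained(z)),
                          -20.0, 10.0)
```

After inference in z-space, samples are mapped back with x = exp(z).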

There is also a function to return an ArviZ InferenceData object. Clearly this approach is a little restrictive, as we have to pick the number of samples in the InferenceData object before inference, but as a first step maybe it is not that bad.
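For context on why the sample count must be fixed up front: ArviZ's from_dict constructor expects posterior arrays already shaped (chain, draw, *shape), so the variational draws have to be materialized before the object is built. A minimal shape sketch in plain NumPy (the draws here are random placeholders, not real PyVBMC output):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000  # has to be chosen before building the InferenceData

# Stand-in for draws from the VBMC variational posterior.
mu_samples = rng.normal(size=n_samples)

# ArviZ convention: a leading chain dimension; here a single "chain"
# of variational draws, giving shape (1, n_samples).
posterior = {"mu": mu_samples[np.newaxis, :]}
```

Passing this dictionary to az.from_dict(posterior=posterior) would then produce the InferenceData object.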

Finally, there is a notebook with 3 examples, mostly to check that things "seem to be working"...

As expected, there are many things that could be changed at this point, from the API to the internals, so I am not sure how you want to proceed. Maybe you want to review the code until it is acceptable for merging, or you may prefer to take it as a general guide and work yourselves on a version that better suits your needs. I am totally fine with both alternatives.

pipme commented 2 years ago

Looks nice to me! Thx.

lacerbi commented 2 years ago

Thanks for the PR!

The approach with an encapsulated function proposed in this PR is nice as it streamlines things, but it could cause issues down the line; for example, letting the bounds be determined stochastically could be a problem.

A bigger question is how much should be encapsulated in the function, as opposed to simply giving the user a handle to a target log density (and possibly some recommendations for the bounds), which the user can then use just like any other user-defined target.

Typically, I would prefer the handle approach, as it is clear to the user (and to us) what's going on. Still, this gives us a very good starting point to work on!

@Bobby-Huggins: We should discuss this at some point.

aloctavodia commented 2 years ago

You are welcome.

This approach is based on what a user of PyMC (and maybe other PPLs) expects from an inference method. Those users expect to define a model; inference then proceeds automatically, including the initialization of the sampler and the tuning of hyper-parameters. Tweaking the hyper-parameters or initialization is only needed if diagnostics show problems. Given these expectations, asking the user to set up bounds sounds weird.

I guess you have better ideas, but I'll leave some comments here in case you find them useful. In principle, bounds could be defined deterministically from the distributions used in the model (assuming you allow a mix of finite and infinite bounds). Could VBMC be modified to work in the unbounded space? Could stochastic bounds and stochastic initialization be used to start different "chains"/runs, so you can assess convergence?

BTW, feel free to close this PR

lacerbi commented 2 years ago

> This approach is based on what a user of PyMC (and maybe other PPLs) expects from an inference method.

Fair point. I was thinking more of a PyVBMC user who wants to define a model through the machinery of existing PPLs, rather than vice versa; I had not thought much about the converse. You are right that the expectations are different.

> Those users expect to define a model; inference then proceeds automatically, including the initialization of the sampler and the tuning of hyper-parameters. Tweaking the hyper-parameters or initialization is only needed if diagnostics show problems. Given these expectations, asking the user to set up bounds sounds weird.

Not sure why. The hard bounds are (if anything) defined by the model: e.g., probabilities are defined from 0 to 1, etc. I know that PyMC or Stan then transforms everything into an unbounded space (which is what PyVBMC does too), but that's an algorithmic step.

Plausible bounds can be set up deterministically from the priors so we don't really need to ask the user.
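One concrete way to set plausible bounds deterministically from the priors, sketched here with SciPy frozen distributions (the 1%/99% quantile levels are an arbitrary illustrative choice, not PyVBMC's actual rule):

```python
import numpy as np
from scipy import stats

def plausible_bounds(prior, lo=0.01, hi=0.99):
    """Derive plausible lower/upper bounds from prior quantiles."""
    return prior.ppf(lo), prior.ppf(hi)

# Normal(0, 5) prior: a symmetric plausible range around zero.
plb, pub = plausible_bounds(stats.norm(0.0, 5.0))

# HalfNormal prior: a strictly positive plausible range.
plb_half, pub_half = plausible_bounds(stats.halfnorm())
```

The same quantile rule handles finite and infinite hard bounds uniformly, since plausible bounds from non-degenerate prior quantiles are always finite.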

> I guess you have better ideas, but I'll leave some comments here in case you find them useful. In principle, bounds could be defined deterministically from the distributions used in the model (assuming you allow a mix of finite and infinite bounds).

Oh, yes, that's exactly how I was thinking of doing it. :)

> Could VBMC be modified to work in the unbounded space?

No need to modify it, VBMC already supports bounded and unbounded parameters.

> Could stochastic bounds and stochastic initialization be used to start different "chains"/runs, so you can assess convergence?

Yes, that's already part of the basic recommendations for VBMC. A few comments:

> BTW, feel free to close this PR

I'll leave it open for now as a reminder for us, as it is definitely something we want to include in one way or another, after we have fixed a bunch of other stuff. Thanks again for the input!