Closed TTitscher closed 2 years ago
Why should we try to "hide" the individual settings required for a specific sampler from the user? These relate to the input — not to the inference problem, but to the solver options such as burn-in, number of chains, etc. I don't see a good way to define different levels of accuracy that work for any problem: e.g., the choice of the burn-in phase depends on the dimension of the variables, the priors themselves, and the problem at hand, i.e. how nonlinear the model is. As such, I would create all the solvers in your example individually (with the relevant settings defined by the user) and only then loop over them afterwards. The settings defined in the constructor might potentially be overwritten in the run script as shown, though I'm not sure this is really necessary. The output, however, seems standard for the sampling procedures; for the variational methods we probably have to think about a different format, since we actually sample (or optimize, as in our VB) the hyperparameters describing the posterior instead of sampling from the posterior directly.
First of all, I have no problems in abandoning the idea.
I'd like to stress that nothing is explicitly hidden and the user is free to change the parameters. But to follow that wording: I assume that dynesty 'hides' the `vol_dec` argument, or emcee 'hides' the `thin_by` argument, to not expose all options to the user, for convenience. And in probeye, we also provide defaults. So my proposal is more like a collection of reasonable default arguments that hopefully encourage users to try different solvers.
If your concerns regard the implementation, we can put those args in a `staticmethod`, so within `EmceeSolver` could be a

```python
@staticmethod
def defaults(level, ndim):
    if level < 1:
        return {"nsteps": 100 * ndim, "nburn": 25 * ndim, ...}  # or ndim**2 ?
    if level < 10:
        return {"nsteps": 1000 * ndim**2, ...}
```

that needs to be called explicitly:

```python
solver = EmceeSolver(...).run(EmceeSolver.defaults(10, ndim))
```
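To make that concrete, here is a minimal, runnable sketch of the pattern; the class body, the `run()` signature, and all numbers are placeholders, not probeye's actual API:

```python
class EmceeSolver:
    """Toy stand-in illustrating the proposed defaults() pattern."""

    @staticmethod
    def defaults(level, ndim):
        # Cheap settings for a quick smoke test at low levels ...
        if level < 1:
            return {"nsteps": 100 * ndim, "nburn": 25 * ndim}
        # ... and more expensive settings otherwise.
        return {"nsteps": 1000 * ndim**2, "nburn": 250 * ndim}

    def run(self, nsteps, nburn):
        # A real solver would start the emcee sampler here; we just
        # echo the settings so the call pattern is visible.
        return {"nsteps": nsteps, "nburn": nburn}


settings = EmceeSolver.defaults(level=10, ndim=4)
result = EmceeSolver().run(**settings)
```

The point of the `staticmethod` is that the defaults are discoverable and explicit: nothing happens unless the user asks for them, and the returned dict can be edited before being passed to `run`.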
Don't get me wrong; in general I think it is good to provide default values that the user can change. However, I'm not fully sure how we could set them reasonably (from my experience, that is always trial and error).
To summarize the conversation with @joergfunger: the main method of each solver should be called `run` (instead of `run_scipy`, `run_emcee`, ...). In the last version (3.0.0) I changed the solver interface so that they all have a `run` method to call their main functionality.
Problem: Inhomogeneous inputs/outputs
It is hard to come up with a common interface for our solver classes, as they vary both in output and in input.
As most of the outputs can be resampled to unweighted samples (and stored in `arviz.InferenceData`), that issue is almost solved. And the expert user has access to the original results via the `Solver.raw_results` member.

Proposal: Common accuracy levels
For the input, we could think of some accuracy levels similar to log levels:
The exact values need to be determined in the solver and be somehow well defined.
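As a sketch of what "well defined" could mean in practice: each solver could own a mapping from common level names to its own settings. All names and numbers below are illustrative, not probeye's actual values:

```python
# Illustrative mapping from a common accuracy level to solver-specific
# settings; each solver defines its own interpretation of the levels.
EMCEE_LEVELS = {
    "low":    {"nsteps": 1_000,  "nburn": 200},
    "medium": {"nsteps": 10_000, "nburn": 2_000},
    "high":   {"nsteps": 50_000, "nburn": 10_000},
}

DYNESTY_LEVELS = {
    "low":    {"nlive": 250,  "dlogz": 1.0},
    "medium": {"nlive": 500,  "dlogz": 0.1},
    "high":   {"nlive": 1000, "dlogz": 0.01},
}


def settings_for(solver_levels, level="medium"):
    """Look up the settings a solver associates with a common level."""
    return solver_levels[level]
```

The same level name thus means something different per solver (steps and burn-in for an MCMC sampler, live points and a stopping tolerance for a nested sampler), which is exactly why the values have to be determined in the solver itself.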
Possible interfaces
An interface would then look like
and the solvers would be truly interchangeable and remain customizable:
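Sketched with toy classes (everything beyond the common `run`/`defaults` pair is made up for illustration), the interchangeability could look like this:

```python
class ScipySolver:
    @staticmethod
    def defaults(level, ndim):
        return {"maxiter": 100 * level * ndim}

    def run(self, maxiter):
        return f"optimized with maxiter={maxiter}"


class EmceeSolver:
    @staticmethod
    def defaults(level, ndim):
        return {"nsteps": 100 * level * ndim}

    def run(self, nsteps):
        return f"sampled with nsteps={nsteps}"


# Because every solver exposes the same run()/defaults() pair,
# they are interchangeable in a loop ...
results = []
for cls in (ScipySolver, EmceeSolver):
    results.append(cls().run(**cls.defaults(level=2, ndim=3)))

# ... and remain customizable by overriding individual settings:
custom = {**EmceeSolver.defaults(level=2, ndim=3), "nsteps": 5000}
```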
Discussion
Should the levels be named `high`, `medium`, `low`?