Closed RaphaelS1 closed 5 years ago
Yes, this is an interesting question - sorry that we haven't had the time yesterday to discuss this through.
I know of no precedent where this would be solved well, but it probably exists since parameterization is a general problem in scientific computing.
Here's one suggestion (not necessarily the best one):
a method `getSettableParamsFrom` that takes as an argument a named list of parameter-name/value pairs. The names can differ, but the return value is always a named list of parameter-name/value pairs in which the parameters are the "default" parameters. For example, `getSettableParamsFrom(mu = 1, std = 2)` would return `list(mu = 1, var = 4)`.
Arguments should always be a full specification of the parameter set. `setparams` is always called with default parameters. If `setparams` is called with "non-default" parameters, it calls `getSettableParamsFrom` first.

There's one downside to the above: it does not allow changing sub-sets of "non-default" parameters that map bijectively onto a subset of the "default" parameters, e.g., changing the standard deviation, or the variance, in isolation.
This could be added in, in that `getSettableParams` would, on occasion, return a list with only some default parameters rather than all of them.
However, this cannot be generalized to a case where there is no subset correspondence, as there are cases where setting parameters depends on a choice of "what to leave fixed". Consider as an example the case where you have 10 parameters a0, a1, ..., a9, all of them non-negative and summing to one. This could also be parameterized by sums of any pairs of the ai. It is ill-defined to say "change [a0 + a1] to 0.5".
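To make the ill-definedness concrete, here is a minimal R sketch (the vector names are illustrative only): two different full parameter vectors both satisfy the constraint "a0 + a1 = 0.5", so the constraint alone does not determine a unique update.

```r
# Parameters a0..a9: non-negative, summing to one.
a_ref <- rep(0.1, 10)                    # a reference parameter vector

# Two distinct full parameter vectors, both with a0 + a1 = 0.5:
b1 <- c(0.25, 0.25, rep(0.5 / 8, 8))
b2 <- c(0.40, 0.10, rep(0.5 / 8, 8))

stopifnot(
  abs(sum(b1) - 1) < 1e-12,              # both valid points of the simplex
  abs(sum(b2) - 1) < 1e-12,
  abs(b1[1] + b1[2] - 0.5) < 1e-12,      # both satisfy the constraint
  abs(b2[1] + b2[2] - 0.5) < 1e-12
)
```

Without fixing what else stays constant, "set a0 + a1 to 0.5" could mean either vector.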
Okay, so if I understand this correctly, the following would occur: `Normal$new(mean = 0, sd = 1)` calls `private$.getSettableParamsFrom(list(mean = 0, sd = 1))`. This function internally runs something like:

```r
sd2var <- function(sd) return(sd^2)
var <- sd2var(1)
return(list(mean = mean, var = var))
```
My questions then:

1. `getSettableParamsFrom` needs something like `sd2var`, `prec2var`, `var2sd`, `var2prec`, etc., if we want to show the values of each in the `ParameterSet`.
2. Do we still need the `settable` parameter? If the user can pass any parameter to `setParameterValue` and `getSettableParamsFrom` will adjust accordingly, then we don't need to restrict which parameters the user can set.

On 1: something like `lm` with different arguments. This may be a bit cumbersome with S3 (a bit of manual coding), but is doable with S4 or R6, I believe.

On 2: no, I think the information about which parameters are settable is still valuable, since it defines which parameters the interface passes to the outside - e.g., if you use the distribution in inference, tuning, etc. It's a "CS-modelling" statement about the scientific type, rather than pure interface functionality (in which case it would be obsolete, as you conclude).
But then perhaps "settable" is the wrong term. To me, "settable" implies that the user isn't able to call `setParameterValue(param)` if `param` is not settable. But in the above scenario they can call `setParameterValue` on any parameter (even one not used in construction).
Not sure. However, this question highlights an important point, namely two distinct cases: (a) the user interacts with the distribution, and (b) some other machinery, e.g., an estimator or learning machine, accesses the distribution.
In case (a), I think one can make a case that it should behave as a user naive to the interface but expert in the methodology would expect. That is, it would set all the parameters while the "settables" remain as they are. In case (b), you need some flag telling a learning machine in which parameters it should consider the distribution as parameterized.
However, as all the formulae are based on the default parameterisation (as per your first argument), arguably it doesn't matter what a learning machine does; "settable" refers only to "can this parameter value be updated?", which is not the same as "can this parameter value be estimated?", i.e. fittable.
Regarding (b), there can be a difference in what an estimator/learner does algorithmically with a distribution, depending on the parameterization. Example: gradient-descent based MLE for fitting parameters, e.g., whether you do this in terms of variance, precision, or log-variance.
Understood. My concern is as follows: whilst I agree the current parameterisation method is messy and could benefit from abstracting the `ParameterSet` out of the constructor, there is a physical limit to how much we can abstract. More precisely, due to R6 limitations, a distribution's `ParameterSet` has to be defined in the constructor and not as a public variable. This means that the same constructor has to contain a set of conditionals identifying which parameters are "settable" based on how the user constructed the distribution (again this has to be the case, as R6 doesn't handle overloading of constructors). Therefore I'm not sure a `getSettableParams` method adds anything, as we still have the same problem.
It would be a generic - in construction you set once which parameters are settable, these may not be the "default" ones. Internally, this is always converted to a representation in terms of the "default" ones.
The added value is that this design allows the same object to behave, seen from externally, like "distribution with settable parameter set [theta]" for different theta, such as mean/variance, or mean/precision.
How it behaves (i.e., which theta) would be determined in construction. Internally, you would always have conversion to the default parameter set.
Okay... so semi-pseudo-code for the above is as follows.

In the Normal class we have the private method:

```r
.getSettableParams <- function(mean, var, prec, sd) {
  if (!is.null(var))  return(list(mean = mean, var = var))
  if (!is.null(prec)) return(list(mean = mean, var = prec^-1))
  if (!is.null(sd))   return(list(mean = mean, var = sd^2))
}
```

Then the constructor:

```r
initialize <- function(mean, var, prec, sd) {
  parameterset <- function(...) {
    # if (!is.null(var)): var = settable, prec = not_settable, etc.
  }
  setParameterValue(.getSettableParams(mean, var, prec, sd))
}
```
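For concreteness, a self-contained version of the conversion step (plain functions rather than R6 private methods, with the reference parameterisation fixed as mean/var; both are simplifying assumptions, and the helper name is hypothetical, not distr6's actual implementation):

```r
# Convert any supplied parameterization to the reference one (mean, var).
getSettableParams <- function(mean, var = NULL, prec = NULL, sd = NULL) {
  if (!is.null(var))  return(list(mean = mean, var = var))
  if (!is.null(prec)) return(list(mean = mean, var = 1 / prec))
  if (!is.null(sd))   return(list(mean = mean, var = sd^2))
  stop("one of var, prec or sd must be supplied")
}

getSettableParams(mean = 0, sd = 2)    # list(mean = 0, var = 4)
getSettableParams(mean = 1, prec = 4)  # list(mean = 1, var = 0.25)
```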
Yes - plus possible input checks, in case the user doesn't pass a mean, say, or passes multiple contradictory values for dispersion parameters.
Okay, agreed. I think `getSettableParams` should be a private method that only the distributions requiring it have (as opposed to an S3 generic).
Writing this up, I'm not sure it is actually more efficient. Currently `ParameterSet` is quite fast and efficient (aside from the constructor problem) because of the `updateFunc` argument and the `update` method that dynamically updates based on any parameters. In the suggested method above, each time a parameter is updated we have to call `getSettableParams`, which has to choose the correct functions to update the other parameters, and then all the validation checks on the given arguments (e.g. class, boundaries) are performed.

A better compromise may be to have `.getParameterSet`, which stores all the possible variants with `settable` and `updateFunc` included; this is then called as one line in the constructor, and all the checks are performed within it.
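A rough sketch of the `updateFunc` idea in base R (a plain list stands in for the real `ParameterSet` class, and the field names here are illustrative): each non-reference parameter stores a function of the reference parameter, and an `update` step recomputes them all.

```r
# Toy parameter set: var is the reference parameter; sd and prec carry
# updateFunc entries that recompute them from the reference parameter.
ps <- list(
  values = list(mean = 0, var = 1, sd = 1, prec = 1),
  updateFunc = list(
    sd   = function(v) sqrt(v$var),
    prec = function(v) 1 / v$var
  )
)

# update step: recompute every parameter that has an updateFunc.
update_params <- function(ps) {
  for (p in names(ps$updateFunc)) {
    ps$values[[p]] <- ps$updateFunc[[p]](ps$values)
  }
  ps
}

ps$values$var <- 4        # change the reference parameter...
ps <- update_params(ps)   # ...and propagate
ps$values$sd              # 2
ps$values$prec            # 0.25
```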
Hm, @PeterRuckdeschel @stamats - any good suggestions or lessons learnt from your ParamFamily design?
I have just implemented the following in the Normal distribution:

- The `.setParameterValue` method for distributions now first internally calls `.getRefParams`.
- The `setParameterValue` method for `ParameterSet` no longer checks whether a parameter is "settable" before updating, and all update functions now refer to the reference parameters only.
- There is no logical flag for the reference parameter, e.g. no "reference" column in the `ParameterSet` with TRUE/FALSE elements, but I think this is probably easy to deduce by looking at `updateFunc`.
- Finally, I added an option "verbose" so users can turn off the messages that appear in construction telling the user which parameterisation is used.
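The described flow, sketched in a few lines (the function bodies are illustrative stand-ins, not the distr6 source; the point is that whatever the user passes is first mapped to the reference parameter `var`, and the update functions then refer to the reference only):

```r
# Map a user-supplied parameter to the reference parameterization.
getRefParams <- function(lst) {
  if (!is.null(lst$sd))   return(list(var = lst$sd^2))
  if (!is.null(lst$prec)) return(list(var = 1 / lst$prec))
  lst                                  # already in reference form
}

# Setter: update reference parameters first, then recompute the rest.
setParamValue <- function(values, lst) {
  ref <- getRefParams(lst)
  values[names(ref)] <- ref
  values$sd   <- sqrt(values$var)      # update functions use reference only
  values$prec <- 1 / values$var
  values
}

v <- list(mean = 0, var = 1, sd = 1, prec = 1)
v <- setParamValue(v, list(prec = 4))  # set via a non-reference parameter
v$var   # 0.25
v$sd    # 0.5
```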
How does this sound?
Makes sense to me - let's see whether it's easy to maintain and/or easy to use.
Sorry for jumping in late -- you indeed hit a very interesting topic where a good design could really pay off later on.
I think the way you address the topic is fine. Actually, also the idea to have a default / reference parametrization which is used by all setters is a good concept.
The issue that the information given as an argument is not yet sufficient to identify the parameter could be solved as follows: starting from a (complete) parameter expressed in the reference parametrization, you could then allow changing only those aspects of the reference parametrization which are affected by the setter. Doing so, you can indeed use polymorphism / method dispatch, as is done in S3 and S4 methods, to allow setters to react to different named arguments.
Second I would suggest that you should try to allow to (optionally) bind gradient information to the setter: If the reference parametrization is expressed in terms of theta and you have a mapping tau(theta) which (in some possibly different parametrization) moves theta to a parameter theta' (in reference parametrization), for gradient descent type algorithms you might want to have information on its derivative tau'(theta), or if 2nd order algorithms are to be used, even the Hessian tau''(theta) could be helpful.
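For instance (a toy sketch, not part of distr6): a mapping from `sd` to the reference parameter `var` could carry its derivative and Hessian as R attributes, for consumption by gradient-based fitting algorithms.

```r
# tau: sd -> var = sd^2, with derivative 2*sd and Hessian 2 attached.
sd2var <- function(sd) {
  structure(sd^2, gradient = 2 * sd, hessian = 2)
}

out <- sd2var(3)
as.numeric(out)         # 9
attr(out, "gradient")   # 6
attr(out, "hessian")    # 2
```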
Some of this is covered by the concept of partial influence curves (see chap. 4 / Def. 4.2.10, Rieder (1994): Robust Asymptotic Statistics, Springer), which are meant to cover the situation where one is interested in the (smooth) aspect tau(theta), where tau is some arbitrary smooth transformation of theta with derivative tau'(theta). In distrMod, this is captured by the slot trafo in the S4 class "ParamFamParameter".
Sorry, crossing posts - this is referring to @RaphaelS1 's earlier post (pre- @PeterRuckdeschel )
Hm, good point - this is an inconsistency that may bother users in the (rare?) case that they construct a distribution "this" way and then want to change it "that" way.
Do you think we should prevent users from doing that, so both users and other parts of the interface are prevented/allowed changing parameters the same way?
I'm wondering though whether we wouldn't be introducing a source of botherance in order to remedy a subjective design inconsistency?
Just a little comment from "outside":
I think the "source of botherance" Franz mentions is worth it -- a good idea would probably be to have a clearly communicated standard setter, but besides you should also provide alternative setters, even if they come with extra efforts through checking overheads, finding the right mappings etc.
Let me try to convince you that this is not a subjective design inconsistency but will happen quite frequently, whenever you consider functions tau(theta) instead of theta itself. With this in mind, it is a natural situation, and you want to be able to deal with it.
As an example look at the situation where you want to have a recursive update procedure for aspects tau(theta), or even more concrete, you want a recursive scheme for updating parametric normal quantiles. The reference parameter space is built by theta=(mu, sigma), mu in R, sigma >0, the parametric quantile would be q=tau(theta)=mu+sigma*q0, where q0 is the (known) quantile of N(0,1), mu, sigma in the end need to be estimated from data.
For whatever reason you have noticed a shift in the quantile from q to q' and want to move your distribution accordingly.
To see what happens, it helps to linearize tau about the starting theta0=(mu0,sigma0), i.e., tau(theta)=tau(theta0)+ d/dmu tau(theta0) (mu-mu0)+d/dsigma tau(theta0) (sigma-sigma0) + ignored remainder. On first sight, as you are only interested in changes in the quantile, this only seems to affect the subspace (of the reference parametrization) induced by the difference Delta(tau(theta)) from one quantile estimate to the next one. But then, it may easily happen that the new combination [changed aspects in reference parametrization induced by Delta(tau)] x [unchanged aspects in reference parametrization] is no longer admitted in the parameter space (in our case, e.g. meaning that you cross 0 for sigma), so changes induced by Delta(tau) may trigger further changes in the kernel space of Delta(tau).
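The quantile example can be run numerically (a short sketch of the scenario above, with illustrative variable names): moving the quantile while naively holding mu fixed can push sigma out of the parameter space.

```r
# theta = (mu, sigma); aspect q = tau(theta) = mu + sigma * q0.
q0 <- qnorm(0.95)                  # known N(0,1) quantile, ~1.645
mu0 <- 0
sigma0 <- 1
q_start <- mu0 + sigma0 * q0       # current parametric quantile

# Move the quantile to q_new while (naively) holding mu fixed:
q_new <- -1
sigma_new <- (q_new - mu0) / q0
sigma_new                          # negative: sigma has left the parameter
                                   # space, so mu (a direction in the kernel
                                   # of Delta(tau)) must change as well
```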
Other natural examples with similar issues cover multivariate normals where you either have the covariance matrix as parameter or the standard deviations and the bivariate correlations, or, third, for model selection, it is not uncommon to parametrize this distribution by the inverse covariance matrix (as 0 in there indicate conditional independence in a graphical model).
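The inverse-covariance example in code form (a toy 3-dimensional covariance, chosen so the zero pattern is easy to see): a zero entry in the precision matrix indicates conditional independence in the Gaussian graphical model.

```r
# Covariance: variables 1 and 2 correlated, variable 3 independent.
Sigma <- matrix(c(2, 1, 0,
                  1, 2, 0,
                  0, 0, 1), nrow = 3, byrow = TRUE)

K <- solve(Sigma)    # inverse covariance (precision) matrix
round(K[1, 3], 12)   # 0: variables 1 and 3 conditionally independent
round(K[1, 2], 4)    # non-zero: variables 1 and 2 are dependent
```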
So all the painful features Raphael mentions, i.e., choosing the correct functions to update other parameters and then performing all the validation checks on the given arguments (e.g. class, boundaries), are not signs of bad design but arise in a natural way.
Hi @PeterRuckdeschel , your comments are very much appreciated, especially as we finalise design choices that we don't want to regret down the line, such as this!
I believe what I have now implemented covers the basics as suggested, however my concern is whether the current method is overly simple.
The method works in three parts:
the `ParameterSet` shows all possible parameterisations, which update accordingly; these are all linked by the functions `getParameterSet`, `getSettableParams`, `update` and `setParameterValue`. I believe these should aptly cover your example above, but perhaps @fkiraly you could confirm this if/when you've had time to look at the code?
I am completely fine with this -- it is both efficient and consistent. The only point I was trying to make is to allow for other setters, if needed, and that your infrastructure can cope with these alternative setters later on.
@PeterRuckdeschel I think we disagree about whether we disagree: my comment about botherances was only about whether non-settable parameters should still be settable by the user or locked to the user.
Just to be clear, I think we don't disagree on key requirements and high-level interface.
I agree with you, and @RaphaelS1 (and indeed suggested sth along these lines earlier in this thread?) that distributions should be accessible through different parameterizations, but of course the user/API needs to state clearly which parameterization to consider the distribution in.
@RaphaelS1 , where are getParameterSet, getSettableParams, etc?
> (and indeed suggested sth along these lines earlier in this thread?)

I credit you fully with this design!
`getParameterSet` is a script called getParameterSet.R with the generic/dispatch methods. `update` is in ParameterSet.R. `setParameterValue` is in ParameterSet.R, with specific distributions overloading it. `getSettableParams` is a private method in each class definition.
distr6 lets the user choose a distribution's parameterisation by allowing the user to construct the distribution using any combination of common parameters; for the Normal distribution, for example, any of mean/var, mean/sd or mean/prec.
In doing so, all other parameter combinations must be updatable. For example, when using Normal(mean, sd), the parameters `var` and `prec` remain in the parameter set and are updated automatically using the functions `sd^2` and `sd^-2`; analogously when constructing with var or prec.

The problem: currently all the parameterisations are hard-coded, with update functions, into the constructor, and the correct one is chosen depending on how the distribution was constructed. This is inefficient and somewhat messy. One alternative would be to name each `ParameterSet`, for example `NormVar`, `NormSd` and `NormPrec`, and then select the correct one in construction, although this may add an unnecessary layer of abstraction...