Open johnmyleswhite opened 11 years ago
I think we should distinguish between transforming a distribution by changing its parameters (as you've described above) and by applying a linear transform to the variate (possibly producing a non-central version of the distribution). Both are probably useful, but they're quite distinct things to want to do.
I wonder how many other transformations it would be possible or useful (or possibly useful) to implement. We probably can't get all (or even most) of the relationships from the diagram in this paper but it might be a good thing to mine for ideas.
I also wonder if it would be possible to produce a function which parameterized one distribution by another. Usually this would just produce a hideous monstrosity of a compound distribution, but there would be some cases where it would simplify nicely.
Also, I should say that as distributions are, at present, simply objects wrapping their parameters, the proposed set of combinators are not really higher-order functions - they're just functions.
Yes, that's true: these are simply functions. The interesting property they have is closure of type. I've changed the title to reflect that.
I agree that we should distinguish between simple parameter switches and more general transformations such as linear transformations. The distributions graph does have some other nice examples: one could imagine doing things like `exp(Normal(0, 1))` to produce the `LogNormal` type.
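As a rough sketch of that idea, assuming toy `Normal` and `LogNormal` structs (hypothetical stand-ins with made-up field names, not the package's actual definitions), the transformation could be written as an ordinary method on `exp`:

```julia
# Toy stand-ins for illustration only; field names are assumptions.
struct Normal
    mu::Float64
    sigma::Float64
end

struct LogNormal
    mu::Float64     # mean of log(X)
    sigma::Float64  # standard deviation of log(X)
end

# If X ~ Normal(mu, sigma), then exp(X) ~ LogNormal(mu, sigma).
Base.exp(d::Normal) = LogNormal(d.mu, d.sigma)

# The inverse transformation maps the parameters straight back.
Base.log(d::LogNormal) = Normal(d.mu, d.sigma)

d = exp(Normal(0.0, 1.0))
```

Because the result type depends only on the argument type, this kind of transformation stays friendly to type inference.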
The existing MixtureModel type is one instance in which one function is parametrized in terms of arbitrary others. As you note, things get complicated quite quickly.
One function that we may or may not want is a function which simplifies distributions by noting special cases, etc. For example, we could have `simplify(Beta(1,1))` producing `Uniform()` and so on. I'm not sure whether this is something that should happen automatically. It would probably speed things up in some cases, but it could be confusing to users, especially if they're expecting to read the parameters out of a distribution object.
I think it would be great to have a `simplify` function, but I also agree with your concern that performing automatic special case simplification will be more confusing than helpful.
It seems to me that most of the combinators we're talking about here are likely to produce weird non-standard distributions, and so auto-simplification of their results shouldn't be a big problem. I think it would probably be reasonable to simplify anything that's produced by a combinator, but to leave the type alone if it's coming out of the normal constructors. Either way, I think consistency is the most important thing - the user needs to either know exactly what they're getting, or be clearly warned that the result could be any kind of object with the right behaviour.
I realize that there's a serious issue we've forgotten to discuss here: Julia as a culture encourages all functions to return objects of a single type. This makes type inference much more efficient. For that reason, returning different types depending on the values of inputs (rather than their types) is something we should probably avoid.
That seems like a reasonable principle, and if it's generally being applied, I don't think the benefits to breaking it are great enough here to warrant doing so. I assume it's still okay that, e.g. different `truncate` methods return different types.
Is it sufficient to isolate this kind of behaviour to the `simplify` function? I think it would be good to have the facility, even if auto-simplification never occurs. Performance-critical code can simply avoid using it.
There's no problem at all with `truncate(d::Normal, l::Real, u::Real)` and `truncate(d::Gamma, l::Real, u::Real)` returning different types. That's precisely what multiple dispatch is designed to handle well. The trouble is cases like `simplify(Beta(1, 1))` returning a different type than `simplify(Beta(1, 2))`.
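To make the contrast concrete, here is a minimal sketch. All of the structs (including the `TruncatedNormal` and `TruncatedGamma` wrappers) are hypothetical toy types invented for illustration, not the package's definitions:

```julia
# Hypothetical toy types; field names are assumptions.
struct Normal; mu::Float64; sigma::Float64; end
struct Gamma; shape::Float64; scale::Float64; end
struct Beta; alpha::Float64; beta::Float64; end
struct Uniform; a::Float64; b::Float64; end

struct TruncatedNormal; d::Normal; l::Float64; u::Float64; end
struct TruncatedGamma; d::Gamma; l::Float64; u::Float64; end

# Defined fresh here, so this `truncate` is separate from the
# file-truncating function of the same name that Base exports.
# The return type is fixed by the argument *type*, so each method
# has a single concrete return type.
truncate(d::Normal, l::Real, u::Real) = TruncatedNormal(d, l, u)
truncate(d::Gamma, l::Real, u::Real) = TruncatedGamma(d, l, u)

# The return type depends on the argument *values*: the same method
# can hand back either a Uniform or a Beta.
simplify(d::Beta) = (d.alpha == 1 && d.beta == 1) ? Uniform(0.0, 1.0) : d

a = simplify(Beta(1.0, 1.0))  # a Uniform
b = simplify(Beta(1.0, 2.0))  # still a Beta
```

Both calls to `simplify` take a `Beta`, yet the results have different types, which is exactly the case where the compiler cannot pin down a single return type from the argument types alone.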
I think restricting type uncertainty to the `simplify` function is our best course of action.
@mewo2 had the really good idea of exposing a set of primitives for transforming and combining distributions. This issue is meant to expand on that suggestion and call for suggestions. Here are some basic functions on distributions that return distributions:
- `truncate`: Truncate a distribution at specified lower and upper bounds
- `shift`: Alter the location parameter of a distribution
- `scale`: Alter the scale parameter of a distribution
- `update`: Perform a conjugate Bayesian update of a distribution in response to data
- `mix`: Create a mixture model from any set of distributions

For many popular distributions, multiple dispatch can be used to make these methods very efficient.
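A minimal sketch of how a few of these primitives might look for specific distributions follows. The `Normal` and `Beta` structs and their field names are hypothetical toy types, and the Boolean-vector signature for `update` is an assumption made for illustration only:

```julia
# Hypothetical toy types; field names are assumptions.
struct Normal
    mu::Float64
    sigma::Float64
end

struct Beta
    alpha::Float64
    beta::Float64
end

# shift: move the location parameter, keep the scale parameter.
shift(d::Normal, delta::Real) = Normal(d.mu + delta, d.sigma)

# scale: multiply the scale parameter by a positive factor.
function scale(d::Normal, c::Real)
    c > 0 || throw(ArgumentError("scale factor must be positive"))
    return Normal(d.mu, c * d.sigma)
end

# update: conjugate Beta-Bernoulli update from a vector of outcomes.
function update(d::Beta, data::AbstractVector{Bool})
    successes = count(data)
    return Beta(d.alpha + successes, d.beta + length(data) - successes)
end

posterior = update(Beta(1.0, 1.0), [true, true, false])
```

Each method returns a distribution of the same type as its input, so the operations compose, and each is a constant-time parameter adjustment apart from the pass over the data in `update`.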