Cf. LogDensity.jl - Githubissues

phipsgabler commented 3 years ago

A couple of days ago, I created LogDensity.jl, with the intention of using it in AbstractPPL (the future abstract interface for DynamicPPL, Turing's compiler part), and offer a common logdensity interface for other PPLs and "measure-like" things. (LogDensityProblems is too specific for that.)

Obviously I haven't done my homework and was made aware of this package only later :D Anyway, I think we do have a lot of common goals, so I would be ready to join efforts.

One complication we have is the question of how to design an interface that bridges over to MeasureTheory.jl, but still avoid burdening non-measure-theoretic packages -- see this issue.

I also see you have already discussed a bit the "currying question" in #1; I have picked that up in our discussion, because I think it is an important design point.

What are the down-stream usages and compatibility goals for DensityInterface?

oschulz commented 3 years ago

Hi @phipsgabler ,

I would be ready to join efforts. What are the down-stream usages and compatibility goals for DensityInterface?

You're very welcome to join in! My plan was to propose moving DensityInterface.jl to JuliaStats (or possible JuliaMath) soon to make it a real community-wide interface, similar to what I and @devmotion recently did with InverseFunctions.jl and ChangesOfVariables.jl. I still have to add a generic interface test function, then my plan was to submit a PR to Distributions.jl to make distributions support the density API.

intention of using it in AbstractPPL (the future abstract interface for DynamicPPL, Turing's compiler part),

Yes, please! :-) BAT.jl (master branch) already supports the density interface, and I always had intended to propose supporting Turing models as well.

I also see you have already discussed a bit the "currying question"

So I take it you're in favor of "currying"? I do wanted to have it not just because of convenience reasons, but because this way logdensityof can return the original closure/function. for densities that originate from log-density-closures/functions.

oschulz commented 3 years ago

CC @cscherrer

oschulz commented 3 years ago

CC @devmotion

cscherrer commented 3 years ago

Thanks for the ping @oschulz :)

Have you been tracking MeasureTheory.jl at all? It's important for that work that we have a three-argument form available for logdensity, defined in terms of MeasureBase.basemeasure.

oschulz commented 3 years ago

how to design an interface that bridges over to MeasureTheory.jl, but still avoid burdening non-measure-theoretic packages

I'm not an expert on measure theory, but from what I could gather from phipsgabler/LogDensity.jl#1 the question is how to realize a three-argument logdensity function without the "density interface"-package depending on the "measure theory interface"-package?

How about one of these options:

Give the three-argument logdensity function another name - I guess it can be seen as something different, semantically? Then we could have a separate lightweight "measure theory interface"-package without depending on it.
Not all use cases (e.g. Distributions) that want to define densities want to deal with the depths of measure theory - but I guess applications that do wouldn't mind having the two-argument version around as well? In that case, the "measure theory interface"-package could depend on the "density interface"-package and be officially licensed to define the three-argument version. It's not type piracy if it's officially sanctioned. :-)

@phipsgabler, @cscherrer and @devmotion, what do you think?

devmotion commented 3 years ago

I think the easiest way to converge to a common API would be if MeasureTheory switches to two arguments as well, at least for logdensityof. The transition could be eased by keeping the three argument logic in an (internal?) non-overloaded function, without extending logdensityof (or however the generic log density function would be called). This could be achieved by defining (I take DensityInterface as an example here, but it doesn't matter really) something like

struct MeasureTheoryDensity{M,N}
    mu::M
    nu::N
end

DensityInterface.logdensityof(d::MeasureTheoryDensity, x) = MeasureTheory.logdensity(d.mu, d.nu, x)

I think this option would have the advantage that there is a clean and clear API - the function always takes a density and a point at which it is evaluated. Additionally, there are no type piracy issues as there is no three argument version owned or created in another package.

oschulz commented 3 years ago

I did look into phipsgabler/LogDensity.jl#1 a bit deeper - I like @devmotion's suggestion of combining "entity" and measure in the first argument for measure theory use cases. It solves the type-ownership issue and keeps everything to two arguments. And people not used to thinking in terms of measures (even if they should) can just define logdensityof(a, x) and be done with it. :-)

oschulz commented 3 years ago

Me and @cscherrer had a chat on Zulip - it seems that a lot of the confusion in phipsgabler/LogDensity.jl#1 was indeed due to terminology.

MeasureBase.logdensity means "evaluate the log of the Radon-Nikodym derivative at x" - resp. more general "find the density of this thing and get me it's log-value at x", which does of course not make sense for something that's already a density.

But DensityInterface.logdensityof(d, x) means "get me the log of this (seen as a) density at this point" and DensityInterface.logdensityof(d) means "get me a function that will return the log of this density, given a point". I guess that this is what you were going for as well, @phipsgabler?

Then there's also the way we use terms like prior, likelihood and posterior, and so on - e.g. whether we mean a posterior as a measure, or the posterior density, and so on. From my (as a BAT.jl developer) point of view, users just want to have single (non-normalized) posterior object, and they want to say "sample this", "get me the log value of it (seen as a density, not a measure)", "integrate it, using it's base measure", and so on. Users don't want to have the posterior as a measure and the posterior density as different objects. And they want to be able to specify a likelihood by defining how "get me the density value of it a point x" is to be computed for it. So for statistics/"two-argument" use cases, we want a single function that will return the log-value of a prior, likelihood, posterior, distribution and anything else that can be interpreted to be a kind of a density.

I hope I summarize our conclusions up correctly with: MeasureBase.logdensity and DensityInterface.logdensityof == LogDensity.logdensity have different semantics and should be different functions.

To bridge things and support measure theory, MeasureBase could define

function DensityInterface.logdensityof(RadonNikodymDerivative(mu,nu), x)
    return MeasureBase.logdensity(mu, nu, x)
end

which I think is the essence of @devmotion's proposal? So there would be no problem with type piracy and no need for a three-argument function in DensityInterface. However, that's kinda awkward with function names that are so similar. So:

How about we change the name of logdensityof/logdensity here in DensityInterface, since we now have empirical evidence that it can be misunderstood. I chose logdensityof because LogDensityProblems.jl already exports logdensity (LogDensityProblems.jl is too heavy and opinionated to act as a lightweight abstract interface package, but my proposal create one and "donate" logdensity to it was unsuccessful). But maybe it's just the wrong name, since it really can be misread as "first get the the density of this and then the log of it" (implying it's currently not a density).

I'm open to changing the name, question is, to what? logvalueofdensity is a bit too long for my taste (users will have to type if often, and it'll be all over the kind of code I deal with). logof, on the other hand, is probably too generic for a package that defines an API for densities.

Maybe logofdensity? @phipsgabler and @devmotion I'd value your input. I think we really need a lightweight API package that defines this kind of two-argument function (with a one-argument "curried" variant, I'd plead), and I don't want DensityInterface.jl to end up as an one-man show - I would like to move to it JuliaStats (or JuliaMath) and it should be designed in a way that will make it acceptable as a dependency in Distributions, and Turing/TTL and so on.

devmotion commented 3 years ago

I don't think terminology was an issue in https://github.com/phipsgabler/LogDensity.jl/issues/1. It was clear to me, and I had the impression to everyone else as well (but maybe this impression was wrong). I just left the discussion of this non-critical issue since it was too redundant and I assumed it would be better to spend my limited time on something else.

In general, I don't think MeasureBase.logdensity and DensityInterface.logdensityof are two semantically different functions. In the former, the density is just provided implicitly by two measures but they could equally well just be provided as one argument. Maybe this is not completely clear from the proposal but implementationwise there is no need for a separate internal function, I just suggested this as a quick way of implementing the API without changing anything else in MeasureTheory.

Namewise, I think logdensityof or logdensity (if one is not concerned about LogDensityProblems) is still the best name. None of the alternatives seem better to me.

Maybe the misunderstanding and confusion here arises from the incorrect assumption that density in logdensity(of)(density, x) has to be a density function that just has to be evaluated at x. While this is possible, in practice often it will just be something that represents a density or has a density. Eg., it could be a Distributions.Distribution object that has and represents a density with respect to an implicit base measure, a MeasureTheory.AbstractMeasure which has and represents a density with respect to its basemeasure, or it could be a RadonNikodymDerivative that represents the Radon-Nikodym derivative of two measures (but is not just a readily available function of x).

I think the documentation of DensityInterface is already quite good and explains the interface in a clear way. But maybe this point (density objects) could be clarified even more by adding some concrete examples, e.g., with Distributions or MeasureTheory.

oschulz commented 3 years ago

In general, I don't think MeasureBase.logdensity and DensityInterface.logdensityof are two semantically different functions. In the former, the density is just provided implicitly by two measures but they could equally well just be provided as one argument.

Wouldn't a natural name for MeasureBase.logdensity be something like log_radon_nikodym or so (since it does something quite specific)? I can understand that @cscherrer hesitates to change the API of MeasureBase, I mean just in principle. That's why I thought they should be seen as different functions, with logdensityof(RadonNikodymDerivative(mu,nu), x) == log_radon_nikodym(mu, nu, x) (again, I don't mean to imply that things have to be renamed in MeasureBase).

I fully agree that DensityInterface.logdensityof(d::MeasureTheoryDensity, x) is the cleanest way of handling things for measure theory, also from a dispatch point of view. But again, I can understand that @cscherrer doesn't want to completely rewrite MeasureTheory and MeasureBase.

Namewise, I think logdensityof or logdensity (if one is not concerned about LogDensityProblems) is still the best name. None of the alternatives seem better to me.

Fine with me. @cscherrer, since we were able to clarify the confusion regarding semantics, I hope you wouldn't mind too much if we stick with the function name?

Regarding logdensity() vs. logdensityof(), I have no strong preference. I don't think we need to worry about LogDensityProblems too much, it's not a dependency of mainy packages, and can always choose to support DensityInterface. On the other hand, if we also want a function to access the non-log density value, densityof() would work, but just density() seems too generic, would conflict with lot's of variable names, and Plots.jl already exports density().

Also, using different function names, MeasureBase.logdensity and DensityInterface.logdensityof, may be safest for now?

So, let me propose this (I'm absolutely open to other solutions):

We stick with DensityInterface.logdensityof(density, x) and add DensityInterface.densityof(density, x) with a default implementation densityof(density, x) = exp(logdensityof(density, x)) (@devmotion's proposal in #1, which I really like).
MeasureBase take on DensityInterface as a dependency. It keeps MeasureBase.logdensity, at least for now, and defines DensityInterface.logdensityof(ATypeThatMeasureBaseOwns(mu,nu), x) = MeasureBase.logdensity(mu, nu, x).

Would that be fine with you guys, @devmotion, @cscherrer and @phipsgabler?

If so, I would

Add a test utility test_density_interface (similar to InverseFunctions.test_inverse and ChangesOfVariables.test_with_logabsdet_jacobian)
Sand of a few rough spots (remove the unnecesary __precompile__, add doctests, etc., again similar to what @devmotion and me did in InverseFunctions and ChangesOfVariables).
Tag a new release.
Propose moving DensityInferface.jl to JuliaStats or JuliaMath

Then we could do PR's on Distributions (see #1), Turing/PPL, etc.

cscherrer commented 3 years ago

Thanks @oschulz .

I can appreciate that you want to move quickly with this. As I mentioned in our Zulip chat, I've spent the past year planning and building MeasureTheory, and hoping to get its interface into wider use. So I hope you can understand that these recent discussions have been pretty jarring.

MeasureTheory tries to be both rigorous and very flexible, which requires being very careful about semantics. So my concern is that something that might work well for Distributions might lead to problems when used with MeasureTheory. This risk is compounded by MT having very few contributors relative to Distributions or Turing.

I think it's important that the ecosystem as a whole is as consistent as possible. MT is likely to have the most stringent requirements and the fewest resources, so I hope we can try this out in a MeasureTheory branch before committing to a design with a Distributions or Turing PR.

Given all of that, this sounds fine to me as something to try.

devmotion commented 3 years ago

Would that be fine with you guys

Seems reasonable to me, I like the suggestions :slightly_smiling_face: I also think that densityof (or rather not using and exporting density since it is too generic) is a good argument for using logdensityof but not logdensity.

Propose moving DensityInferface.jl to JuliaStats or JuliaMath

I think it would fit in JuliaStats a bit better but I'm fine with any of these.

oschulz commented 3 years ago

Ok, I'll go ahead with the steps proposed above then.

building MeasureTheory, and hoping to get its interface into wider use.

Sure, that would be great of course! Though I'm sure there will be lot's of "simple" use cases left that won't use MeasureTheory (see below), but that doesn't mean there can't be compatibility.

[...] something that might work well for Distributions might lead to problems when used with MeasureTheory [...] so I hope we can try this out in a MeasureTheory branch before committing to a design with a Distributions or Turing PR

I do get your point about measure theory becoming a more widespread interface, but a kind of two-argument log-of-density function is very widely used across the statistics ecosystem, and has been for years, I don't think that will change fundamentally (I don't think users would want it to). This is also what frameworks like BAT.jl have been developed around for several years now, and from what I understand, Turing/PPL has the same needs. We've had this kind of two-argument function in various places for a a long time now, we just haven't had a common name and interface for it, limiting interoperability - hence the need for DensityInterface.jl (focused on densities, not measures).

We won't get to a point where all use cases use a three-argument Radon-Nikodym derivative approach, many users and use cases just aren't as strict about semantics as MeasureTheory. I think it's statistics/inference and similar use cases primarily need a two-argument "give-me-the-log-of-this-density" function.

I don't think there's any risk to measure theory, though: Distributions & friends supporting a two-argument logdensityof doesn't preclude support for a three-argument "evaluate the log of the Radon-Nikodym derivative at x" function after all (provided it comes with a very lightweight dependency, I guess). And I believe we can have good interoperability (e.g. via a RadonNikodymDerivative as the first argument of logdensityof, as proposed).

So now that we have consensus on the name of the two-argument function (logdensityof), I think we should keep the momentum and can ahead without risking impacting MeasureTheory. But it'll still take a few days (at least) to get everything in motion, so maybe you can do the tests you proposed on a branch of MeasureTheory in the mean time, just to make sure that all really will be fine on your end? DensityInterface is registered and the improvements I listed above won't change logdensityof or it's behavior at all, so you can go ahead without any looming API breaks.

cscherrer commented 3 years ago

I do get your point about measure theory becoming a more widespread interface, but a kind of two-argument log-of-density function is very widely used across the statistics ecosystem, and has been for years, I don't think that will change fundamentally (I don't think users would want it to).

Are you talking about logpdf? I agree this should be left alone.

We won't get to a point where all use cases use a three-argument Radon-Nikodym derivative approach, many users and use cases just aren't as strict about semantics as MeasureTheory.

I agree! It was never the intent to get all users to rely explicitly on the three-argument approach. But "density" is always a relative thing, whether or not users care to think about that. It's more about adding computational infrastructure that can help without interfering, and that can be used explicitly when people want to.

I don't think there's any risk to measure theory, though

The risk I see is just the potential for confusion. Two very similar exported names that mean almost-but-not-quite the same thing. It's hard to see what long-term consequences this might lead to, but it's easy to imagine it being at least an ongoing point of confusion and common errors.

oschulz commented 3 years ago

Are you talking about logpdf? I agree this should be left alone.

Yes - and DensityInterface.logdensityof will just be a fairly mild generalization of it (to support non-normalized entities, etc.).

Two very similar exported names that mean almost-but-not-quite the same thing.

True - but for DensityInterface.logdensityof we haven't manage to come up with a better one, and we have tried. :-)

But what about renaming MeasureBase.logdensity to something like log_radon_nikodym or log_rd_derivative or so? I know, I keep coming back with that, and I certainly don't want to be appear pushy - but it might be more intuitive to users than logdensity. A simple rename shouldn't risk breaking anything in the measure theory packages, right?

oschulz commented 3 years ago

I think it would fit in JuliaStats a bit better but I'm fine with any of these.

JuliaStats sounds good to me, let's go with that, then.

oschulz commented 3 years ago

I also think that densityof (or rather not using and exporting density since it is too generic) is a good argument for using logdensityof but not logdensity

@devmotion I've had one more idea of a possible name: How would you feel about using densityval and logdensityval (instead of densityof and logdensityof). It would make it clearer that we're asking for the (log-)value of a density, instead of asking for a density. It might be "smoother" then writing definitions like logdensityval(d::MyLikelihood) = .... I'm fine staying with logdensityof though if you prefer that.

devmotion commented 3 years ago

Hmm I think I prefer logdensityof. IMO it is less clear, due to the abbreviation and since it seems sufficiently clear from the definition that it returns a value and hence feels redundant (additionally, many/most functions return values but don't indicate it explicitly in the function name). Moreover, logdensityval seems a slightly incorrect/surprising name for the curried version since there not a scalar value but a lo density function is returned.

I imagine that in many use cases logdensityof is called with objects that are not explicitly defined as densities but only have some density that is evaluated. Eg, in Distributions Distributions are not defined as densities but as probability distributions or in AbstractPPL prob models are not defined as densities. In all these cases logdensityof seems to describe the purpose of the function quite accurately (logdensity as well).

So I think logdensityof or logdensity would be better. Maybe in the end it's not too bad that density is a generic name and has to be qualified when exported by other packages? The package is called DensityInterface so it seems reasonable to define a density function 🙂 There are name clashes already without DensityInterface.density (https://github.com/JuliaPlots/Makie.jl/issues/925) which maybe can/should be solved by using different names in these packages. And maybe some packages that define a density could even implement DensityInterface.

oschulz commented 3 years ago

Ah, yes, you're right, logdensityval isn't so nice with the "curried" version. Ok, logdensityof it is.

Maybe we can add a few word regarding relation to MeasureBase/MeasureTheory in the docs? We can state that in cases where the first argument is not already a density, the semantics of [log]densityof(something, x) is to yield the density with respect to an implied base measure. @cscherrer , if I draft something, could you take a look at it (I'm not an expert on measure theory) to make sure it's mathematically sound?

cscherrer commented 3 years ago

Sounds good, thanks @oschulz

phipsgabler commented 3 years ago

Sorry for being late, I'm recovering from a bad cold.

It appreas you guys have already come up with a working compromise -- great! I'm going to read up on the current state of the discussions.

And I'll get rid of LogDensity.jl then, as it is now superfluous.

oschulz commented 3 years ago

Sorry for being late, I'm recovering from a bad cold.

My very best wishes for a speedy recovery!

oschulz commented 3 years ago

Closed via #4.

JuliaMath / DensityInterface.jl

Cf. LogDensity.jl #3