JuliaMath / DensityInterface.jl

Interface for mathematical/statistical densities in Julia

Improve logfuncdensity behavior and add isdensity #9

Closed oschulz closed 2 years ago

oschulz commented 2 years ago

Updated

Replaces hasdensity by DensityKind (#9)

Also changes behavior of logfuncdensity to always return a density and adds funcdensity.
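The changed behavior can be illustrated with a rough, self-contained sketch; the wrapper structs below are stand-ins for whatever `funcdensity`/`logfuncdensity` actually return, not the real DensityInterface implementation:

```julia
# Illustrative stand-ins for the wrappers funcdensity/logfuncdensity return;
# not the actual DensityInterface implementation.
struct FuncDensity{F}
    f::F
end

struct LogFuncDensity{F}
    log_f::F
end

densityof(d::FuncDensity, x) = d.f(x)
logdensityof(d::LogFuncDensity, x) = d.log_f(x)

# After this PR, both constructors always return a density object:
funcdensity(f) = FuncDensity(f)
logfuncdensity(log_f) = LogFuncDensity(log_f)

d = logfuncdensity(x -> -x^2 / 2)
logdensityof(d, 1.0)  # == -0.5
```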

codecov[bot] commented 2 years ago

Codecov Report

Merging #9 (e28096a) into master (ccade11) will not change coverage. The diff coverage is 100.00%.


@@            Coverage Diff            @@
##            master        #9   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         2           
  Lines           37        59   +22     
=========================================
+ Hits            37        59   +22     
Impacted Files Coverage Δ
src/interface.jl 100.00% <100.00%> (ø)
src/interface_test.jl 100.00% <100.00%> (ø)


oschulz commented 2 years ago

CC @devmotion , @cscherrer , @mschauer, @phipsgabler

oschulz commented 2 years ago

Closes #5 (well, eventually ...)

cscherrer commented 2 years ago

Great! Just a note, we'll need a similar PR for MeasureTheory in addition to the one for MeasureBase, to be sure this can extend naturally and all tests can pass. I'll try this out with Soss as well, that can help us check for a wider range of use cases.

devmotion commented 2 years ago

I know you have discussed the return type of hasdensity already, but what about something like IteratorEltype to cover three cases?

I like this since it would keep the API simpler and cleaner.

oschulz commented 2 years ago

NoDensity <: DensityType, HasDensity <: DensityType, IsDensity <: DensityType, IsMeasure <: DensityType

I don't like that specific version so much, because that way IsMeasure doesn't imply HasDensity, which it should. But we could use abstract types and have a hierarchy among them. This might be the right approach, compared to the Bool functions, now that we need to differentiate between different kinds of things that have a density. There could be other things that can be said to have a density but aren't densities or measures themselves. In Physics, for example, one might deal with objects that have a (e.g. mass) density, and a physical object is neither a density nor a measure itself.

Using types would make this extensible by other packages. We should think a bit about possible implications first, though.
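For concreteness, here is one possible shape of such a hierarchy; the type and function names just follow the discussion above and are hypothetical, not a released API:

```julia
# Hypothetical trait hierarchy; names follow the discussion, not a released API.
abstract type DensityKind end
abstract type NoDensity <: DensityKind end
abstract type IsOrHasDensity <: DensityKind end
abstract type HasDensity <: IsOrHasDensity end
abstract type IsDensity <: IsOrHasDensity end
abstract type IsMeasure <: IsOrHasDensity end  # a measure has a density, so it sits under IsOrHasDensity

densitykind(::Any) = NoDensity

# A physical object may have a (mass) density without being one:
struct PhysicalObject end
densitykind(::PhysicalObject) = HasDensity

densitykind(PhysicalObject()) <: IsOrHasDensity  # true
densitykind(PhysicalObject()) <: IsMeasure       # false
```

Making other packages define `densitykind` methods returning their own `DensityKind` subtypes is what would make the scheme extensible.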

cscherrer commented 2 years ago

There could be other things that can be said to have a density but aren't densities or measures themselves. In Physics, for example, one might deal with objects that have a (e.g. mass) density, and a physical object is neither a density nor a measure itself.

I don't think this is really a counterexample. mass density is a local ratio of mass to volume, both of which are measures. The physical object isn't a measure, but it has these measures associated with it, and this "density" is a proper density in the measure-theoretic sense.

oschulz commented 2 years ago

The physical object isn't a measure, but it has these measures associated with it, and this "density" is a proper density in the measure-theoretic sense.

Yes, that's what I mean - we'd say the object has a proper density, so hasdensity(object) == true, but neither isdensity(object) == true nor ismeasure(object) == true would make sense.

mschauer commented 2 years ago

With a Volume and a CountingMeasure, and having a fallback, perhaps a DynamicDensity, living in DensityInterface, we should encourage defining basemeasure if it is known. That's just useful information, which might allow me to give you a meaningful error message about why your density doesn't work for my sampler... and it's known in most cases except dynamic probabilistic programming, so define

basemeasure(::MyPPLProgram) = DynamicDensity()

and be done with it. One can wonder whether one of those makes a sensible fallback. It would be useful to provide density function wrappers which do this for you: say logfuncpdf and logfuncpmf
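A rough sketch of what such wrappers might look like; logfuncpdf/logfuncpmf are only proposed names from this comment, and the measure and wrapper types are placeholders:

```julia
# Placeholder primitive measures and a density wrapper that records its base
# measure; names are proposals from the discussion, not an existing API.
struct LebesgueMeasure end
struct CountingMeasure end

struct LogFuncDensityOver{F,M}
    log_f::F
    base::M
end

basemeasure(d::LogFuncDensityOver) = d.base
logdensityof(d::LogFuncDensityOver, x) = d.log_f(x)

# Wrappers that fix the base measure for the user:
logfuncpdf(log_f) = LogFuncDensityOver(log_f, LebesgueMeasure())
logfuncpmf(log_f) = LogFuncDensityOver(log_f, CountingMeasure())

basemeasure(logfuncpmf(k -> -float(k)))  # CountingMeasure()
```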

oschulz commented 2 years ago

With a Volume and a CountingMeasure

I would call it LebesgueMeasure or VolumeMeasure, not Volume - Volume is far too generic to be exported.

mschauer commented 2 years ago

There is a second thing I would like to add (sorry for upsetting the apple cart), but I think a signature basemeasure(d, x) is preferable, because it is conceivable that a package using the density interface has a simple rule where it can infer what type of density applies by just looking at, say, size(x) or typeof(x). You'd be free to ignore x there, of course.

oschulz commented 2 years ago

There is a second thing I would like to add (sorry for upsetting the apple cart), but I think a signature basemeasure(d, x) is preferable

I don't think that would be upsetting at all. :-)

In fact, it may be quite natural, from a software design point of view: We say that densityof(object, x) will return the density value at x. But in a multiple dispatch language, what densityof does can of course be different for different types of x. So it would seem natural that the base measure can depend on x - in theory even the actual value, not just the type - too. (Though in almost all cases it should just depend on the type of x, not the value.)

Maybe this could also help with the ProductOver / IIDDensity construct, semantically? If the base density can depend on the argument, the PDF of ProductOver could be seen to be normalized. That way it could, maybe, be an actual product distribution instead of a density with complex semantics. Here, maybe we would want the base measure to depend on both the type and size(x).
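The basemeasure(d, x) idea can be sketched like this; LebesgueMeasure and the density type are placeholders, and the point is only that the base measure may dispatch on (and inspect the shape of) x:

```julia
# Placeholder types; the point is only that the base measure can depend on x.
struct LebesgueMeasure
    ndims::Int
end

struct MyDensity end

# Typically the base measure depends only on the type/shape of x, not its value:
basemeasure(::MyDensity, x::Real) = LebesgueMeasure(1)
basemeasure(::MyDensity, x::AbstractVector{<:Real}) = LebesgueMeasure(length(x))

basemeasure(MyDensity(), 0.5)              # LebesgueMeasure(1)
basemeasure(MyDensity(), [0.1, 0.2, 0.3])  # LebesgueMeasure(3)
```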

oschulz commented 2 years ago

I have a question: Are primitive measures like VolumeMeasure singletons, or do they carry information like dimensionality? So is it VolumeMeasure() or VolumeMeasure(ndims)?

I have a feeling that this is an important design decision: I imagine that it would be useful in many cases, practically, to be able to predict things like array sizes based on a base measure, and so on. On the other hand, do we always have dimensionality information available when we want/need to create a measure, esp. a primitive one? If we go with @mschauer's suggestion of having basemeasure(o, x), I think we probably do.

I would tend towards VolumeMeasure(ndims).

devmotion commented 2 years ago

I'm sorry to crash the party here but I have serious doubts about integrating basemeasure and measures in DensityInterface. I think @phipsgabler's suggestion of a more fine-tuned trait system for densities can be useful, in particular in packages that work with explicit base measures such as MeasureTheory.

The measure-theoretical parts, however, seem to me like a reduced and more limited version of MeasureBase. I think it would be cleaner, and the interface and the set of available measures would be less limited, if packages that want to define and work with explicit base measures just implement the MeasureBase interface. Why add a reduced version of something to DensityInterface that already exists in another interface package? I would view MeasureBase basically as the measure-theoretical extension of DensityInterface with explicit base measures.

I tried to explain my concerns also in this thread.

mschauer commented 2 years ago

basemeasure(o, x) might be at least sufficient to deal with PPL dynamism like Gen.jl, where each random variable has a named address/site and x would be a corresponding named tuple (correct me if I am wrong about Gen here)

cscherrer commented 2 years ago

Yes, that's what I mean - we'd say the object has a proper density, so hasdensity(object) == true, but neither isdensity(object) == true nor ismeasure(object) == true would make sense.

Isn't this just a matter of abbreviated language? The object may also have a charge or heat density. At some point, don't you end up saying "mass per unit volume"?


Ok, I have a proposal:

  1. A new package, MeasureInterface, depending on DensityInterface, and exporting basemeasure and some other very minimal primitives
  2. A new DistributionMeasures, depending on MeasureInterface, DensityInterface, and Distributions. This can mostly focus on defining basemeasure for various Distributions

Then

An advantage of this is that we can easily make sure Distributions has basemeasure covered (we can have tests for this), and users who want to can easily add basemeasure methods without the requirement to depend on MeasureTheory. For those packages (assuming basemeasure is implemented well) MeasureTheory should "just work"

devmotion commented 2 years ago

Ok, I have a proposal:

What about:

Alternatively, if MeasureBase is not lightweight enough one could extract a more lightweight MeasuresInterface package. But I assume it might not be worth it since, as far as I understand, MeasureBase is already a reduced base package and hence I fear breaking it up might lead to a somewhat incomplete package.

The main principle would be similar to ChainRules: Packages and users that want to work with base measures etc. have to depend on MeasureBase and should define basemeasure etc.

oschulz commented 2 years ago

A new package, MeasureInterface, depending on DensityInterface, and exporting basemeasure and some other very minimal primitives

Sounds good to me. Should we maybe prototype that stuff here in this PR first, though, so we have all the lightweight stuff in context, and then split up? I expect the line where to split will be obvious when the design is done.

A new DistributionMeasures, depending on MeasureInterface, DensityInterface, and Distributions. This can mostly focus on defining basemeasure for various Distributions

I would actually favour having that in Distributions. As long as it's not a lot of code, Distributions can take it, it's not exactly a small package. And having it in Distributions will encourage third-party dist developers to define their base measure. Also, when new dists are added to Distributions, we won't always need to do a separate PR to DistributionMeasures if it was integrated.

devmotion commented 2 years ago

The problem that I see with @cscherrer's proposal is the amount of type piracy in e.g. DistributionsMeasures it would create and the overall dependency structure. I would prefer if one just has to depend on MeasureBase (or a more lightweight interface package).

oschulz commented 2 years ago

What about ... Switching to @phipsgabler's traits

I agree, it's more flexible that way. I'll add it to this PR, then we can take a look at it in practice.

oschulz commented 2 years ago

The problem that I see with @cscherrer's proposal is the amount of type piracy in e.g. DistributionsMeasures it would create and the overall dependency structure. I would prefer if one just has to depend on MeasureBase (or a more lightweight interface package).

I agree - I guess that more lightweight package (MeasureBase is too heavy I think) would be @cscherrer's proposed MeasureInterface (I like the name). Distributions would depend on it and dist-specific measures (like for Dirichlet) would live in Distributions, right next to "their" distribution.

cscherrer commented 2 years ago

I like the general idea of @phipsgabler's suggestion a lot, we just need some time to think through how this will interface with MT, especially in terms of basemeasure

* Keeping `basemeasure` and measures in MeasureBase

* Moving Distributions-specific definitions of `basemeasure` etc. from MeasureBase/Theory to Distributions

So Distributions would depend on MeasureBase?

Alternatively, if MeasureBase is not lightweight enough one could extract a more lightweight MeasuresInterface package. But I assume it might not be worth it since, as far as I understand, MeasureBase is already a reduced base package and hence I fear breaking it up might lead to a somewhat incomplete package.

Our compat section is currently

[compat]
ConcreteStructs = "0.2"
ConstructionBase = "1.3"
FillArrays = "0.12"
KeywordCalls = "0.2"
LogExpFunctions = "0.3"
MLStyle = "0.4"
MappedArrays = "0.4"
PrettyPrinting = "0.3"
Tricks = "0.1"
julia = "1.3"

We could probably pare this down some more:

The main principle would be similar to ChainRules: Packages and users that want to work with base measures etc. have to depend on MeasureBase and should define basemeasure etc.

Right, that was the original intent of MeasureBase.

The problem that I see with @cscherrer's proposal is the amount of type piracy in e.g. DistributionsMeasures it would create and the overall dependency structure. I would prefer if one just has to depend on MeasureBase (or a more lightweight interface package).

Great catch, thanks

I agree - I guess that more lightweight package (MeasureBase is too heavy I think) would be @cscherrer's proposed MeasureInterface (I like the name). Distributions would depend on it and dist-specific measures (like for Dirichlet) would live in Distributions, right next to "their" distribution.

I'm a little confused now. If Distributions will depend on MeasureInterface, how is that different than just adding basemeasure to DensityInterface?

oschulz commented 2 years ago

I agree with @devmotion - in the end a lightweight measure-oriented package, extending DensityInterface will be more flexible than adding too much measure theory to DensityInterface itself.

I like @cscherrer 's proposed name MeasureInterface. I would see it living in JuliaMath, eventually.

For convenience though (so we don't have to hop between too many different PRs), and to see how everything will mesh, it might be easiest to prototype this lightweight measures interface within this PR initially (and mark parts of it with a comment) to see how it all fits together.

devmotion commented 2 years ago

If Distributions will depend on MeasureInterface, how is that different than just adding basemeasure to DensityInterface?

Yes, I would suggest that Distributions depends on MeasureBase or, if it is not deemed lightweight enough, on some MeasureInterface. But as I said I would actually think MeasureBase would be preferable. The main reason for this suggestion is that my impression of this draft is that it will introduce a reduced half-baked version of MeasureBase in DensityInterface. However,

So I think it would be cleaner to demand that packages that want to define explicit base measures and hence want their densities to be usable in the MeasureTheory context have to depend on MeasureBase (or MeasureInterface). And that we only make sure here that DensityInterface plays nicely with MeasureBase.

oschulz commented 2 years ago

@devmotion: Yes, I would suggest that Distributions depends on MeasureBase or, if it is not deemed lightweight enough, on some MeasureInterface.

MeasureBase would add ConcreteStructs, ConstructionBase, KeywordCalls, MLStyle, MappedArrays, PrettyPrinting and Tricks to the transitive deps of Distributions, in addition to itself. Also (run in the same session):

julia> @time using Distributions
  0.500227 seconds (1.35 M allocations: 87.835 MiB, 20.01% compilation time)

julia> @time using MeasureBase
  0.226225 seconds (418.94 k allocations: 26.701 MiB)

So we'd add up to 45% to the load time of Distributions. And I expect Distributions will need only the existence of basemeasure and a very few primitive measure types.

If we don't want a separate MeasureInterface - would it be possible to make MeasureBase a bit more lightweight?

cscherrer commented 2 years ago

I addressed that here @oschulz : https://github.com/JuliaMath/DensityInterface.jl/pull/9#issuecomment-965762958

oschulz commented 2 years ago

I addressed that here @oschulz : #9 (comment)

Oh right! Sounds good!

oschulz commented 2 years ago

You're right, it would be nicer to have MeasureTheory and just a single lightweight interface package for it, instead of having both MeasureBase and MeasureBaseLight (a.k.a MeasureInterface).

devmotion commented 2 years ago

Regarding MeasureBase dependencies: Distributions already depends on FillArrays and LogExpFunctions, so these dependencies are completely fine for Distributions. I don't know what the heaviest dependencies are; I haven't used MLStyle (I think) but I heard that it is better than MacroTools.

cscherrer commented 2 years ago

I don't know what the heaviest dependencies are; I haven't used MLStyle (I think) but I heard that it is better than MacroTools.

MLStyle is very nice, but it does have more compile-time cost:

julia> @time using MacroTools
  0.019664 seconds (36.78 k allocations: 3.666 MiB, 29.87% compilation time)

# Fresh instance

julia> @time using MLStyle
  0.178419 seconds (258.29 k allocations: 18.214 MiB, 4.36% compilation time)

(edited, I originally mistakenly posted the time including initial precompilation)

mschauer commented 2 years ago

Is the culprit MLStyle itself or the DataFrames dependency it has?

devmotion commented 2 years ago

It doesn't depend on DataFrames, it's only a test dependency. It seems MLStyle does not depend on any packages: https://github.com/thautwarm/MLStyle.jl/blob/master/Project.toml

oschulz commented 2 years ago

It doesn't depend on DataFrames, it's only a test dependency.

Whew - depending on DataFrames would not be fun ... :-)

cscherrer commented 2 years ago

Ok, so we use MLStyle for macros, specifically

  1. To declare a new parameterized measure, e.g. @parameterized Normal(μ,σ)
  2. To build a new "Half" measure, e.g. @half Normal

But I think these can be moved entirely to MeasureTheory.

We'll also need to be sure we're consistent with Distributions for things like *(::Real, ::Distribution) (see https://github.com/cscherrer/MeasureTheory.jl/issues/170)

Currently the biggest concerns are

  1. Parts of the MeasureBase implementation are not yet stabilized, and
  2. Lots of these tools are intended to work with Distributions, with those implementations currently in MeasureTheory (type piracy, yes, but the only alternative was having a Distributions dependency)

My intent with MeasureBase was to collect tools that are central to working with measures but don't require many external dependencies. It's definitely not ultra-lightweight. There are tools here for

All of these are connected with "smart constructors", so composing measures in different ways always tries to give a reasonable representation.

This is... kind of a lot. If Distributions is to depend on this, we need to be sure

  1. It's stable (or move unstable components to MeasureTheory), and
  2. Distributions will be usable with all appropriate combinators
devmotion commented 2 years ago

or move unstable components to MeasureTheory

Naively I would assume this would be the easiest way forward right now, but probably it would require some effort to separate and move things to MeasureTheory without breaking MeasureTheory and MeasureBase.

For some things in your list there exist also, probably more ad-hoc, alternatives in Distributions. E.g., truncated distributions, product distributions (hopefully more general ones at the end of this week :crossed_fingers:), mixture distributions, affine transformations, and convolutions of random variables are already supported by Distributions, so it might be easier to keep things separate here for now.

oschulz commented 2 years ago

@devmotion , @cscherrer , @mschauer, @phipsgabler, I feel there's a consensus evolving here, so I have updated this PR to what I hope reflects the state of the discussion:

Instead of d, I use object as an argument for the "thing that is or has a density" now; I hope it carries less of an "it's a density" implication.

I chose densitykind because something like densitytype(object) or similar might be understood to return the actual type of the density of object, instead of providing information regarding the relationship between object and density. But the function name is of course up for discussion, as are the names of the types it can return.

Replacing hasdensity by densitykind will be a breaking change, but DensityInterface is young and Distributions could support both the old and the new version quite easily via @static if isdefined(DensityInterface, :densitykind) ... else ... end in two places or so.
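The two-branch compat pattern mentioned above could look roughly like this. To keep the sketch self-contained it uses a stand-in module instead of the real DensityInterface, and a dummy type instead of ::Distribution; the real code would extend DensityInterface itself:

```julia
# Stand-in for DensityInterface's old API (Bool trait only), so the sketch
# runs without the real package:
module FakeDensityInterface
    hasdensity(object) = false
end

struct MyDist end  # stands in for a Distributions type

# Support old and new API with one @static branch; only the taken branch
# is ever compiled:
@static if isdefined(FakeDensityInterface, :densitykind)
    FakeDensityInterface.densitykind(::MyDist) = FakeDensityInterface.IsDensity
else
    FakeDensityInterface.hasdensity(::MyDist) = true
end

FakeDensityInterface.hasdensity(MyDist())  # true
```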

cscherrer commented 2 years ago

or move unstable components to MeasureTheory

Naively I would assume this would be easiest way forward right now but probably it would require some effort to separate and move things to MeasureTheory without breaking MeasureTheory and MeasureBase.

Maybe harder than not breaking (it will be a breaking release anyway, at least for MB) is determining what should go where, in a way that we'd stick with for a long time. We could just change things whenever, but that makes it hard for anything downstream.

For some things in your list there exist also, probably more ad-hoc, alternatives in Distributions. E.g., truncated distributions, product distributions (hopefully more general ones at the end of this week crossed_fingers), mixture distributions, affine transformations, and convolutions of random variables are already supported by Distributions, so it might be easier to keep things separate here for now.

Maybe more ad-hoc, but much more mature, well-documented, and well-tested. In some cases things are just fundamentally incompatible (Normal() for us is a different type than Normal(0.0, 1.0)), but outside of that I hope we can eventually find a path to get some of this into Distributions, and to make it generally painless for users to mix and match.

oschulz commented 2 years ago

Apologies up front, @devmotion , @cscherrer , @mschauer and @phipsgabler, this is a long post. I do hope it will help clear the path forward, though - bear with me ...

Semantics of MeasureBase.[log]density and MeasureBase.basemeasure

Summary: It turns out (@cscherrer and @mschauer, please correct me if I'm wrong) that DensityInterface.[log]densityof(m, x) does not have the same semantics as the current MeasureBase.density(m, x) at all. Likewise, MeasureBase.basemeasure does not have the semantics we mean when we say "base measure" in the DensityInteface docs.

As an example, for Bayesian inference DensityInterface users will (I think) expect this behavior from DensityInterface.logdensityof:

posterior = ∫exp(log_likelihood, prior)

DensityInterface.logdensityof(posterior, x) ==
    log_likelihood(x) + DensityInterface.logdensityof(prior, x)

But the current behavior of MeasureBase.logdensity is

posterior = ∫exp(log_likelihood, prior)

MeasureBase.logdensity(posterior, x) == log_likelihood(x)

MeasureBase also offers prior ⊙ likelihood which (if I understand correctly) generates an equivalent density to ∫(likelihood, prior), but will include the prior value in the result of MeasureBase.logdensity.

In general, MeasureBase.logdensity(m, x) can return different values for equivalent and equally normalized measures, e.g.

MeasureBase.logdensity(MeasureTheory.Normal(), x) != MeasureBase.logdensity(Distributions.Normal(), x)

That's just because MeasureBase.basemeasure does not mean "natural" or "intuitive" base measure but rather (I would say) "closest" base measure:

basemeasure(∫exp(log_likelihood, prior)) == prior
basemeasure(MeasureTheory.Normal()) == 0.398942 * Lebesgue(ℝ)
basemeasure(Distributions.Normal()) == Lebesgue(ℝ)

Within MeasureTheory, this is of course perfectly fine. Users should use MeasureBase.logdensity(m, base, x) to clearly state in respect to what base measure they want a density.

For DensityInterface.[log]densityof though, I do believe that we want the term "base measure" to mean "intuitive/natural base measure", and we do really want [log]densityof to return the same value at the same x for equivalent and equally normalized measures. Anything else could lead to silently incorrect results in many applications we're targeting with this API - especially since we don't require users of DensityInterface to be aware of measure theory concepts.

So the DensityInterface interpretation of "base measure" (in the sense of "what [log]densityof uses implicitly") is not MeasureBase.basemeasure. It is similar (identical?) to MeasureBase.rootmeasure instead.
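The distinction can be sketched with placeholder types: rootmeasure amounts to iterating basemeasure until it reaches a fixed point. This mirrors, but is not, MeasureBase's actual implementation:

```julia
struct Lebesgue end        # plays the role of Lebesgue(ℝ)
struct WeightedMeasure{M}  # plays the role of c * Lebesgue(ℝ)
    logweight::Float64
    base::M
end
struct SketchNormal end    # plays the role of MeasureTheory.Normal()

basemeasure(::Lebesgue) = Lebesgue()
basemeasure(m::WeightedMeasure) = m.base
basemeasure(::SketchNormal) = WeightedMeasure(log(0.398942), Lebesgue())

# "Root" measure: follow basemeasure until it no longer changes.
function rootmeasure(m)
    b = basemeasure(m)
    b === m ? m : rootmeasure(b)
end

basemeasure(SketchNormal())  # the weighted measure, i.e. the "closest" base
rootmeasure(SketchNormal())  # Lebesgue()
```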

Equivalent of DensityInterface.[log]densityof in MeasureBase

There is, from what I understand, an exact equivalent to what we want DensityInterface.[log]densityof to mean/do in MeasureBase: it's actually [log]pdf. Despite its name, it does return non-normalized density values for non-normalized measures. If I understood @cscherrer correctly, they're not perfectly happy with [log]pdf as a name because of that. One reason why we created DensityInterface.[log]densityof is just that - to have something more general than [log]pdf, something that won't be weird to use for non-normalized cases.

Proposal

old proposal

Note: Updated proposal below

devmotion commented 2 years ago

For DensityInterface.[log]densityof though, I do believe that we want the term "base measure" to mean "intuitive/natural base measure", and we do really want [log]densityof to return the same value at the same x for equivalent and equally normalized measures. Anything else could lead to silently incorrect results in many applications we're targeting with this API - especially since we don't require users of DensityInterface to be aware of measure theory concepts. So the DensityInterface interpretation of "base measure" (in the sense of "what [log]densityof uses implicitly") is not MeasureBase.basemeasure. It is similar (identical?) to MeasureBase.rootmeasure instead. We change the phrasing "base measure" to "root measure" or something similar in the LogDensity docs.

I don't agree with these points. I think that often (such as in Distributions) the implicit base measure of densities that don't really deal with or think about measure theory will be a "natural" base measure. But this is not an assumption of the DensityInterface API and not something they have to do.

Base measure is the common term for the reference measure in the Radon-Nikodym derivative. Of course, there are different choices possible and e.g. for computational efficiency MeasureTheory chose base measures such that constant terms are removed from the calculation of the density. But it is much more common to use the Lebesgue and counting measure as base measure.

So I think the name "base measure" is the correct term here and I think "root measure" will be quite confusing since AFAIK it is not commonly used in this context (at least I have only heard the term base measure in the context of Radon-Nikodym derivatives).

oschulz commented 2 years ago

I don't agree with these points.

But do you agree that for

posterior = ∫exp(log_likelihood, prior)

we want

DensityInterface.logdensityof(posterior, x) ==
    log_likelihood(x) + DensityInterface.logdensityof(prior, x)

and not

DensityInterface.logdensityof(posterior, x) ==
    log_likelihood(x)

and that MeasureBase should implement [log]densityof for measures the way it currently implements [log]pdf?

So I think the name "base measure" is the correct term here

Ah, I get it - "base measure" in our docs means "what we use in the implicit RD-derivative". It doesn't mean base as in "basic".

Should we change wording like "implicit base measure" to "implicit natural base measure" or so?

oschulz commented 2 years ago

Ok, here's an updated proposal:

1)

2) As a first step, MeasureBase simply defines

    DensityInterface.logdensityof(m::AbstractMeasure, x) = logpdf(m, x)

Later on, MeasureTheory could replace `[log]pdf` with `DensityInterface.[log]densityof`. A problem is that `[log]densityof` and `[log]density` are very similar (n.b. there's also the awkward conflict with `Plots.density`). It's of course the prerogative of the MeasureTheory team to decide on a possible renaming - technically, no name changes are required. The measure theory packages could also use an internal alias for `DensityInterface.[log]densityof` to keep the code readable and to avoid confusion between `[log]density` and `[log]densityof`.

3) We try to get MeasureBase light enough for it to become an acceptable dependency for Distributions. Distributions takes it on as a dependency and defines MeasureBase.basemeasure for all distributions, as well as additional base measures where required (e.g. for Dirichlet).

cscherrer commented 2 years ago

Thanks @oschulz . I'd like to restate some things, to be sure I understand. We'll also need to make this precise, even if casual users aren't expected to understand this precision.

Calls to logdensityof are implicitly with respect to some base measure. To make this a little more concrete, let's say we're talking about a density f, a measure mu (which may happen to be a Distribution, or something else), and base measure alpha. Then it seems

  1. There is a class of measures that's "preferred" in some way, and includes Lebesgue measure and counting measure. It probably also includes any "rigid pushforward" embedding Lebesgue measure into a new space. We should characterize this class better.
  2. If f has a known normalization, then it is normalized, and alpha is required to be one of these "preferred" measures
  3. If f is known to be normalizable but its normalization is not known, it can still be treated as a distribution. In this case alpha may not be "preferred" (there's no way around this). It seems like in this case you want something like "alpha is a preferred measure, weighted by some constant"

Hmm, I think I see a way to make things more consistent between us. logdensityof can take a keyword argument indicating whether it is taken with respect to (a constant multiple of) one of these preferred measures. This can default to true, and using false can be considered "expert mode".
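A minimal sketch of that keyword-argument idea; the names, the signature, and the internal helpers are all hypothetical:

```julia
struct SketchNormal end  # stands in for MeasureTheory.Normal()

# Density w.r.t. the "preferred" base measure (Lebesgue), i.e. the usual logpdf:
_logdensity_preferred(::SketchNormal, x) = -x^2 / 2 - log(sqrt(2π))
# Density w.r.t. an internal base measure with the constant factored out:
_logdensity_internal(::SketchNormal, x) = -x^2 / 2

# Keyword argument selects the reference; defaulting to the preferred measure
# keeps casual users on the safe path, with `preferred = false` as "expert mode":
logdensityof(m, x; preferred::Bool = true) =
    preferred ? _logdensity_preferred(m, x) : _logdensity_internal(m, x)

logdensityof(SketchNormal(), 0.0)                    # ≈ log(0.398942...)
logdensityof(SketchNormal(), 0.0; preferred = false) # == 0.0
```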

devmotion commented 2 years ago

I am still wondering: isn't this a discussion that belongs to MeasureBase and in what way is it relevant for DensityInterface?

My view is:

So in my opinion the next action points would be:

cscherrer commented 2 years ago

I think there are ways to hide complexity, to make things easier for new or casual users. But ignoring complexity won't work.

Consider logdensityof methods defined in MeasureTheory. For example, our Normal() has a different base measure than that of Distributions.Normal(). Because of this, logdensityof will currently return different results. From discussions with @oschulz , there's a sort of contract about this, even if end users can consider this to be implicit.

If the only rule is that there are no rules, I don't see any way we can get things to work together. We need to get to a logdensityof

The alternative to this is that we use logdensityof only superficially, as a way of connecting to underspecified measures there's no way to reason about. But this will limit the ecosystem as a whole, and make interoperability difficult.

oschulz commented 2 years ago

@devmotion DensityInterface does not care about with respect to which base measure densities are defined

In particular, it is neither assumed nor enforced that the base measures are e.g. Lebesgue or counting measures (even though in practice most likely usually these will be the base measures just because they are the most common base measures)

Sure - though I think users will thank us if logdensityof(MeasureTheory.Normal(), x) == logdensityof(Distributions.Normal(), x). It is of course up to MeasureTheory to define the default base measure. My feeling though is that users would profit most if [log]densityof would behave like the (not ideally named) [log]pdf in MeasureBase with respect to the choice of base measure.

So in my opinion the next action points would be: Change hasdensity to an extendable trait system (but without measure related types)

I agree - that should be the current state of this PR right now. In my proposal above I just wanted to include possible steps for related packages, to have things in context.

Add a measure theoretical extension of the DensityInterface trait system to MeasureBase

Yes - as a first step, it could simply be

abstract type IsMeasure <: IsOrHasDensity end
export IsMeasure

@inline densitykind(::AbstractMeasure) = IsMeasure

DensityInterface.logdensityof(m::AbstractMeasure, x) = logpdf(m, x)
DensityInterface.densityof(m::AbstractMeasure, x) = pdf(m, x) # If pdf(::AbstractMeasure, x) is defined.

That would already give us a lot of interoperability. I think I could make BAT.jl support measures as priors and as outcomes for model functions based on just that.
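For illustration, a downstream package could then stay completely generic over its inputs. A minimal, hypothetical sketch (`crude_map_estimate` is an invented name, not part of BAT.jl or any other package):

```julia
using DensityInterface

# Hypothetical helper: pick the candidate with the highest log-density.
# `target` can be anything implementing `logdensityof` — a Distribution,
# an AbstractMeasure (via glue code like the above), or a plain density
# wrapped with `logfuncdensity`.
function crude_map_estimate(target, candidates)
    logd = logdensityof(target)  # one-argument form: returns x -> logdensityof(target, x)
    argmax(logd, candidates)
end

d = logfuncdensity(x -> -x^2 / 2)
crude_map_estimate(d, -1:0.5:1)  # → 0.0
```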

Remove discussion of measure theoretical concepts in DensityInterface (is there any?) and instead refer to the documentation of MeasureBase (is there any?)

No more than before. We have one statement about the Radon–Nikodym derivative with respect to a base measure and a footnote linking to MeasureTheory in the main docs. And similar statements in the docstrings of [log]densityof. But we had those already, and I think they are helpful. Though the parts in the docstrings could be moved to MeasureBase - maybe the part in the main docs, too.

devmotion commented 2 years ago

I don't want to hide or ignore complexity. I just suggest that MeasureBase should be the measure-theoretical extension of DensityInterface.

As I tried to outline above, in MeasureBase you would define additional traits such as ismeasure (makes me even wonder if we should just keep hasdensity) and additional functions such as basemeasure etc. These would allow to query base measures etc. for all densities that implement the MeasureBase interface/extension.

If one just implements the DensityInterface interface, one does not have to state any base measures. This is on purpose and works just fine if you don't want to reason about measure theoretical concepts and if you use the densities appropriately (I don't see a problem with the latter, you can always break stuff or use things incorrectly, so I think it's fine to make it the users' responsibility).

If one implements the MeasureBase interface as well, packages that care about base measures etc. can query them, they will know it is a measure and whatever else is part of the interface. And then if you work with user-provided inputs you could check or dispatch on these additional properties to ensure that basemeasure etc. are defined appropriately.

I really think we should keep these two levels separate. And I don't see a problem with this approach. It is clear that MeasureTheory needs more information, base measures, etc. and hence can't just work with arbitrary densities. But exactly this additional information would be provided by densities/distributions/etc. that implement the MeasureBase interface.

oschulz commented 2 years ago

As I tried to outline above, in MeasureBase you would define additional traits such as ismeasure (makes me even wonder if we should just keep hasdensity)

hasdensity will already be history with this PR in its current state.

oschulz commented 2 years ago

@cscherrer Thanks @oschulz . I'd like to restate some things, to be sure I understand. We'll also need to make this precise, even if casual users aren't expected to understand this precision.

Calls to logdensityof are implicitly with respect to some base measure.

Yes, except if the object of the call is already a density (an object that represents a plain density function but is not fixed in regard to log/non-log value).

To make this a little more concrete, let's say we're talking about a density f, a measure mu (which may happen to be a Distribution, or something else), and base measure alpha. Then it seems

Yes. It's up to MeasureBase to define what "preferred" means. I would argue though that it should be intuitive to the average user. And that it should result in

posterior = ∫exp(log_likelihood, prior)

DensityInterface.logdensityof(posterior, x) ==
    log_likelihood(x) + DensityInterface.logdensityof(prior, x)

Otherwise, it will be hard to support measures as priors in packages like BAT (because BAT would then need to use a different function to access the log-density value of a measure, and the whole point is to be able to use logdensityof throughout).
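The identity above can be checked with plain DensityInterface tools. A minimal sketch that emulates the hypothetical ∫exp constructor with logfuncdensity (the densities here are made up for illustration):

```julia
using DensityInterface

log_likelihood(x) = -abs(x)
prior = logfuncdensity(x -> -x^2 / 2)

# Emulate `posterior = ∫exp(log_likelihood, prior)` by hand:
posterior = logfuncdensity(x -> log_likelihood(x) + logdensityof(prior, x))

x = 0.7
logdensityof(posterior, x) ≈ log_likelihood(x) + logdensityof(prior, x)  # true
```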

If f has a known normalization, then it is normalized, and alpha is required to be one of these "preferred" measures

I would say no! If f is not normalized, I would very much want [log]densityof to return the non-normalized value, independent from whether a normalization factor is known or not.

Hmm, I think I see a way to make things more consistent between us. logdensityof can take a keyword argument indicating whether it is taken with respect to (a constant multiple of) one of these preferred measures [...] "expert mode"

I don't think it should use a keyword. I would rather say that if users need more control, they should use the three-argument MeasureBase.density. That would be the "expert mode". :-)

cscherrer commented 2 years ago

Thanks @devmotion , I'm still trying to figure out how we're talking past each other. I think this statement shows a difference in our views:

It is clear that MeasureTheory needs more information, base measures, etc. and hence can't just work with arbitrary densities.

The understanding I've come to since starting work on MeasureTheory is that "arbitrary densities" doesn't actually mean anything. A density always has a base measure, whether or not you bring it up.

Maybe it helps to see this:

logdensity(d::Normal{()}, x) = -x^2 / 2

That's our implementation. We call this logdensity because that's exactly what it is. But it clearly conflicts with the expectations of DensityInterface. We need a way out of this, and I see only two options:

  1. Make the connection to DensityInterface.logdensityof very superficial, so we don't lose our current expressiveness
  2. Add a way to indicate we're in expert mode

I'd much rather have a way to strongly connect with this package. Maybe there are other options, but I'm not seeing them yet.
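For concreteness, the two conventions can be compared numerically. A minimal sketch of the standard-normal case (values written out by hand, no packages involved):

```julia
# Plain-Julia illustration of the conflict:
x = 1.0
mt_logdensity = -x^2 / 2                  # MeasureTheory-style logdensity
dists_logpdf  = -x^2 / 2 - log(sqrt(2π))  # Distributions-style logpdf of Normal()
mt_logdensity == dists_logpdf             # false — the two conventions differ
                                          # by the constant log(√(2π))
```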

oschulz commented 2 years ago

@cscherrer Maybe it helps to see this: logdensity(d::Normal{()}, x) = -x^2 / 2. That's our implementation.

But that implementation goes together with

julia> basemeasure(MeasureTheory.Normal())
0.398942 * Lebesgue(ℝ)

which is why I would say that MeasureTheory.Normal() is equivalent to Distributions.Normal(), and both have norm one.

Which is why

julia> density(MeasureTheory.Normal(), 0) ≈ density(Distributions.Normal(), 0)
false

but

julia> pdf(MeasureTheory.Normal(), 0) ≈ pdf(Distributions.Normal(), 0)
true

julia> pdf(2 * MeasureTheory.Normal(), 0) ≈ 2 * pdf(Distributions.Normal(), 0)
true

I believe the non-expert user we target with DensityInterface will be surprised if the logdensityof result differs for two equivalent measures that have the same norm. And so I think the choice of base measure for logdensityof for AbstractMeasure should be the same as the one logpdf uses.