RemyDegenne / testing-lower-bounds

Information theory and hypothesis testing, in Lean
https://remydegenne.github.io/testing-lower-bounds/blueprint
Apache License 2.0

Refactors of f-divergences #154

Open RemyDegenne opened 2 weeks ago

RemyDegenne commented 2 weeks ago

Current definition

def fDiv (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
  if ¬ Integrable (fun x ↦ f ((∂μ/∂ν) x).toReal) ν then ⊤
  else ∫ x, f ((∂μ/∂ν) x).toReal ∂ν + derivAtTop f * μ.singularPart ν .univ
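In informal notation, when the integrand is integrable this computes

$$D_f(\mu, \nu) = \int f\Big(\frac{d\mu}{d\nu}\Big)\, d\nu + f'(\infty)\cdot(\mu_{\perp\nu})(\mathcal{X}),$$

where $f'(\infty)$ stands for derivAtTop f and $\mu_{\perp\nu}$ for μ.singularPart ν; otherwise the value is ⊤.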

Issues

Constraints

f has to have domain ℝ, and not the more natural ℝ≥0∞, because otherwise we can't talk about its derivatives and other basic things break.
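A minimal illustration of this constraint, using only Mathlib's deriv (the checks below are not from the repository):

import Mathlib

open scoped ENNReal

-- `deriv` exists for real-valued functions of a real variable:
#check (deriv : (ℝ → ℝ) → ℝ → ℝ)

-- But there is no `deriv` for `ℝ≥0∞`-valued functions, since `ℝ≥0∞` is not a
-- normed space over `ℝ`; the following would fail to elaborate:
-- #check (deriv : (ℝ → ℝ≥0∞) → ℝ → ℝ≥0∞)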

Refactor 1

def rightLimZero (f : ℝ → ℝ) : EReal := Function.rightLim (fun x ↦ (f x : EReal)) (0 : ℝ)

def fDiv' (f : ℝ → ℝ) (μ ν : Measure α) : EReal :=
  if ¬ IntegrableOn (fun x ↦ f (μ.rnDeriv ν x).toReal) (ν.singularPartSet μ)ᶜ ν then ⊤
  else ∫ x in (ν.singularPartSet μ)ᶜ, f (μ.rnDeriv ν x).toReal ∂ν
    + derivAtTop f * μ.singularPart ν .univ + rightLimZero f * ν.singularPart μ .univ

This does not fundamentally change the definition. It simply splits off a third term, corresponding to the set where the Radon-Nikodym derivative is zero, and uses the right limit of f at 0 there.
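Reading the three terms informally (treating ν.singularPartSet μ, as described above, as the set where the derivative vanishes):

$$D'_f(\mu, \nu) = \int_{\{\frac{d\mu}{d\nu} > 0\}} f\Big(\frac{d\mu}{d\nu}\Big)\, d\nu + f'(\infty)\cdot(\mu_{\perp\nu})(\mathcal{X}) + f(0^+)\cdot(\nu_{\perp\mu})(\mathcal{X}),$$

with $f(0^+)$ denoting rightLimZero f.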

This is a rather light refactor from a conceptual point of view. It's more a fix than a refactor. However, it will make all computations about fDiv heavier.

Refactor 2

def fDiv (f : ℝ → ℝ≥0∞) (μ ν : Measure α) : ℝ≥0∞ :=
    ∫⁻ x, f ((∂μ/∂ν) x).toReal ∂ν + derivAtTop f * μ.singularPart ν .univ

(in which derivAtTop f would be changed to take values in ℝ≥0∞)

This much more radical refactor changes the codomain of f to ℝ≥0∞, replaces the Bochner integral and its integrability case split by a Lebesgue integral ∫⁻, and makes fDiv itself ℝ≥0∞-valued.

That definition was not used from the start because it seemed to me that allowing f to take negative values was important. After all, every book writes that the KL divergence is the f-divergence of x * log x, which takes negative values. However, on probability measures it is always possible to add an affine function without changing the divergence while making the function nonnegative: x * log x + 1 - x gives the same f-divergence on probability measures.
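The affine-shift argument, spelled out for probability measures with μ ≪ ν (a quick check, assuming the integrals below exist): for any constant a,

$$\int \Big(f\Big(\frac{d\mu}{d\nu}\Big) + a\Big(\frac{d\mu}{d\nu} - 1\Big)\Big)\, d\nu = \int f\Big(\frac{d\mu}{d\nu}\Big)\, d\nu + a\,(\mu(\mathcal{X}) - \nu(\mathcal{X})) = \int f\Big(\frac{d\mu}{d\nu}\Big)\, d\nu,$$

since μ(𝒳) = ν(𝒳) = 1. Taking a = -1 with f(x) = x * log x gives the nonnegative function x * log x + 1 - x.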

We would have to choose one of these two options:

  1. Change the definition of KL to be that f-divergence. The KL definition would then be something like the following (see the computation after this list for where the correction terms come from):
    def kl (μ ν : Measure α) : ℝ≥0∞ :=
    if μ ≪ ν ∧ Integrable (llr μ ν) μ then ENNReal.ofReal (∫ x, llr μ ν x ∂μ) + ν .univ - μ .univ  else ∞
  2. Write that KL is equal to an f-divergence only on probability measures.
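Where the extra ν .univ - μ .univ in option 1 comes from (a sketch, for finite measures with μ ≪ ν and the required integrability): with f(x) = x * log x + 1 - x,

$$\int f\Big(\frac{d\mu}{d\nu}\Big)\, d\nu = \int \frac{d\mu}{d\nu}\,\log\frac{d\mu}{d\nu}\, d\nu + \nu(\mathcal{X}) - \mu(\mathcal{X}) = \int \mathrm{llr}(\mu,\nu)\, d\mu + \nu(\mathcal{X}) - \mu(\mathcal{X}),$$

which is the expression appearing in the candidate kl definition above.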

We could change the type of all divergences to use ℝ≥0∞ if we go for option 1 for all of them.

An advantage of that second refactor is that it would presumably connect better to integrals of statistical informations (which are necessarily nonnegative).

RemyDegenne commented 2 weeks ago

There is an issue with f : ℝ → ℝ≥0∞, which is that we can't talk about its right derivative, since ℝ≥0∞ is not a real normed space. We would have to go to ℝ and then back to ℝ≥0∞, and everything related to slopes and convexity breaks.
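A minimal sketch of that detour (hypothetical name; only derivWithin and ENNReal.ofReal below are Mathlib functions):

import Mathlib

open scoped ENNReal

/-- Hypothetical right derivative for an `ℝ≥0∞`-valued function: go to `ℝ` via
`toReal`, differentiate from the right, and coerce back with `ENNReal.ofReal`.
Note the information loss: negative slopes are clamped to `0`, and points where
`f` is infinite are sent to `0` by `toReal`, which is part of why slope and
convexity arguments break. -/
noncomputable def rightDerivENNReal (f : ℝ → ℝ≥0∞) (x : ℝ) : ℝ≥0∞ :=
  ENNReal.ofReal (derivWithin (fun y ↦ (f y).toReal) (Set.Ioi x) x)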