JuliaDecisionFocusedLearning / InferOpt.jl

Combinatorial optimization layers for machine learning pipelines
https://juliadecisionfocusedlearning.github.io/InferOpt.jl/
MIT License

Perturbed oracles with variance reduction #80

Closed BatyLeo closed 1 year ago

BatyLeo commented 1 year ago

This PR reworks all the `AbstractPerturbed`-related structs.
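For context, a minimal sketch of what a perturbed layer computes (hypothetical names and oracle, not the actual InferOpt API): a combinatorial maximizer is smoothed by averaging its output over random perturbations of the objective direction.

```julia
using Random, Statistics

# Hypothetical stand-in for a combinatorial oracle: argmax of <θ, y> over
# the vertices of the unit hypercube, i.e. componentwise thresholding.
maximizer(θ) = Float64.(θ .>= 0)

# Additive perturbation: average the oracle's output over M Gaussian
# perturbations θ + ε⋅Z of the objective direction, which smooths the
# piecewise-constant maximizer into a differentiable layer.
function perturbed_mean(maximizer, θ; ε=1.0, M=1000, rng=MersenneTwister(0))
    return mean(maximizer(θ .+ ε .* randn(rng, length(θ))) for _ in 1:M)
end

θ = [2.0, -2.0]
ŷ = perturbed_mean(maximizer, θ)
# ŷ[1] ≈ Φ(2) ≈ 0.98 and ŷ[2] ≈ Φ(-2) ≈ 0.02, instead of the hard [1, 0].
```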

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 91.22% and project coverage change: +0.67% :tada:

Comparison is base (44a3055) 84.75% compared to head (1e0d8ec) 85.42%. Report is 3 commits behind head on main.


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #80      +/-   ##
==========================================
+ Coverage   84.75%   85.42%   +0.67%
==========================================
  Files          18       19       +1
  Lines         341      391      +50
==========================================
+ Hits          289      334      +45
- Misses         52       57       +5
```

| Files Changed | Coverage Δ |
|---|---|
| `src/InferOpt.jl` | `100.00% <ø> (ø)` |
| `src/perturbed/perturbed_oracle.jl` | `85.71% <85.71%> (ø)` |
| `src/perturbed/abstract_perturbed.jl` | `86.20% <88.00%> (-3.80%)` :arrow_down: |
| `src/perturbed/additive.jl` | `88.23% <93.33%> (+4.90%)` :arrow_up: |
| `src/perturbed/multiplicative.jl` | `88.23% <93.33%> (+4.90%)` :arrow_up: |
| `src/imitation/fenchel_young_loss.jl` | `87.75% <100.00%> (ø)` |


gdalle commented 1 year ago

I won't have time to review this, and I probably need to take a step back to focus on my own research. It would be nice if you and @LouisBouvier could rely on each other for PR reviewing, can you bring him up to speed?

gdalle commented 1 year ago

As far as custom distributions are concerned, it would be nice but we need a gradlogpdf function if memory serves me well. This is not necessarily easy to find

gdalle commented 1 year ago

And I agree that Fenchel Young is messy. This implementation tried to minimize code duplication, but possibly at the expense of clarity.

BatyLeo commented 1 year ago

As far as custom distributions are concerned, it would be nice but we need a gradlogpdf function if memory serves me well. This is not necessarily easy to find

The `gradlogpdf` can either be given by the user or computed with autodiff; this is already the case in the new generic `PerturbedOracle` struct, but it cannot be used with `FenchelYoungLoss` at the moment.
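For illustration, a minimal sketch of what a user-supplied `gradlogpdf` enables, with the Gaussian score written out by hand (the names here are assumptions for the example, not the package's API):

```julia
using Random, Statistics

# For Z ~ N(θ, σ²I), the score ∇_θ log p(z; θ) is (z − θ)/σ²; for other
# distributions it would be user-supplied or obtained by autodiff of the logpdf.
gaussian_score(z, θ; σ=1.0) = (z .- θ) ./ σ^2

# Score-function (REINFORCE) estimator of ∇_θ E_{Z ~ N(θ, σ²I)}[f(Z)]:
# average f(z) ⋅ ∇_θ log p(z; θ) over M samples. No gradient of f is needed,
# which is why this works even when f wraps a combinatorial oracle.
function score_gradient(f, θ; σ=1.0, M=10_000, rng=MersenneTwister(0))
    zs = [θ .+ σ .* randn(rng, length(θ)) for _ in 1:M]
    return mean(f(z) .* gaussian_score(z, θ; σ=σ) for z in zs)
end

θ = [0.5]
g = score_gradient(z -> sum(z), θ)
# For f = sum, E[f(Z)] = sum(θ), so the true gradient is [1.0].
```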

gdalle commented 1 year ago

The gradlogpdf can either be given by the user or computed with autodiff

Will we have a problem with nested autodiff?

gdalle commented 1 year ago

Maybe to avoid random failures across Julia versions we could use StableRNGs in testing?
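The pattern would look something like this. StableRNGs provides `StableRNG(seed)`, whose stream is documented to be stable across Julia releases; the stdlib `MersenneTwister` stands in below so the snippet is self-contained (its stream is *not* guaranteed stable across versions):

```julia
using Random

# Thread an explicitly seeded RNG through every random call in a test,
# instead of relying on the global RNG. With StableRNGs this would be
# `rng = StableRNG(63)`; MersenneTwister is only a self-contained stand-in.
function noisy_estimate(rng::AbstractRNG; M=100)
    return sum(randn(rng) for _ in 1:M) / M
end

a = noisy_estimate(MersenneTwister(63))
b = noisy_estimate(MersenneTwister(63))
# Same seed ⇒ bit-identical results within one Julia version.
```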

BatyLeo commented 1 year ago

The gradlogpdf can either be given by the user or computed with autodiff

Will we have a problem with nested autodiff?

What do you mean by "nested autodiff"?

Maybe to avoid random failures across Julia versions we could use StableRNGs in testing?

I'm still investigating, but it seems that even within the same Julia version our tests are not deterministic. The failing test passes on my laptop about half the time.

gdalle commented 1 year ago

What do you mean by "nested autodiff" ?

Differentiating something with Zygote that has already been differentiated with it once. I don't think that will be the case here.

BatyLeo commented 1 year ago

Should we enable variance reduction by default? Currently it is not enabled.

gdalle commented 1 year ago

We're gonna tag a breaking release anyway, we might as well
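For intuition, one standard variance-reduction device for score-function estimators (a sketch of the general idea, not necessarily the exact scheme this PR implements): subtract a baseline `b` from `f`, which leaves the estimator unbiased because E[∇_θ log p(Z; θ)] = 0.

```julia
using Random, Statistics

# Per-sample score-function terms (f(z) − b)(z − θ)/σ² for scalar θ.
# Any constant baseline b leaves the mean unchanged, since E[(Z − θ)/σ²] = 0,
# but a b close to E[f(Z)] shrinks the variance of the estimator.
function score_terms(f, θ; b=0.0, σ=1.0, M=5_000, rng=MersenneTwister(0))
    zs = [θ + σ * randn(rng) for _ in 1:M]
    return [(f(z) - b) * (z - θ) / σ^2 for z in zs]
end

θ = 0.5
f = z -> z^2                              # E[f(Z)] = θ² + σ²; true gradient 2θ = 1
naive   = score_terms(f, θ)               # baseline 0
reduced = score_terms(f, θ; b=θ^2 + 1.0)  # baseline ≈ E[f(Z)]
# Both estimators average to ≈ 1.0, but the baselined one varies less.
```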

LouisBouvier commented 1 year ago

Can we also use variance reduction in the FY loss? We estimate an expectation by Monte Carlo there as well, so when we use more than one sample we could consider it?
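For reference, the perturbed Fenchel-Young loss is indeed itself a Monte-Carlo average, which is why the question applies. A hedged sketch with a hypothetical hypercube-vertex oracle (illustrative, not the package's implementation):

```julia
using Random, Statistics

# Hypothetical combinatorial oracle: argmax of <θ, y> over hypercube vertices.
maximizer(θ) = Float64.(θ .>= 0)

# Perturbed Fenchel-Young loss and its gradient, both Monte-Carlo averages:
#   L(θ; ȳ) ≈ (1/M) Σₘ <θ + ε⋅zₘ, y*(θ + ε⋅zₘ)> − <θ, ȳ>
#   ∇L(θ; ȳ) ≈ (1/M) Σₘ y*(θ + ε⋅zₘ) − ȳ
function fy_loss_and_grad(θ, ȳ; ε=1.0, M=1_000, rng=MersenneTwister(0))
    θs = [θ .+ ε .* randn(rng, length(θ)) for _ in 1:M]
    loss = mean(sum(θp .* maximizer(θp)) for θp in θs) - sum(θ .* ȳ)
    grad = mean(maximizer(θp) for θp in θs) .- ȳ
    return loss, grad
end

θ = [1.0, -1.0]
ȳ = [1.0, 0.0]                  # a feasible target vertex
ℓ, g = fy_loss_and_grad(θ, ȳ)
# ℓ ≥ 0 since the target is feasible; g ≈ [Φ(1), Φ(-1)] − ȳ.
```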

gdalle commented 1 year ago

Since questions are popping up on the Julia forum, is this ready to go?

BatyLeo commented 1 year ago

Since questions are popping up on the Julia forum, is this ready to go?

I think it is, Louis already reviewed it. Maybe check the `AbstractPerturbed` `rrule` once more?

gdalle commented 1 year ago

I trust you guys, you do the merge this time ;)

gdalle commented 1 year ago

Congrats, this was huge work!