TuringLang / Bijectors.jl

Implementation of normalising flows and constrained random variable transformations
https://turinglang.org/Bijectors.jl/
MIT License
200 stars 33 forks source link

Update to ChainRules 1 #198

Closed devmotion closed 3 years ago

devmotion commented 3 years ago

This PR updates Bijectors to ChainRules 1. Currently Bijector holds back CR in DynamicPPL (and hence also in Turing).

devmotion commented 3 years ago

The failing tests are caused by NNlib (same issue as described in https://github.com/TuringLang/DistributionsAD.jl/pull/198#issuecomment-905032503): The CR1 compatible versions of NNlib require Julia 1.6. However, in contrast to DistributionsAD Bijectors actually depends on NNlib and it's not just a test dependency. It seems we use NNlib.softmax and NNlib.softplus.

yebai commented 3 years ago

It’s probably ok to ignore test failures for Julia 1.3 here, I think. We can drop support for Julia 1.3 slightly later to avoid the maintenance burden since many users would be using more recent Julia releases anyway.

devmotion commented 3 years ago

It’s probably ok to ignore test failures for Julia 1.3 here, I think. We can drop support for Julia 1.3 slightly later to avoid the maintenance burden since many users would be using more recent Julia releases anyway.

If we update ChainRulesCore, the package will only be installable on Julia 1.6 due to NNlib. I reintroduced compatibility with ChainRulesCore 0.10 in which the changes in CR 1 were introduced but unfortunately it does not help since NNlib with Julia 1.3 is only compatible with CRC 0.9 (https://github.com/FluxML/NNlib.jl/blob/v0.7.18/Project.toml). So even if we do not drop support for Julia < 1.6 officially and ignore the tests, users with older Julia versions won't be able to use future Bijectors versions anymore (and hence they won't be able to use Turing with CRC 1).

So, I think we can either remove the NNlib dependency and implement softmax and softplus ourselves (in principle, we could/should use the versions in LogExpFunctions but we will have to implement AD support for them; which is either type piracy or would have to go into LogExpFunctions (for CR) and Tracker etc., so it takes time; related: https://github.com/FluxML/NNlib.jl/issues/252) or we drop support for Julia < 1.6 officially.

torfjelde commented 3 years ago

Should we run a poll on Slack to check how many people are actually using <1.6?

devmotion commented 3 years ago

Maybe we just drop support for it? There's already beta testing for 1.7 and hence soon it will become the LTS version. And it won't break anything for people on older versions, they just won't get any new features and updates of Bijectors.

torfjelde commented 3 years ago

Tbh I'm fine with it :shrug:

devmotion commented 3 years ago

This PR is not ready. If we decide to drop Julia < 1.6 we have to change the compat bounds and the tests.

However, I tend more and more to think that it's not a good idea to raise the bound and instead we should remove the NNlib dependency. LogExpFunctions supports CRC 1 and both softplus/log1pexp and softmax are defined in LogExpFunctions as well (hopefully at some point they will be the same, I don't think there's a good reason to have two implementations). More importantly though, dropping support for Julia < 1.6 affects more packages than just Turing and DynamicPPL. E.g., if we only allow Julia >= 1.6, then nobody will be able to use ParameterHandling + CRC1 on Julia < 1.6, even though CRC 1 supports Julia 1.0 and ParameterHandling Julia 1.3 (even though I wonder why ParameterHandling wants/has to depend on a heavy package such as Bijectors in the first place: https://github.com/invenia/ParameterHandling.jl/issues/24). So I think it's nice for the package ecosystem if we don't drop it (yet).

I'll open a PR and check if/how much is broken if we replace NNlib with LogExpFunctions.

torfjelde commented 3 years ago

I mentioned this in slack, but how about we move the bijector which depends on NNLib and NonLinearSolve to a different package, e.g. NormalizingFlows.jl or something (ideally we'd have a name that clearly relates to Bijectors.jl)?

EDIT: Maybe it's better to start by just moving the core-functionality to a BijectorsCore.jl package or something where we drop dependencies on:

This would also make it easier for something like MeasureTheory.jl to take it on as a dep.

We could then think about separating out further at later stages if it makes sense, e.g. NormalizingFlows.jl.

yebai commented 3 years ago

Maybe we can move the normalising flow, and other VI related stuff to AdvancedVI? Then, we can gradually reduce the dependency footprints of Bijectors until it is a lightweight package like AdvancedHMC. I slightly lean towards keeping Bijectors self-contained and lightweight, instead of spreading out the code into several packages.

torfjelde commented 3 years ago

Maybe we can move the normalising flow, and other VI related stuff to AdvancedVI? Then, we can gradually reduce the dependency footprints of Bijectors until it is a lightweight package like AdvancedHMC. I slightly lean towards keeping Bijectors self-contained and lightweight, instead of spreading out the code into several packages.

I don't like the idea of moving those to AdvancedVI.jl tbh. Those transformations are useful for more than just VI.

But what about the BijectorsCore.jl idea? I do think Bijectors.jl is too bloated atm.

devmotion commented 3 years ago

I replaced NNlib and StatsFuns with LogExpFunctions and IrrationalConstants. Tests pass locally but require https://github.com/JuliaStats/LogExpFunctions.jl/pull/28, so CI will fail :slightly_smiling_face:

devmotion commented 3 years ago

LogExpFunctions 0.3.3 was released, tests seem to pass now.

torfjelde commented 3 years ago

Awesome stuff @devmotion :heart: