TuringLang / Bijectors.jl

Implementation of normalising flows and constrained random variable transformations
https://turinglang.org/Bijectors.jl/
MIT License
200 stars 33 forks source link

Relax type constraints for `Stacked` #177

Closed torfjelde closed 3 years ago

torfjelde commented 3 years ago

IMO the current implementation of Stacked constrains the "container"-types a bit too much. There are cases where forcing the use of tuple can lead to absolutely insane compilation times, e.g. if you're stacking a bunch of univariate bijectors.

torfjelde commented 3 years ago

Tracker seems to be ruining tests here too :confused:

devmotion commented 3 years ago

It seems tests time out on master and this PR after/while running the PoissonBinomial tests :thinking:

torfjelde commented 3 years ago

It seems tests time out on master and this PR after/while running the PoissonBinomial tests thinking

Do you know if this also occurs on DistributionsAD? Seems like it shouldn't be due to something in Bijectors.

devmotion commented 3 years ago

Do you know if this also occurs on DistributionsAD?

No, the error does not occur in the tests in DistributionsAD.

Red-Portal commented 3 years ago

Hi @torfjelde, I just have a quick question, is the "performance bug" this PR is trying to solve the reason why bijectors is really really slow for models like ppca?

torfjelde commented 3 years ago

No, this PR doesn't address anything performance-related. Is there an open issue to what you're referring to?

Red-Portal commented 3 years ago

I experienced that Bijectors is almost unusably slow with models with many univariate densities such as PPCA (the likelihood is a product of univariate Poisson densities). I thought this bit

There are cases where forcing the use of tuple can lead to absolutely insane compilation times

of your original description is somewhat related to that. No?

torfjelde commented 3 years ago

Ah I see. Well, that's the compile-time though; the runtime should be better.

But this doesn't have anything to do with what you're talking about because no Stacked are constructed when running a Turing-model. Furthermore, Bijectors aren't used for the likelihood in a model; transformations are only necessary for rvs. On top of this, Poisson is a discrete distribution which also doesn't use a bijector (if it does, it's a Identity which is a no-op: https://github.com/TuringLang/Bijectors.jl/blob/4fcda273c3a317677aa1bc96434597a39b5f459a/src/transformed_distribution.jl#L42)

Btw, why do you think your issue is with Bijectors rather than something else?

If you have an MWE model and an issue, it will be much easier to help:)

Red-Portal commented 3 years ago

Yes, I was not talking specifically about the Turing model but the bijector constructed when performing VI. Even if the bijector is identity as you pointed out, the first call for forward was insanely slow, which I suspected is a result of the compile-time issue.

I haven't checked if this is still the case, so I'll come back and create a new issue if it still is. Thanks for the answers!

torfjelde commented 3 years ago

Ah, yeah that makes sense (though again, it shouldn't matter for the likelihood as no bijector is constructed for it). But this is more an artifact of the bijector definition for a Model: it's not optimal. There are ways to work around this though, so if you raise an issue I can help out :+1: