Closed honzabim closed 4 years ago
IPMeasures is surely relevant, I am just not sure how much it is maintained and used. I do not mind having external dependencies, but they must be alive. Should we integrate IPMeasures into GenerativeModels or create an ecosystem with multiple interacting packages?
I am in favour of using multiple, small (and hopefully transparent) packages. I see no reason why to not reuse the stuff that is already implemented in IPMeasures. @honzabim can you get us write access to that?
You have to ask pevnak for access. That is what I meant by an ecosystem. We should rely on it only if we can keep it up to date, e.g. by a write access.
I think we could use the Distances
package - it is quite well maintained.
OK, I agree with using distances but then we still need MMD and its kernels... do you think there is a package for that as well? Or should we ask TOmas to get write access to IPmeasures and use the MMD from there + Distances package? (I am pretty sure Tomas will give us write access without any problem)
Sry, missclicks...
I am strongly for using only registered and maintained packages. If there is none for kernels (I have not found any), then I think we should implement it ourselves here.
Pull request to Distances?
You want to add MMD to Distances?
On Fri, 8 Nov 2019 at 14:32, Vaclav Smidl notifications@github.com wrote:
Pull request to Distances?
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/nmheim/GenerativeModels.jl/issues/50?email_source=notifications&email_token=AEEZ6VCTDRLSHNVNHYB7UUDQSVS5HA5CNFSM4JKGZNVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDR4MEY#issuecomment-551798291, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEEZ6VAOOQ3LAEU5VT2ETFDQSVS5HANCNFSM4JKGZNVA .
Why not? It is a distance. Just do not know how much they welcome contributions...
Yeah, that's the potential issue with a package that we cannot control. Since we want to keep kernels in our package anyway, I think it would be nice to take control over the IPmeasures package and have it all there, including MMD but change it to use Distances instead of that pairwise L2, Cos etc. That keeps Generative Models fairly clean and enables us to change it whenever we want while still reusing as much as possible.
On Fri, 8 Nov 2019 at 15:37, Vaclav Smidl notifications@github.com wrote:
Why not? It is a distance. Just do not know how much they welcome contributions...
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/nmheim/GenerativeModels.jl/issues/50?email_source=notifications&email_token=AEEZ6VB4K7IRV75ORPMPH7TQSV2Q3A5CNFSM4JKGZNVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDSJNLI#issuecomment-551851693, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEEZ6VA7VZRGE4BN7UMRITTQSV2Q3ANCNFSM4JKGZNVA .
Has anyone used Distances.jl
before?
Their implementation of KLD does not look like what we want: https://github.com/JuliaStats/Distances.jl/blob/7f3a28c0d1372e3b3edbcbc28f00ba5645e1bbdb/src/metrics.jl#L366
Well, that KLD is actually correct as far as I can tell. Of course, it is better to use the analytical solution where known (like 2 Gaussians) but theoretically, this would work with VM-F the same way MMD does. Take large enough samples and go. However, that isn't the idea here I guess... it is ok to have our KLD for gaussians...
I thought we could use Distances.jl either for the distance in kernels in mmd, or add the whole mmd in there but I am not sure how responsive they are. It seems like one of the slower communities.
On Fri, 15 Nov 2019 at 00:48, Niklas Heim notifications@github.com wrote:
Has anyone used Distances.jl before? Their implementation of KLD does not look like what we want: https://github.com/JuliaStats/Distances.jl/blob/7f3a28c0d1372e3b3edbcbc28f00ba5645e1bbdb/src/metrics.jl#L366
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/nmheim/GenerativeModels.jl/issues/50?email_source=notifications&email_token=AEEZ6VEH347CNKK5FT7ZMTLQTXPUJA5CNFSM4JKGZNVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEDXY4A#issuecomment-554138736, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEEZ6VFZTKX274DKJOUJZKTQTXPUJANCNFSM4JKGZNVA .
Yes, there is an issue from three years ago that asked to implement wasserstein, but nothing cam out of it and then earth movers distance was implemented in a seperate package. What do you think about implementing MMD and KLD in IPMeasures.jl
such that they conform to Distances.jl
?
If we can take full control of it like Honza has suggested, then I would do it.
yes, we will have full control, just got write access to IPMeasures :)
Yep, totally agreed! Let's use IPMeasures as a base... I think the MMD is quite well done in there so no need to rewrite that. We can just use the pairwise distances from Distances.jl. KLD, sure, even though that has much lower priority now I think. I think that the best would be to actually have all this as part of Distributions.jl because that is what we really want. But one can't differentiate through those as far as I know so...
On Fri, 15 Nov 2019 at 11:12, Niklas Heim notifications@github.com wrote:
yes, we will have full control, just got write access to IPMeasures :)
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/nmheim/GenerativeModels.jl/issues/50?email_source=notifications&email_token=AEEZ6VCXVFIMGETUNA37MILQTZYZRA5CNFSM4JKGZNVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEE7CCA#issuecomment-554299656, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEEZ6VGD64ERIAX7AGEGZYDQTZYZRANCNFSM4JKGZNVA .
Implemeted in IPMeasures.jl
The distance in our kernels is fixed such as: `function rbf(x,y,σ) d = x-y exp.(-(sum(d . d,dims=1)/(2σ))) end
rbf(x,y,σ) = exp.(-(sum((x-y).^2,dims=1)/(2*σ)))`
and
function imq(x,y,σ) d = x-y σ./(σ.+sum((d .* d),dims=1)) end
I think we should add different distances as well as that might be quite handy (and e.g. for S-VAE only cosine distance makes sense). We can get inspired here. It is a packaged made by Tomas. Shouldn't we actually just use that instead of retyping that? I think it would also fix #42