tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Plans for a Hypergeometric distribution implementation? #1473

Open chrism0dwk opened 2 years ago

chrism0dwk commented 2 years ago

Hi all, just wondering if there are any plans to introduce a [Hypergeometric distribution](https://en.wikipedia.org/wiki/Hypergeometric_distribution) into TFP? Seems a slight hole in the otherwise comprehensive arsenal. I'd be happy to co-create...

Chris

brianwa84 commented 2 years ago

Are you aware of any reasonable sampling approaches? Since the Hypergeometric is a log-concave discrete distribution, we could fall back to the rejection sampler in https://github.com/tensorflow/probability/blob/main/tensorflow_probability/python/distributions/discrete_rejection_sampling.py#L40.
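
For context, a rejection sampler for a log-concave discrete target typically needs a pointwise log-PMF. The snippet below is a minimal plain-Python sketch of that log-PMF, not TFP code: the function names `log_binom` and `hypergeometric_log_prob` and the parameter names are hypothetical, and it makes no claim about the interface that `discrete_rejection_sampling.py` expects.

```python
import math


def log_binom(n, k):
    """Log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeometric_log_prob(k, population_size, num_successes, num_draws):
    """Log-probability of k successes in num_draws draws without replacement.

    Parameter names are illustrative, not taken from TFP.
    """
    lowest = max(0, num_draws + num_successes - population_size)
    highest = min(num_successes, num_draws)
    if k < lowest or k > highest:
        return -math.inf  # outside the support
    return (
        log_binom(num_successes, k)
        + log_binom(population_size - num_successes, num_draws - k)
        - log_binom(population_size, num_draws)
    )
```

On the support, consecutive values satisfy `2 * lp[k] >= lp[k - 1] + lp[k + 1]`, which is the log-concavity property a sampler of this kind relies on.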

chrism0dwk commented 2 years ago

Here's a very quick and dirty method, the memory requirement of which is linear wrt the total population size: https://colab.research.google.com/drive/1nhzVpVIQID1Z36U9zYteWkQr7QMQEJhV?usp=sharing
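
The Colab itself isn't reproduced here. As a rough illustration of an urn-style approach with O(N) memory, here is a plain-Python sketch (not TensorFlow, and not necessarily what the notebook does): assign a random key to every item in the population, keep the `num_draws` items with the largest keys (the analogue of a top-k selection), and count how many of them are successes. The function and parameter names are hypothetical.

```python
import random


def sample_hypergeometric(population_size, num_successes, num_draws, rng=random):
    """Draw one hypergeometric variate by sampling an explicit urn.

    Memory is linear in population_size. Names are illustrative only.
    """
    if not 0 <= num_successes <= population_size:
        raise ValueError("num_successes must lie in [0, population_size]")
    if not 0 <= num_draws <= population_size:
        raise ValueError("num_draws must lie in [0, population_size]")
    # One uniform key per item; items 0..num_successes-1 count as successes.
    keys = [rng.random() for _ in range(population_size)]
    # The indices of the num_draws largest keys form a uniform random subset.
    order = sorted(range(population_size), key=keys.__getitem__, reverse=True)
    drawn = order[:num_draws]
    return sum(1 for index in drawn if index < num_successes)
```

Passing `rng=random.Random(seed)` gives reproducible draws. The cost per sample grows with the population size, so this is only practical for small N.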

chrism0dwk commented 2 years ago

Hmm, so I concede that the call to tf.math.top_k isn't vectorizable in the above example. Any ideas?

Oh, and Happy New Year to everyone! Let 2022 be the best so far.

chrism0dwk commented 2 years ago

...and it can't be compiled with XLA if N changes inside the compiled block.

chrism0dwk commented 2 years ago

@brianwa84 investigating discrete_rejection_sampling.py (thanks for the tip!) but for some reason this file isn't being included in the tensorflow_probability==0.15.0 PyPI build. Is that intended?

srvasude commented 2 years ago

FWIW here's a sampler based on Stadlober's paper: https://colab.research.google.com/drive/1uARQ9ojLmlMGRGOYD1cqOe6raLv9XsHx?usp=sharing

The variable names need a bit of cleaning up, but hopefully it can just be plugged into your Hypergeometric distribution.

srvasude commented 2 years ago

Made this a GitHub gist since there seem to be some permission problems with the Colab link: https://gist.github.com/srvasude/cb4a457a5acbb57614be7b970e62cda1#file-hypergeometric-sampling-ipynb