pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.48k stars 1.97k forks source link

Skewed Student-T distribution #7252

Closed fonnesbeck closed 2 months ago

fonnesbeck commented 2 months ago

Description

Added skewed Student T distribution (Jones and Faddy implementation).

Checklist

Type of change


πŸ“š Documentation preview πŸ“š: https://pymc--7252.org.readthedocs.build/en/7252/

ricardoV94 commented 2 months ago

Add in pymc-experimental first?

jessegrabowski commented 2 months ago

If something like this goes in experimental, there should be a clear roadmap to go from there back to the main repo. This is a straight-forward distribution that (I guess?) can't be easily implemented as a generative graph using only existing distributions via CustomDist. If it's in pmx, what's the criteria to move it over? It's not something like statespace or marginalmodel that proposes a huge new functionality, or something like r2d2m2 that implements a distribution that doesn't fit neatly into a typical pymc model as we present them in books/tutorials.

ricardoV94 commented 2 months ago

I don't think this one has a straightforward generative graph

ricardoV94 commented 2 months ago

I assumed something "according to x and y" was a bit experimental, hence my suggestion

fonnesbeck commented 2 months ago

The "Following Jones and Faddy 2003" was to distinguish it from the range of available skewed Student T implementations (there are 3 or 4). I chose this one because it is available in SciPy and relatively straightforward to implement.

I think it will be a useful distribution because it allows one to specify a likelihood that is both skewed and overdispersed and converges to normal when a and b are both large and equal.

fonnesbeck commented 2 months ago

Jax failure does not seem to be related to this PR

ricardoV94 commented 2 months ago

Jax failure does not seem to be related to this PR

It's not. Has been failing since the last scipy

ricardoV94 commented 2 months ago

But this one is: https://github.com/pymc-devs/pymc/actions/runs/8682226852/job/23806440256?pr=7252#step:7:616

You should try running those logp/logcdf/icdf with n_samples=-1 or whatever it is, to test all combinations locally instead of only a random subset

ricardoV94 commented 2 months ago

You're missing the tests for the RV itself. That's also where we cover alternative parametrizations. Something like these

https://github.com/pymc-devs/pymc/blob/34c2d31484143b7b5c6130927ed4efde93a27a0f/tests/distributions/test_continuous.py#L2179-L2200

codecov-commenter commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.35%. Comparing base (60a6314) to head (bcb1b5d). Report is 4 commits behind head on main.

Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252/graphs/tree.svg?width=650&height=150&src=pr&token=JFuXtOJ4Cb&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs)](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs) ```diff @@ Coverage Diff @@ ## main #7252 +/- ## ========================================== + Coverage 91.67% 92.35% +0.68% ========================================== Files 102 102 Lines 17017 17069 +52 ========================================== + Hits 15600 15764 +164 + Misses 1417 1305 -112 ``` | [Files](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs) | Coverage Ξ” | | |---|---|---| | [pymc/distributions/\_\_init\_\_.py](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252?src=pr&el=tree&filepath=pymc%2Fdistributions%2F__init__.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs#diff-cHltYy9kaXN0cmlidXRpb25zL19faW5pdF9fLnB5) | `100.00% <ΓΈ> (ΓΈ)` | | | [pymc/distributions/continuous.py](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252?src=pr&el=tree&filepath=pymc%2Fdistributions%2Fcontinuous.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs#diff-cHltYy9kaXN0cmlidXRpb25zL2NvbnRpbnVvdXMucHk=) | `97.93% <100.00%> (+0.09%)` | :arrow_up: | ... and [4 files with indirect coverage changes](https://app.codecov.io/gh/pymc-devs/pymc/pull/7252/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs)
michaelosthege commented 2 months ago

Looks like test_skewstudentt_logp is failing on main, or is at least flaky?

ricardoV94 commented 2 months ago

These tests run a random subset of 100 combinations. We should set n_samples=-1 locally and see if every combination passes (both float64 and float32), I guess it does not.

CC @fonnesbeck

michaelosthege commented 1 month ago

Running float64 this gives:

E           AssertionError: 
E           Arrays are not almost equal to 6 decimals
E           {'a': array(0.01), 'b': array(100.), 'mu': array(-2.1), 'sigma': array(0.01), 'value': array(-0.01)}
E           x and y -inf location mismatch:
E            x: array(-751.339449)
E            y: array(-inf)
ricardoV94 commented 1 month ago

Which underflows? PyMC or Scipy?

michaelosthege commented 1 month ago

Which underflows? PyMC or Scipy?

SciPy produces the -inf.

What's the fix? I have it open and can push a branch real quick (if it's simple)

ricardoV94 commented 1 month ago

Which underflows? PyMC or Scipy?

SciPy produces the -inf.

What's the fix? I have it open and can push a branch real quick (if it's simple)

Try slightly less extreme parameter domains