stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction
Apache License 2.0

Feature/add t distn #190

Closed angeldroth closed 3 years ago

angeldroth commented 3 years ago

Closes #189. This is not a fully finished implementation because I haven't been able to calculate the Fisher information for the Student's t w.r.t. log(sigma) rather than just sigma. The degrees of freedom were fixed at 3 because of issues with local minima.

This Stack Exchange post gives a general univariate overview of the normal Fisher derivation, but it's somewhat incomplete. If anyone can point me to resources for the full Fisher derivation, I will try to do it w.r.t. log(sigma).

alejandroschuler commented 3 years ago

This is great, thank you for the contribution! I usually use Wolfram Alpha to crunch the derivatives and integrals and whatnot; lmk if that helps. Just plug in e^log_sigma instead of sigma wherever the formula calls for it. For the metric you'll have to first find the formula for the derivative of the log likelihood (w.r.t. log_sigma), square it, multiply by the probability density function, and integrate w.r.t. the random variable (Y). If Wolfram Alpha craps out on the computation you can learn a little bit of syntax and try Wolfram Cloud.
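For reference, that recipe written out, with s = log(sigma) as the internal parameter (it is the Fisher information entry for s, which is what the metric needs):

$$
\mathcal{I}_{ss}
= \mathbb{E}_{Y}\!\left[\left(\frac{\partial \log p(Y \mid \mu, s)}{\partial s}\right)^{2}\right]
= \int_{-\infty}^{\infty} \left(\frac{\partial \log p(y \mid \mu, s)}{\partial s}\right)^{2} p(y \mid \mu, s)\, dy
$$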

angeldroth commented 3 years ago

That sounds awesome, I'll take a look at doing that now.

angeldroth commented 3 years ago

dnll_dlog(sigma) (attached screenshot): proof that the by-hand calculation of the derivative is correct, where sigma = e^s in this case.
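One low-effort way to sanity-check a by-hand derivative like this is to let sympy differentiate the same expression. A minimal sketch (not code from this PR), assuming the usual location-scale Student's t negative log-likelihood with the scale parametrized as sigma = e^s; the `hand` expression is the textbook form of the derivative, included purely as the comparison target:

```python
# Sketch: verify d(NLL)/d log(sigma) for the Student's t with sympy.
import sympy as sp

y, mu, s = sp.symbols("y mu s", real=True)
nu = sp.symbols("nu", positive=True)
sigma = sp.exp(s)  # scale parametrized as sigma = e^s

# Terms of the negative log-likelihood that depend on s (constants dropped).
nll = sp.log(sigma) + (nu + 1) / 2 * sp.log(1 + ((y - mu) / sigma) ** 2 / nu)

# Textbook form of d(NLL)/ds, used only as the comparison target.
hand = 1 - (nu + 1) * (y - mu) ** 2 / (nu * sigma**2 + (y - mu) ** 2)

assert sp.simplify(sp.diff(nll, s) - hand) == 0
```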

angeldroth commented 3 years ago

The definite integrals seem to time out on both Wolfram Alpha and Wolfram Cloud :/ [screenshot attached]

alejandroschuler commented 3 years ago

@angeldroth I'm going to DL a trial version of wolfram desktop and see if I can crunch the integral. Will report back shortly.

alejandroschuler commented 3 years ago

[Screenshot from Wolfram Desktop, 2020-10-15, showing the evaluated integral]

looks like success! I recommend downloading the trial (takes a while to download) and seeing if you can do these integrals using that. Otherwise if you want to just type them out and paste them here I can do it in my trial.

angeldroth commented 3 years ago

Awesome! Do the definite integrals (from -inf to inf) also work? I think I was able to get the indefinite ones working using online wolfram alpha too.

Let me download the trial and take a look myself; I still have the equations ready to copy-paste directly.

ryan-wolbeck commented 3 years ago

@angeldroth with the merge of https://github.com/stanfordmlgroup/ngboost/pull/192 can you merge those changes from master into this PR and bump the version.py to version = "0.3.8dev"? When this is approved I'll bump the base version on pypi

angeldroth commented 3 years ago

@ryan-wolbeck done

angeldroth commented 3 years ago

I asked for help on Stack Exchange here.

angeldroth commented 3 years ago

Thanks to @mzjp2 for pointing out the Reparametrization section of the Fisher information article on Wikipedia. It makes the exact t Fisher super easy to derive: just multiply the derivation from here by sigma^2.

I would really suggest linking to this section in the developer guide, because it would probably make future contributions much easier :)
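For reference, the reparametrization rule being invoked, specialized to eta = log(sigma):

$$
\mathcal{I}_{\eta}(\eta) = \mathcal{I}_{\theta}(\theta)\left(\frac{d\theta}{d\eta}\right)^{2},
\qquad
\mathcal{I}_{\log\sigma} = \mathcal{I}_{\sigma}\left(\frac{d\sigma}{d\log\sigma}\right)^{2} = \sigma^{2}\,\mathcal{I}_{\sigma}
$$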

angeldroth commented 3 years ago

[screenshot of the derivation]

Here is my working for taking the derivative of the log likelihood w.r.t. log(deg freedom). The phi term is the digamma function. This is implemented in the TLogScore class to allow learning of the degrees-of-freedom parameter. Should I break this out into another PR, or is it cool to leave it in this one?
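Since the screenshot isn't reproduced here, the standard form of that derivative for the location-scale Student's t is included for reference (a reconstruction, so it should match the working above only up to algebra), with z = (y - mu)/sigma and psi the digamma function:

$$
\frac{\partial \log p}{\partial \log\nu}
= \nu\left[
\frac{1}{2}\psi\!\left(\frac{\nu+1}{2}\right)
- \frac{1}{2}\psi\!\left(\frac{\nu}{2}\right)
- \frac{1}{2\nu}
- \frac{1}{2}\log\!\left(1 + \frac{z^{2}}{\nu}\right)
+ \frac{(\nu+1)\,z^{2}}{2\nu\,(\nu + z^{2})}
\right]
$$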

alejandroschuler commented 3 years ago

> [screenshot of the derivation]
>
> Here is my working for taking the derivative of the log likelihood w.r.t. log(deg freedom). The phi term is the digamma function. This is implemented in the TLogScore class to allow learning of the degrees-of-freedom parameter. Should I break this out into another PR, or is it cool to leave it in this one?

Fine by me if you leave it in this PR. lmk when you're ready for a review; looking good as-is.

alejandroschuler commented 3 years ago

@angeldroth I see you've also added a few versions of the T with certain parameters fixed. If you're feeling ambitious it might be interesting to write a function that can fix parameters of arbitrary distributions to produce new ones (this would be a different PR). So you could write something like:

```python
from ngboost.distns import Normal
from ngboost.distns.utils import fix  # new

NormalFixedVar = fix(Normal, scale=1)
```

Basically you'd have to write dynamic wrappers for all the methods of the distribution as well as for the corresponding scores. The dynamically-generated scores (e.g. what would be NormalFixedVarLogScore) will only exist inside of NormalFixedVar.scores.

mzjp2 commented 3 years ago

> @angeldroth I see you've also added a few versions of the T with certain parameters fixed. If you're feeling ambitious it might be interesting to write a function that can fix parameters of arbitrary distributions to produce new ones (this would be a different PR). So you could write something like:
>
> ```python
> from ngboost.distns import Normal
> from ngboost.distns.utils import fix  # new
>
> NormalFixedVar = fix(Normal, scale=1)
> ```
>
> Basically you'd have to write dynamic wrappers for all the methods of the distribution as well as for the corresponding scores. The dynamically-generated scores (e.g. what would be NormalFixedVarLogScore) will only exist inside of NormalFixedVar.scores.

Oh this is quite cool - very similar to the way one would use functools.partial :)
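For illustration only (plain standard-library Python, not the proposed ngboost utility): functools.partial does the analogous thing for functions, pinning some arguments to produce a new callable.

```python
import math
from functools import partial

# An ordinary function of several parameters (a normal density, just as an example).
def normal_density(x, loc, scale):
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2 * math.pi))

# partial() pins `scale`, producing a new callable; the proposed fix() utility
# would apply the same idea to a distribution class and its score classes.
fixed_scale_density = partial(normal_density, scale=1.0)

print(fixed_scale_density(0.0, loc=0.0))  # ~0.3989
```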

angeldroth commented 3 years ago

Good idea, let me take a look this morning. I might opt to go the factory route if it turns out to be neater.

ryan-wolbeck commented 3 years ago

@angeldroth just a heads up that there have been a couple of changes to master that you'll want to merge into this PR

angeldroth commented 3 years ago

Hey sorry guys, been a very busy past few weeks. Will get this sorted today

angeldroth commented 3 years ago

@alejandroschuler synced, merge at will. I might open another PR in a few days doing the factory stuff we talked about - may be neater

ryan-wolbeck commented 3 years ago

I re-ran the tests and got the following

```
==================== short test summary info ====================
FAILED test_distns.py::test_dists_runs_on_examples[T-CRPScore-learner14] - ValueError: The scoring rule CRPScore is not implemented for the T distribution.
FAILED test_distns.py::test_dists_runs_on_examples[T-CRPScore-learner15] - ValueError: The scoring rule CRPScore is not implemented for the T distribution.
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDf-LogScore-learner16] - IndexError: index 1 is out of bounds for axis 1 with size 1
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDf-LogScore-learner17] - IndexError: index 1 is out of bounds for axis 1 with size 1
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDf-CRPScore-learner18] - ValueError: The scoring rule CRPScore is not implemented for the TFixedDf distribution.
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDf-CRPScore-learner19] - ValueError: The scoring rule CRPScore is not implemented for the TFixedDf distribution.
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDfFixedVar-CRPScore-learner22] - ValueError: The scoring rule CRPScore is not implemented for the TFixedDfFixedVar distribution.
FAILED test_distns.py::test_dists_runs_on_examples[TFixedDfFixedVar-CRPScore-learner23] - ValueError: The scoring rule CRPScore is not implemented for the TFixedDfFixedVar distribution.
FAILED test_distns.py::test_dists_runs_on_examples[Cauchy-LogScore-learner24] - IndexError: index 1 is out of bounds for axis 1 with size 1
FAILED test_distns.py::test_dists_runs_on_examples[Cauchy-LogScore-learner25] - IndexError: index 1 is out of bounds for axis 1 with size 1
FAILED test_distns.py::test_dists_runs_on_examples[Cauchy-CRPScore-learner26] - ValueError: The scoring rule CRPScore is not implemented for the Cauchy distribution.
FAILED test_distns.py::test_dists_runs_on_examples[Cauchy-CRPScore-learner27] - ValueError: The scoring rule CRPScore is not implemented for the Cauchy distribution.
========== 12 failed, 34 passed, 21030 warnings in 221.19s (0:03:41) ==========
(ngboost) ➜ tests git:(feature/add_t_distn)
```

I have not dug into why it's happening, but I'll review again in the morning.

Running tests: `pytest --slow -v`

mzjp2 commented 3 years ago

Ah yes, I suppose this is intended, because only the LogScore has been implemented for the distributions added in this PR, but the parametrized test functions test both LogScore and CRPScore because of the product_list way we do parametrisations in test_distns.py.
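If we wanted the suite to treat those combinations as expected rather than red, one option would be to filter or mark them at parametrization time. A hypothetical sketch, not what's in test_distns.py today; it assumes each distribution class lists its implemented score classes in a .scores attribute, as mentioned earlier in this thread:

```python
# Hypothetical sketch: skip (distribution, score) pairs where the score
# isn't implemented, instead of letting them fail the parametrized test.
from itertools import product

import pytest


def score_aware_params(dists, scores, learners):
    for dist, score, learner in product(dists, scores, learners):
        if score not in dist.scores:
            yield pytest.param(
                dist,
                score,
                learner,
                marks=pytest.mark.skip(
                    reason=f"{score.__name__} not implemented for {dist.__name__}"
                ),
            )
        else:
            yield dist, score, learner
```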

angeldroth commented 3 years ago

@ryan-wolbeck @alejandroschuler I've just spent the past half day getting CircleCI + Codecov working in a personal repo. If that's a direction you would like to take the testing framework in, I wouldn't mind copy-pasting the config I sorted out into another PR.

ryan-wolbeck commented 3 years ago

@angeldroth I think that would be good, feel free to submit the PR for that! Side note, I pushed a formatting change and I'm re-running tests on my side. I'll likely merge this in when it's done

ryan-wolbeck commented 3 years ago

@angeldroth @mzjp2 Thanks for the work on this, great job