alexpghayes / distributions3

Probability Distributions as S3 Objects
https://alexpghayes.github.io/distributions3/

Extract fitted and predicted probability distributions from model objects #83

Closed zeileis closed 2 years ago

zeileis commented 2 years ago

In order to support/extend the vignette on Poisson GLMs and my forthcoming useR! 2022 presentation, I wrote a new generic function prodist() to extract probability distributions from model objects like lm, glm, and arima.
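For readers skimming the thread, a minimal sketch of what this looks like for a Poisson GLM. The data set (`warpbreaks`, which ships with base R) and the model formula are illustrative choices, not taken from the PR.

```r
library(distributions3)

# Illustrative Poisson GLM on a built-in data set (an assumption for this sketch)
m <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# prodist() returns the fitted Poisson distributions, one per observation
d <- prodist(m)

# The usual distributions3 generics then work on the fitted distributions
head(mean(d))            # fitted means
head(quantile(d, 0.9))   # 90% quantiles
head(cdf(d, 20))         # P(breaks <= 20) for each observation
```

The means of the extracted distributions should coincide with `fitted(m)`, since the distribution objects are built from the fitted values of the model.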

The idea is that authors of packages like betareg, pscl, and maybe even mgcv or gamlss can write prodist() methods for their model objects using the distribution objects from distributions3. This facilitates making and assessing fitted probability distributions in a unified way.
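As a rough sketch of what such a method could look like: the class `toy_poisson` and its `lambda` element below are hypothetical, made up for this illustration only. A real package would build the distribution from whatever its model object stores.

```r
library(distributions3)

# Hypothetical toy "model" object that only stores fitted Poisson means
fit <- structure(list(lambda = c(1.5, 4, 10)), class = "toy_poisson")

# S3 method for the prodist() generic: map the model object to a
# distributions3 object (here a vector of three Poisson distributions)
prodist.toy_poisson <- function(object, ...) {
  Poisson(lambda = object$lambda)
}

d <- prodist(fit)
mean(d)      # means of the three distributions
cdf(d, 3)    # P(Y <= 3) under each distribution
```

Once the method exists, everything defined for distribution objects (`pdf()`, `cdf()`, `quantile()`, `random()`, and so on) is available for the model without further work by the package author.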

The accompanying manual page illustrates usage of the function. Tests are also included. The NEWS.md was updated with descriptions of this addition but also the other latest additions by Moritz and myself. (The DESCRIPTION still needs to be updated before the next CRAN release.)

alexpghayes commented 2 years ago

This is a very cool idea! I just took a look through and my one concern at the moment is the name. I had to read the implementation of prodist() before I understood what it was doing. I think calling the generic something like extract_distributions() or extract_estimated_distributions() would be more user friendly. Are you open to a name like this?

Logistical note: I will be away from computers from the 16th through the 21st. I'm going to give you write access so you can make last minute changes before your UseR talk without me holding you up.

zeileis commented 2 years ago

Alex, thank you so much for this and sorry for the late response. We're also going away in a few hours but I'll be online sporadically.

Thanks!

alexpghayes commented 2 years ago

> Function name: I should have explained that. We have a function procast() in topmodels that does the probabilistic forecast (in terms of probabilities or quantiles etc.), where the name is on the same level as predict() and forecast() (from the package of the same name). And prodist() would be a sibling function to procast() that provides the glue between the model object and the distribution from which the probabilities or quantiles etc. can be computed. I thought that was a nice combination. But as it's in a different package it's no problem to have a different name as well. So if you really don't like prodist(), that's ok. Having said that, I'm not super fond of a name like extract_*() because we don't have extract_mean() or extract_quantile() or even extract_quantiles() for the distributions either.
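To make the "glue" role concrete, here is a small sketch using a linear model. The data set (`cars`) and the new speeds are illustrative, and passing `newdata` through to the underlying prediction is an assumption about how the `lm` method handles extra arguments.

```r
library(distributions3)

# Illustrative linear model on a built-in data set
m <- lm(dist ~ speed, data = cars)

# Hypothetical new observations to predict for
nd <- data.frame(speed = c(10, 15, 20))

# Step 1: model object -> predicted normal distributions (the "glue")
d <- prodist(m, newdata = nd)

# Step 2: probabilities and quantiles are computed from the distributions
cdf(d, 50)          # P(dist <= 50) at each new speed
quantile(d, 0.9)    # 90% quantile at each new speed
```

The means of the predicted distributions should match `predict(m, newdata = nd)`, so the distribution objects carry the same point predictions plus the uncertainty around them.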

For now let's use prodist() but I think eventually I will make this an alias for another name. Have been thinking on this for a bit and have some thoughts, will share later today when I'm at the computer for longer.

> Logistics: I think that apart from the PR we are currently discussing, I would only update the DESCRIPTION (version 0.2.0 maybe? and me as aut?) and then make a new CRAN release. My presentation at useR! is on the 23rd, though, so we still have some time.

Yes definitely add yourself as author!

Am back in town today, will merge in a couple hours after you add yourself as an author, and I can submit to CRAN shortly after that!

zeileis commented 2 years ago

Thanks, Alex, this would be great! Adding another alias for prodist() later is also a good idea. Let me know if I can help with anything for the CRAN release.