vincentarelbundock / marginaleffects

R package to compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and ML models. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference

https://marginaleffects.com

WIP: `pscl::hurdle` #45

Closed vincentarelbundock closed 3 years ago

vincentarelbundock commented 3 years ago

---
output: html_document
---

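```{r, echo = TRUE, message=FALSE}
library("pscl")
library("modelsummary")
library("marginaleffects")

# Hurdle model: negative binomial count equation (phd, fem),
# binomial zero-hurdle equation (ment)
data("bioChemists", package = "pscl")
mod <- hurdle(art ~ phd + fem | ment, data = bioChemists, dist = "negbin")

# Marginal effects for both the "zero" and "count" prediction types
mfx <- marginaleffects(mod, prediction_type = c("zero", "count"))

# One row per prediction type and term
modelsummary(mfx, group = type + term ~ model)
```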

vincentarelbundock commented 3 years ago

What happens when I modify only a 1st or 2nd equation coefficient in set_coef? How do the predictions change for zero and count?

vincentarelbundock commented 3 years ago

The predict() method does not expect you to do evil modifications of the coefficients. Hence:

if(missing(newdata) & type == "response") return(object$fitted.values)

So as long as you don't set newdata, the response predictions will always be the fitted values.
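
The behaviour described above can be checked with a short sketch. It assumes that the fitted `hurdle` object stores its count-equation coefficients in `mod$coefficients$count`, and that `predict()` recomputes from those coefficients once `newdata` is supplied. Editing the coefficient by hand is only a stand-in for whatever `set_coef` does internally; `mod_alt`, `p_stored`, and `p_recomputed` are illustrative names.

```r
library("pscl")

data("bioChemists", package = "pscl")
mod <- hurdle(art ~ phd + fem | ment, data = bioChemists, dist = "negbin")

# Copy the model and perturb one count-equation coefficient by hand
# (assumption: coefficients live in mod$coefficients$count)
mod_alt <- mod
mod_alt$coefficients$count["phd"] <- mod_alt$coefficients$count["phd"] + 1

# Without newdata: the stored fitted values come back, ignoring the change
p_stored <- predict(mod_alt, type = "response")

# With newdata: predictions are recomputed from the modified coefficients
p_recomputed <- predict(mod_alt, newdata = bioChemists, type = "response")

# Compare the two sets of predictions with the original fitted values
same_as_fitted <- identical(p_stored, mod$fitted.values)
changed_with_newdata <- !isTRUE(all.equal(unname(p_stored), unname(p_recomputed)))
```

If the shortcut is in play, `same_as_fitted` and `changed_with_newdata` should both be `TRUE`, which suggests always passing `newdata` explicitly when the coefficients have been altered.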

Another potential source of confusion: predict(..., type = "zero") is (regrettably) not P(Y = 0 | x) but the probability of zero inflation, i.e., the ratio of the probability for zero from the hurdle component divided by that from the count component.

vincentarelbundock commented 3 years ago

P(Y = 0 | x) is available as predict(..., type = "prob", at = 0). It's just that our labeling from 15 years ago turned out to be confusing for many users. We also plan to improve this in the "countreg" package and to provide further prediction types.
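
To make the distinction concrete, here is a hedged sketch that computes P(Y = 0 | x) in two ways and compares it with the `type = "zero"` output. It sidesteps the `at` argument and takes the first column of the full probability matrix, assuming the columns run from count 0 upward. It also assumes the default binomial (logit) zero hurdle, whose linear predictor models P(Y > 0 | x). The names `p0_predict`, `p0_manual`, and `ratio_zero` are illustrative only.

```r
library("pscl")

data("bioChemists", package = "pscl")
mod <- hurdle(art ~ phd + fem | ment, data = bioChemists, dist = "negbin")

# P(Y = 0 | x) from the full matrix of predicted probabilities
# (assumption: the first column corresponds to a count of 0)
p0_predict <- predict(mod, type = "prob")[, 1]

# Same quantity by hand: 1 - P(Y > 0 | x) from the logit hurdle equation
gamma <- mod$coefficients$zero
eta <- gamma["(Intercept)"] + gamma["ment"] * bioChemists$ment
p0_manual <- 1 - plogis(eta)

# The type = "zero" prediction is a ratio, not P(Y = 0 | x)
ratio_zero <- predict(mod, type = "zero")

# Do the two routes to P(Y = 0 | x) agree, and does the ratio differ?
agree <- isTRUE(all.equal(unname(p0_predict), unname(p0_manual)))
differs <- !isTRUE(all.equal(unname(p0_predict), unname(ratio_zero)))
```

Under these assumptions `agree` should be `TRUE` and `differs` should be `TRUE`, so the probability matrix, not `type = "zero"`, is the place to look for the probability of observing a zero.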

vincentarelbundock commented 3 years ago

Why does predict(mod, type = "prob", at = 0) break?

This would give us the P(Y=0|x) that we want.

Edit: Not handled properly at the countreg level.

vincentarelbundock commented 3 years ago

In principle, I think this works now. Still have to check validity against external software, but this issue is closed, I think. Feel free to reopen or open a new one if needed.