stan-dev / loo

loo R package for approximate leave-one-out cross-validation (LOO-CV) and Pareto smoothed importance sampling (PSIS)
https://mc-stan.org/loo
Other
150 stars 34 forks source link

`loo_R2` can fail on examples where it worked before #263

Closed bgoodri closed 6 months ago

bgoodri commented 6 months ago

The newest loo package seems to be less robust than previous versions when calculating loo_R2. From the NES example,

data("nes", package = "rosdata")
ok <- nes$year==1992 & !is.na(nes$rvote) & !is.na(nes$dvote) & (nes$rvote==1 | nes$dvote==1)
library(rstanarm)
fit_1 <- stan_glm(rvote ~ income, family=binomial(link="logit"), data=nes92, refresh = 0)
loo_R2(fit_1)

This fails with the warning message "Input contains infinite or NA values, Pareto smoothing not performed." and error message "Error: $ operator is invalid for atomic vectors". Neither of these messages is very helpful. The warning results from posterior:::should_return_NA but it is due to the input being within tol of a constant rather than infinite or NA values. The error comes from loo:::.E_loo_khat_i due to the fact that posterior::pareto_khat returns the scalar NA rather than a list in this situation, so trying to select the $khat element fails.

This all worked with loo 2.6.0.

jgabry commented 6 months ago

That's unfortunate. @avehtari this seems to be a consequence of switching to using some of the functions from posterior. I guess we missed this difference and didn't have a test to detect it. I think posterior::pareto_khat should always return a list instead of having different behavior depending on if the result is NA (so it should just return list(NA)), but that would now break backwards compatibility in posterior if we change that. I guess we can just work around this in loo. I'll submit a PR.

avehtari commented 6 months ago

Thanks for the report!

The newest loo package seems to be less robust than previous versions

I guess it is as robust as before, as making it more robust for other packages was countered by this one. With @jgabry's fix it hopefully now is then more robust for all use cases.

n-kall commented 6 months ago

I think posterior::pareto_khat should always return a list instead of having different behavior depending on if the result is NA (so it should just return list(NA)), but that would now break backwards compatibility in posterior if we change that.

I think this is a sensible change, and I don't think there are many packages yet using posterior::pareto_khat to prevent changing. I will make an issue in posterior repo

jgabry commented 5 months ago

I think posterior::pareto_khat should always return a list instead of having different behavior depending on if the result is NA (so it should just return list(NA)), but that would now break backwards compatibility in posterior if we change that.

I think this is a sensible change, and I don't think there are many packages yet using posterior::pareto_khat to prevent changing. I will make an issue in posterior repo

Thanks. We’ll just have to remember to change loo when you do that, since that may break the workaround we put in loo. But it will be a simple change in loo when that happens.