Open fweber144 opened 7 months ago
@AlejandroCatalina: Is the line `looR2[looR2 < -1] <- -1` supposed to read `looR2[looR2 < 0] <- 0`?
@avehtari: The SE formula provided in https://github.com/stan-dev/loo/pull/205#issuecomment-1316683962 refers to LOO-$R^2$. I guess it cannot be applied directly to K-fold CV, no CV (i.e., test dataset = training dataset), or a hold-out test dataset. Do you know of similar formulas for those cases?
> Is line `looR2[looR2 < -1] <- -1` supposed to read `looR2[looR2 < 0] <- 0`?

The first one is intentional.

> The SE formula provided in https://github.com/stan-dev/loo/pull/205#issuecomment-1316683962 refers to LOO-$R^2$. I guess it cannot be applied directly to K-fold CV

It can be used with K-fold CV and pointwise evaluation.

> no CV (i.e., test dataset = training dataset)

We have used Bayesian $R^2$ for that, as it has some benefits in that case, but the same formula could be used, too.

> or a hold-out test dataset

It can be used there as well.
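To make the difference between the two truncation lines concrete, here is a minimal base-R sketch. The toy data and the extra values `-0.5` and `0.4` are hypothetical, and the sketch only illustrates what each line does to negative values; it does not claim to state the reason for the choice made in the package.

```r
# Hypothetical toy data: predictions perfectly anti-correlated with y.
y <- c(1, 2, 3, 4, 5)
ypred <- rev(y)
e <- ypred - y

# Variance-based R^2; it is not bounded below by 0.
r2 <- 1 - var(e) / var(y)  # -3

# A few illustrative R^2 values (the last two are made up).
r2_values <- c(r2, -0.5, 0.4)

# Existing line: floor at -1, so mildly negative values survive.
r2_floor_m1 <- r2_values
r2_floor_m1[r2_floor_m1 < -1] <- -1  # -1.0 -0.5 0.4

# Alternative asked about: floor at 0, so every negative value becomes 0.
r2_floor_0 <- r2_values
r2_floor_0[r2_floor_0 < 0] <- 0  # 0.0 0.0 0.4
```

With a floor of -1, a value such as -0.5 still signals predictions worse than the mean of `y`; with a floor of 0, that information is lost.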
As suggested by @avehtari, it would be good to have $R^2$ as a performance statistic in projpred. This could be called `stats = "R2"` (and `stat = "R2"` for `suggest_size()`), for example. According to @avehtari, we should go for LOO-$R^2$. There is also related code at https://github.com/stan-dev/projpred/blob/bec6258478ce9a04e92d50a0aa6628c23878dab5/R/summary_funs.R#L170-L187. (Note that `* (n / (n - 1))` can be omitted because it cancels out.) In those lines, `bayesboot::rudirichlet()` is used. According to @avehtari, the SE could also be calculated without a Dirichlet approach, using the formula from https://github.com/stan-dev/loo/pull/205#issuecomment-1316683962.
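A minimal sketch of the Dirichlet-weighted approach, to show why the `n / (n - 1)` factor cancels. Everything here is an assumption for illustration: `y` and `eloo` are simulated stand-ins for the observed responses and the LOO residuals, and the weights come from normalized exponential draws in base R rather than from `bayesboot::rudirichlet()` itself. This is not the code from the linked lines.

```r
# Toy inputs (hypothetical): responses and stand-in LOO residuals.
set.seed(1)
n <- 50
y <- rnorm(n)
eloo <- rnorm(n, sd = 0.5)

# Uniform Dirichlet weights, one row per draw, via normalized
# exponential draws (a base-R stand-in for bayesboot::rudirichlet()).
n_draws <- 1000
w <- matrix(rexp(n_draws * n), nrow = n_draws, ncol = n)
w <- w / rowSums(w)

# Weighted variance per draw of the weights: E[x^2] - E[x]^2.
wvar <- function(x) drop(w %*% x^2) - drop(w %*% x)^2

# With and without the n / (n - 1) correction on both variances.
corr <- n / (n - 1)
r2_with <- 1 - (wvar(eloo) * corr) / (wvar(y) * corr)
r2_without <- 1 - wvar(eloo) / wvar(y)

# The factor appears in numerator and denominator, so it cancels
# (up to floating-point rounding).
max_diff <- max(abs(r2_with - r2_without))

# Truncate at -1, as in the line discussed in this thread.
loo_r2 <- r2_without
loo_r2[loo_r2 < -1] <- -1

# The spread of the draws is one possible uncertainty summary.
r2_mean <- mean(loo_r2)
r2_sd <- sd(loo_r2)
```

The closed-form SE from the linked comment would replace the last step; it is not reproduced here.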