DattaHub / horseshoe-review

Review paper for horseshoe prior and Lasso

AUC curve for Logistic regression (HS / Lasso) #5

Open DattaHub opened 6 years ago

DattaHub commented 6 years ago

We need to compare HS and Lasso for logistic classification. It might be worthwhile to compare both run-time and accuracy, since Lasso runs faster. @brandonwillard What would be a good simulation set-up? Or should we simply use real-world data, e.g. the Leukemia or Pima Indian Diabetes data?
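For reference, one possible starting point for a simulation set-up is sketched below. Nothing in it has been agreed on here: the sample size, dimension, number of signals, and signal strength are all placeholder assumptions, and the helper names `simulate_sparse_logistic` and `auc` are made up for illustration. It draws data from a sparse logistic model and computes the AUC of the true linear predictor, which gives a rough ceiling against which fitted HS and Lasso models could be compared.

```python
import math
import random


def simulate_sparse_logistic(n=200, p=50, n_signal=5, signal=2.0, seed=0):
    """Draw (X, y, beta) from a sparse logistic model (all settings are illustrative)."""
    rng = random.Random(seed)
    # Only the first n_signal coefficients are non-zero.
    beta = [signal if j < n_signal else 0.0 for j in range(p)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = []
    for row in X:
        eta = sum(b * x for b, x in zip(beta, row))
        prob = 1.0 / (1.0 + math.exp(-eta))
        y.append(1 if rng.random() < prob else 0)
    return X, y, beta


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


X, y, beta = simulate_sparse_logistic()
# Scores from the true coefficients: an upper reference point, not a fitted model.
oracle_scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
oracle_auc = auc(oracle_scores, y)
```

Run-time could be compared by wrapping each fitting routine in `time.perf_counter()` calls on the same simulated data.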

brandonwillard commented 6 years ago

Well, I imagine we would have to run simulations for testing at least, but I'm not sure if any such simulations would be convincing/publication-worthy. Outside of a simulation that clearly demonstrates particular distinctions between Lasso and Horseshoe in (hopefully favorable) AUC properties -- and I can't think of one right away -- the real-world data sets are always worth a try.

Otherwise, is there any justifiable way to convert our previously published examples (e.g. prostate data) into proper [logit-like] classification problems and expect to see the same, or similar, advantages/distinctions?

brandonwillard commented 6 years ago

On a somewhat related note, I'm not too keen on AUC/ROC-type measures. Are we required to use them?

DattaHub commented 6 years ago

Not really. I think the goal is to demonstrate how Horseshoe performs for a non-Gaussian likelihood. Logistic regression would be a good example. We can also look at mean log predictive density, as in Piironen & Vehtari (2017): https://projecteuclid.org/download/pdfview_1/euclid.ejs/1513306866
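To make the metric concrete, here is a minimal sketch of mean log predictive density for binary outcomes: the average, over held-out observations, of the log probability the model assigned to the label that actually occurred. The function name, the clipping constant `eps`, and the toy numbers are assumptions for illustration and are not taken from the paper.

```python
import math


def mean_log_predictive_density(probs, labels, eps=1e-12):
    """Average log predictive probability of the observed binary labels.

    probs[i] is the predicted P(y_i = 1); labels[i] is the observed 0/1 outcome.
    """
    if len(probs) != len(labels) or not labels:
        raise ValueError("probs and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip so the log stays finite
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)


# Toy held-out predictions: a fairly sharp model versus a coin flip.
mlpd_sharp = mean_log_predictive_density([0.9, 0.2, 0.8], [1, 0, 1])
mlpd_coin = mean_log_predictive_density([0.5, 0.5, 0.5], [1, 0, 1])
```

Higher (closer to zero) is better; the coin-flip baseline sits at log(0.5) regardless of the labels.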

brandonwillard commented 6 years ago

Yeah, I'm much more inclined to work with posterior predictive distributions directly, when possible, but I'm not sure how best to bring Lasso into that picture.
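One possible way to put the two on the same footing, offered only as a sketch: treat the Lasso point estimate as a single "draw" and compute a plug-in predictive probability, while averaging over posterior draws for the Horseshoe. Whether a plug-in prediction is a fair stand-in for a predictive distribution is exactly the open question, so this is an assumption, not a recommendation. The coefficient values below are made up, and `predictive_prob` is a hypothetical helper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predictive_prob(x, coef_draws):
    """Average P(y = 1 | x, beta) over coefficient draws.

    With a single draw this reduces to a plug-in prediction.
    """
    probs = [sigmoid(sum(b * v for b, v in zip(beta, x))) for beta in coef_draws]
    return sum(probs) / len(probs)


x_new = [1.0, -0.5]

# Made-up posterior draws standing in for Horseshoe MCMC output.
hs_draws = [[1.2, 0.0], [0.8, 0.1], [1.0, -0.1]]
# Made-up Lasso point estimate, wrapped in a list so it acts as one draw.
lasso_estimate = [[1.0, 0.0]]

p_hs = predictive_prob(x_new, hs_draws)           # posterior-averaged probability
p_lasso = predictive_prob(x_new, lasso_estimate)  # plug-in probability
```

Both probabilities can then be scored with the same held-out log predictive density, with the caveat that the plug-in version ignores uncertainty in the coefficients.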