Chicago / food-inspections-evaluation

This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.
http://chicago.github.io/food-inspections-evaluation/

refactored eval_model #92

Closed geneorama closed 8 years ago

geneorama commented 8 years ago

We've accepted the pull request from @cash that does the model evaluation much more directly than the previous approach. However, I'd like to streamline it a bit more and make the functions more general for use elsewhere, so I've broken eval_model into two functions: simulated_date_diff_mean and simulated_bin_summary.
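For readers without the branch checked out, here is a minimal sketch of what the split might look like. The real definitions live on the iss91 branch; the signatures and bodies below are assumptions for illustration only:

```r
## Hedged sketch -- actual implementations are in the iss91 branch.
simulated_date_diff_mean <- function(actual_dates, simulated_dates) {
  ## Mean gain, in days, from catching violations on the simulated
  ## (model-prioritized) dates rather than the actual inspection dates.
  mean(as.numeric(actual_dates - simulated_dates))
}

simulated_bin_summary <- function(bins, positives) {
  ## Running total of positives accumulated across ordered bins.
  cumsum(tapply(positives, bins, sum))
}
```

Splitting the date-gain calculation from the binned running totals keeps each function usable on its own outside eval_model, which seems to be the point of the refactor.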

I'm looking for comments in general, but there are two specific issues already on my mind. Feel free to comment on any other problems you see as well.

Thanks!

Gene

Also tagging: @tomschenkjr @cash and @fgregg (if you have time / interest)

geneorama commented 8 years ago

BTW, to see what I mean, check out the iss91 branch, run 30_glmnet_model.R through line 146, then step through the code.

cash commented 8 years ago

@geneorama there are only two lines of shared code between the two functions, and that code is not worth pulling out into a separate function.

I have no other comments. I looked over the code, but did not run it.

geneorama commented 8 years ago

@cash good idea on the name change. After some more research I think "labels" might make even more sense, but positives is an improvement over pos.

Do you have any ideas for the other output? Not sure how to make this summary more intuitive: [screenshot]

Some places I looked for inspiration:

geneorama commented 8 years ago

BTW, it's fun to play around with some metrics from the ROCR package.

##==============================================================================
## Metrics with ROCR Package
##==============================================================================
## computing a simple ROC curve (x-axis: fpr, y-axis: tpr)
geneorama::loadinstall_libraries("ROCR")

## datTest (scores and outcomes) comes from running 30_glmnet_model.R
predTest <- prediction(datTest$score, datTest$criticalFound)

## precision / recall
plot(performance(predTest, "prec", "rec"), main="precision recall")

# ROC
plot(performance(predTest, "tpr", "fpr"), main="ROC")
abline(0, 1, lty=2)

## sensitivity / specificity
plot(performance(predTest, "sens", "spec"), main="sensitivity vs specificity")
abline(1, -1, lty=2)

## phi
plot(performance(predTest, "phi"), main="phi scores")

## Fancy ROC curve:
op <- par(bg="lightgray", mai=c(1.2,1.5,1,1))
plot(performance(predTest,"tpr","fpr"), 
     main="ROC Curve", colorize=TRUE, lwd=10)
par(op)

## Effect of using a cost function on cutoffs
plot(performance(predTest, "cost", cost.fp = 1, cost.fn = 1), 
     main="Even costs (FP=1 FN=1)")
plot(performance(predTest, "cost", cost.fp = 1, cost.fn = 4), 
     main="Higher cost for FN (FP=1 FN=4)")

## Accuracy
plot(performance(predTest, measure = "acc"))

## AUC
performance(predTest, measure = "auc")@y.values[[1]]
cash commented 8 years ago

I'm coming from a machine learning background, so ROC curves and AUC are what I'm used to looking at. That fancy ROC curve did a really nice job of showing the distribution of the scores. I hadn't noticed the skew in that distribution before.

This PR is enough of an improvement over what's in master that I recommend doing any clean up that is needed (variable names, typos like "Caluclate") and then merge it in. Adding some nice plots could be an additional pull request.

geneorama commented 8 years ago

@cash I was really hoping to get some feedback on the column headers, like POSTOT_SIM

The name components are supposed to signal the following:

component  meaning
POS        positives
TOT        running total
SIM        simulated results

But I think the current names are unintuitive and ugly, so any suggestions are welcome.
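One option is to spell the components out in full instead of encoding them in a compound header. A hypothetical base-R rename (the data frame and the new name here are made up for illustration, not proposals from the thread):

```r
## Hypothetical rename: terse compound header -> descriptive header.
df <- data.frame(POSTOT_SIM = c(5L, 9L))  # stand-in for the real summary table
names(df)[names(df) == "POSTOT_SIM"] <- "positives_running_total_simulated"
```

Longer names cost some typing but make the summary table self-documenting, which matters more for output meant to be read than for intermediate objects.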