Closed patrick-miller closed 7 years ago
Nice work. These look really good based on my first look at them. A few quick comments:
Should you also revise the utils.py file to remove the functions that we no longer need?
I really like the revision to the probabilities plots at the end of the notebook (showing the plots for all three models). I think labeling the axis would also be very helpful.
I translated one of my R ggplot themes to plotnine
in the utils.py file. We can play around with this and the color scheme for the plots. The ROC plot might be a little busy.
Would it be possible to show the actual AUROC values on the AUROC plot? Maybe next to the lines on the legend? If not maybe we can insert a table that shows this. I think this is one of the more important pieces of information in the notebook so it would be good to be able to easily see the testing and training AUROC for each model.
@rdvelazquez I gave you write access to this repo. After all your suggestions have been addressed and @patrick-miller is done, you can "Squash and merge" this pull request. Make sure to select "Squash and merge" rather than "Create a merge commit"
I agree @rdvelazquez about the AUROC values. I include them in the next few cells, but it would be better to have them inside the plot itself. Because there are 6 values, things can get cluttered. Maybe we only include the test AUROC inside the plot?
I'm fine with the current display of AUROCs (in the following cell). I don't think there is really a clean way to include them in the plot.
@patrick-miller merge when you're ready. So ml-workers should be able now to directly use this notebook?
@dcgoss how should we go about using this notebook in production? Can ml-workers just use the URL for this notebook once it's merged?
how should we go about using this notebook in production? Can ml-workers just use the URL for this notebook once it's merged?
We are planning to use the machine-learning repo for development of the notebooks but ml-workers for production so the next step (either now or once all the items in #110 are done) will be to open a PR in ml-workers updating that notebook. @patrick-miller 's revisions in #111 were designed to let this notebook work in both the machine-learning and the ml-workers repos.
I'm not opposed to just having ml-workers reference this notebook but I think when we discussed it at the last meetup there were some potential issues with doing that.
Here I am replacing all of the plots with the ggplot inspired
plotnine
package. I think that these look good to me, but let me know what you think.