Closed bcollica closed 3 years ago
I'm going to write out a few comments and questions as I go through the review. Some are targeted at specific sections of code, others are more general (listed here). No need to address these right away, we should meet to discuss them first.
cvCovEst
returns tibbles, would it be better to have the functions relating to plotting and summary also work with tibbles instead of data.frames/data.tables?"
and '
. It's typically best practice to use "
. I've tried to fix all quotes throughout, but probably missed a few. Sorry for being nitpicky!multiHyperRisk
). Let's see if we can't reduce the amount of repetition so that maintenance is smoother.Hey all, these are the changes I've implemented so far:
Things not yet changed:
Let me know what you guys think about data for examples and anything else that comes to mind.
Brian
This is great Brian, thank you so much for all your work! If you're alright with it, I can take a stab at the first two remaining points on your to-do list this weekend. We can then meet this week to discuss the implementation of examples, and go over any remaining questions that you have. And that sounds like a good strategy regarding styler
!
Sounds good, Phil. The main issue with the multiHyperrisk function is passing arguments to filter, factorize, and arrange the data as needed depending on which estimator(s) are specified. If that information can be stored at a higher level in the function and then successfully passed down to the other functions like ggplot then the code can be much shorter. I know these tidyverse functions can be difficult to work around with their lazy evaluation, so if you have any thoughts I'm definitely open to suggestions.
I've taken a closer look at the plotting functions' code (but haven't touched anything), and I think that the first step in increasing its maintainability might be to wrap estimator-related code in standalone functions. These functions can then be called by the various summary and plotting functions as needed. We can then refactor these functions at some point in the future to take advantage of rlang
functionality.
Here are two suggestions:
multiHyperRisk
function, you might consider moving the code inside the if-else statements to estimator-specific functions that are then called by multiHyperRisk
. This organization will cut multiHyperRisk
's length in half, and make it easier to extend with multi-parameter estimators implemented in the future. Perhaps these functions could be store in estimator-specific helper files?plot.cvCovEst
, multiHyperRisk
, cvRiskPlot
, cvSummaryPlot
, and hyperRisk
rely on some hard-coded estimator categorizations. I suggest writing an internal helper function that keeps track of the estimator characteristics, and then returns them as necessary. Maybe as a named list? These characteristics will obviously be hard-coded, but at least they'll only need to be hard-coded once in an easy-to-maintain function.I'm sure that you can think of more examples, but these suggestions should be relatively painless to implement, and address the most challenging tasks remaining in this PR.
Let me know what you think. @nhejazi, your thoughts on this would also be much appreciated. I don't want to lead Brian down the wrong path.
Again, thank you for all your work on this!
Just pushed those changes including separate plotting functions that are specific to estimators with multiple hyperparameters. Also included a estAttributes function to cut down hard coding.
This looks great! Thanks for making fast work of these requests.
LGTM! There's a few minor comments to address/discuss, but I'm good with merging after we've re-synced this with master
and resolved those comments. This will be great functionality to have! I didn't see anything about it in the vignette, but we should eventually add a section there about this.
Updating pull request to merge plot_summary branch into master. Tests have been added, and descriptions have been updated. One final thing will be to add a 'Details' section in both the main plot and summary functions, but this can come later, along with examples.