blind-contours / CVtreeMLE

Cross Validated Decision Trees with Targeted Maximum Likelihood Estimation
MIT License

Comments on the paper #6

Closed GaryBAYLOR closed 2 years ago

GaryBAYLOR commented 2 years ago

V-fold cross-validation section, lines 24-25: My understanding of cross-validation is that we partition the whole data set into V folds, and each time use the data in V-1 folds to build a model (the parameter-generating sample), then apply that model to the held-out fold for scoring/validation (the estimation sample). Is this how the package works? If so, I think the following sentence needs to be rephrased, because it reads as though, after partitioning the data into V folds, we further split the data within each fold into two parts.

CVtreeMLE uses V-fold cross-validation and partitions the full data
in each fold into a parameter-generating sample and an estimation sample

Here, "fold" means a group or partition. This page has a good explanation of cross-validation: https://machinelearningmastery.com/k-fold-cross-validation/
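
To make the reading of V-fold cross-validation described above concrete, here is a minimal sketch in Python. It is not CVtreeMLE's API or implementation; the function name `v_fold_splits` and its parameters `n`, `v`, and `seed` are hypothetical, chosen only for illustration. Each of the V rounds holds out one fold as the estimation sample and pools the remaining V-1 folds as the parameter-generating sample.

```python
import random


def v_fold_splits(n, v, seed=0):
    """Yield (parameter_generating, estimation) index lists for V-fold CV.

    Hypothetical helper for illustration only, not part of CVtreeMLE.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into v folds of near-equal size.
    folds = [indices[k::v] for k in range(v)]
    for k in range(v):
        # The held-out fold is the estimation sample.
        estimation = sorted(folds[k])
        # The other v - 1 folds together form the parameter-generating sample.
        parameter_generating = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield parameter_generating, estimation


splits = list(v_fold_splits(n=10, v=5))
for round_number, (param_idx, est_idx) in enumerate(splits, start=1):
    print(f"round {round_number}: parameter-generating={param_idx}, estimation={est_idx}")
```

Under this reading, every observation lands in the estimation sample exactly once across the V rounds, and no fold is split a second time internally.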

Background section, lines 58-59: Since the author explains that in many cases researchers are interested in studying an a priori specified treatment or exposure, it would help to include a few typical research examples or papers so that readers can better understand the context.

Line 61: More explanation is needed of what specifically high-dimensionality and sparsity refer to. Is it the large number of exposures relative to the number of data points collected that makes the data high-dimensional? What is the sparsity about?

Line 62-62: References are needed to support the following statement, so that readers can understand why and how a target parameter is ill-defined, and therefore understand more about what this package aims to improve.

Even if this approach were possible, a target parameter that can inform public policy
is still ill-defined

blind-contours commented 2 years ago

I have added sentences to clarify V-fold cross-validation (in the targeted learning literature, we call this process V-fold CV for data-adaptive target parameters).

I've changed the background section to be more intuitive, with less jargon. The general issue is that in most cases we have a single treatment/exposure, so we know what the exposure is. In the case of mixtures, we need to define the exposure, given that not every combination of exposures in the mixture is available. I've described this in words rather than saying "high-dimensional" or "sparsity". Thank you for helping clarify this.