Open solis9753 opened 2 years ago
Yeah - that requires applying the feature engineering during the covariate summary. I skipped it because of the time it takes to run, but thanks to Egill's edits to use arrow it should be a lot faster now, so let's add the ability to run the feature engineering at the start of covariateSummary(). I'll mark this as a bug of moderate complexity.
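A minimal sketch of that idea in base R, assuming a long-format covariate table with the columns rowId, covariateId and covariateValue. The functions engineerFeatures() and summariseCovariates() and the covariate ID 999 are hypothetical stand-ins for illustration, not the package's actual implementation:

```r
# Toy covariate table in long (sparse) format. All names here are
# hypothetical and only illustrate the idea; this is not package code.
covariates <- data.frame(
  rowId = c(1, 1, 2, 3, 3),
  covariateId = c(101, 102, 101, 102, 103),
  covariateValue = c(1, 1, 1, 1, 1)
)

# Hypothetical feature engineering step: adds covariate 999, the number of
# covariate records each row has.
engineerFeatures <- function(covariates) {
  counts <- aggregate(covariateValue ~ rowId, data = covariates, FUN = length)
  engineered <- data.frame(
    rowId = counts$rowId,
    covariateId = 999,
    covariateValue = counts$covariateValue
  )
  rbind(covariates, engineered)
}

# Hypothetical summary function: applies the feature engineering first (if
# one is supplied), then summarises every covariate that exists afterwards.
summariseCovariates <- function(covariates, nRows, featureEngineering = NULL) {
  if (!is.null(featureEngineering)) {
    covariates <- featureEngineering(covariates)
  }
  sums <- aggregate(covariateValue ~ covariateId, data = covariates, FUN = sum)
  counts <- aggregate(covariateValue ~ covariateId, data = covariates, FUN = length)
  data.frame(
    covariateId = sums$covariateId,
    covariateCount = counts$covariateValue,
    covariateMean = sums$covariateValue / nRows
  )
}

summaryWithoutFe <- summariseCovariates(covariates, nRows = 3)
summaryWithFe <- summariseCovariates(
  covariates,
  nRows = 3,
  featureEngineering = engineerFeatures
)
```

In this sketch the engineered covariate 999 appears only in summaryWithFe, which is the behaviour being requested for the real function.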
Is your feature request related to a problem? Please describe.
The covariateSummary() function takes the original plpData$covariateData object as input, so the resulting covariate summary covers only the features that were provided from the start. This means that features generated within featureEngineering are not included in the covariate summary when runPlp() ends. See this part of the code from lines 449 to 458.

Describe the solution you'd like
Include a covariate summary for features created within featureEngineer().

Describe alternatives you've considered
Either merge data$Train$covariateData and data$Test$covariateData prior to calling covariateSummary(), or add a separate set of code for feature-engineered settings, with a flag for returning a covariate summary for those. I see there is already a flag for feature engineering within covariateSummary(), but it is not used. I am sure there must be better ways.

Additional context
In requesting this, I also know that covariateSummary() can take a long time to complete if there are a lot of features, so I am wondering what will happen in my case, where I create hundreds of thousands of covariates. But I would like to have a summary for those covariates as well. Maybe a summary only for the features that are selected in the final model?
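One way the last suggestion could look, as a rough base R sketch. It assumes a long-format covariate table and a variable importance table in which a value of zero means the covariate was not selected by the final model. Both tables and all object names are hypothetical, not taken from the package:

```r
# Toy covariate table in long (sparse) format (hypothetical data).
covariates <- data.frame(
  rowId = c(1, 1, 2, 3, 3),
  covariateId = c(101, 102, 101, 102, 103),
  covariateValue = c(1, 1, 1, 1, 1)
)

# Hypothetical variable importance from the final model; zero is assumed to
# mean "not selected" (for example, a coefficient shrunk to zero).
variableImportance <- data.frame(
  covariateId = c(101, 102, 103),
  covariateValue = c(0.8, 0, -0.3)
)

# Keep only the covariates the model actually uses before summarising, so
# the cost of the summary scales with the selected features, not all of them.
selectedIds <- variableImportance$covariateId[variableImportance$covariateValue != 0]
selected <- covariates[covariates$covariateId %in% selectedIds, ]

selectedSummary <- aggregate(covariateValue ~ covariateId, data = selected, FUN = sum)
names(selectedSummary)[2] <- "covariateSum"
```

Filtering before aggregating keeps the summary small even when feature engineering creates a very large number of covariates, at the cost of losing information about the covariates the model dropped.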