Closed ecsalomon closed 5 years ago
I'm guessing we should name the _imp column the same way, but without the aggregate function? For example,
zip_code_features_zip_code_1year_num_events_min_imp
zip_code_features_zip_code_1year_num_events_max_imp
would both become
zip_code_features_zip_code_1year_num_events_imp
The imputation flags for a given categorical or quantity are the same within an aggregation period, regardless of the aggregation function. Generating one flag column per aggregation function therefore produces a lot of redundant columns. For example, the following features will have exactly the same imputation flag columns:
Collate should add only one imputation column per quantity/categorical per aggregation period.
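
To make the proposal concrete, here is a rough sketch of the renaming and de-duplication, not collate's actual code. The helper names (`shared_imp_name`, `dedupe_imp_columns`) and the `AGG_FUNCS` list are made up for illustration, and it assumes the aggregate function is always the last token before `_imp`:

```python
# Hypothetical list of aggregate function names; not taken from collate.
AGG_FUNCS = ("min", "max", "sum", "avg", "count")


def shared_imp_name(column):
    """Drop the aggregate function from an imputation flag column name.

    Columns that do not end in _imp, or whose last token before _imp is
    not a known aggregate function, are returned unchanged.
    """
    suffix = "_imp"
    if not column.endswith(suffix):
        return column
    stem = column[: -len(suffix)]
    for func in AGG_FUNCS:
        if stem.endswith("_" + func):
            # Remove "_<func>" and put the _imp suffix back.
            return stem[: -len(func) - 1] + suffix
    return column


def dedupe_imp_columns(columns):
    """Rename imputation flags and keep one per quantity and period."""
    seen = set()
    result = []
    for col in columns:
        name = shared_imp_name(col)
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


columns = [
    "zip_code_features_zip_code_1year_num_events_min",
    "zip_code_features_zip_code_1year_num_events_max",
    "zip_code_features_zip_code_1year_num_events_min_imp",
    "zip_code_features_zip_code_1year_num_events_max_imp",
]
deduped = dedupe_imp_columns(columns)
```

With the two columns from the comment above, `deduped` keeps both feature columns and a single `zip_code_features_zip_code_1year_num_events_imp` flag. Matching on the name suffix is fragile if a quantity name itself ends in something like `_count`, so in practice it would probably be safer to build the flag name from the quantity and aggregation period directly rather than by string surgery afterwards.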