Closed by mbsabath 5 years ago
Added code to use a random forest for imputation of missing values in the training stage. Still need to add code that uses the saved models for imputation during the prediction stage and test the code.
Added prediction-stage imputation and put the h2o imputation in the main workflow.
Added the ability to use yaml files to define which variables are imputed and which inputs are used. A test of the code is currently submitted on odyssey.
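For illustration only, here is one possible shape for such a config and a check that could run before imputation. The key names (`impute`, `predictors`), the column names, and the helper `check_imputation_config` are hypothetical and not taken from the repository. The config is written as the dict a YAML loader would return, so the sketch runs without extra packages.

```python
# Hypothetical config, shown as the dict a YAML loader would return for:
#
#   impute:
#     - pm25
#     - ndvi
#   predictors:
#     - elevation
#     - temperature
#     - month
config = {
    "impute": ["pm25", "ndvi"],
    "predictors": ["elevation", "temperature", "month"],
}


def check_imputation_config(config, columns):
    """Return (targets, predictors) after checking every name exists in the data."""
    targets = list(config.get("impute", []))
    predictors = list(config.get("predictors", []))
    # Fail early if the config names a column the dataset does not have.
    missing = [name for name in targets + predictors if name not in columns]
    if missing:
        raise ValueError(f"config names columns not in the data: {missing}")
    return targets, predictors


# Hypothetical dataset columns.
columns = ["pm25", "ndvi", "elevation", "temperature", "month"]
targets, predictors = check_imputation_config(config, columns)
```

Validating the names up front means a typo in the yaml file surfaces before a cluster job is submitted rather than partway through training.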
Test of initial imputation ran successfully; testing use of saved models now. If successful, I will merge the imputation update into master, adjust workflows in the general data cleaning functions, and close the issue.
Imputations generated in the training and prediction stages match each other. Updated code merged into test and master. Closing the issue.
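The check described here can be sketched roughly as follows. This is not the project's code: a median fill stands in for the h2o random forest so the example runs with the standard library alone, and the function names, file name, and sample data are all hypothetical. The point is only the pattern of fitting in the training stage, saving the fitted model, reloading it in the prediction stage, and comparing the results.

```python
import json
import math
import statistics
import tempfile
from pathlib import Path


def fit_imputer(values):
    """Fit a stand-in imputation model (median fill, NOT a random forest)."""
    observed = [v for v in values if v is not None]
    return {"fill_value": statistics.median(observed)}


def apply_imputer(model, values):
    """Replace missing entries (None) with the model's fill value."""
    return [model["fill_value"] if v is None else v for v in values]


def imputations_match(a, b, tol=1e-9):
    """True if both imputed series have equal length and agree within tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


# Hypothetical data with two missing values.
data = [1.0, None, 3.0, 8.0, None]

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "imputer.json"

    # Training stage: fit, impute, and save the fitted model.
    model = fit_imputer(data)
    train_imputed = apply_imputer(model, data)
    model_path.write_text(json.dumps(model))

    # Prediction stage: reload the saved model and impute again.
    loaded = json.loads(model_path.read_text())
    predict_imputed = apply_imputer(loaded, data)

if not imputations_match(train_imputed, predict_imputed):
    raise RuntimeError("training and prediction imputations differ")
```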
Imputation is currently performed using a haphazard mix of various models; switch over to an exclusively h2o-based method.