NSAPH / airpred


Use H2O for imputation #38

Closed by mbsabath 5 years ago

mbsabath commented 5 years ago

Imputation is currently performed using a haphazard mix of various models. Switch over to an exclusively h2o-based method.

mbsabath commented 5 years ago

Added code to use a random forest for imputation of missing values in the training stage. Still need to add code that uses the saved models for imputation during the prediction stage and test the code.
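A minimal sketch of the training-stage idea, not the airpred implementation: for each variable with gaps, fit a model on the rows where it is observed, then fill the missing rows with predictions. To stay dependency-free, a one-predictor least-squares line stands in for the h2o random forest; `fit_line`, `impute_column`, and the toy columns are hypothetical names.

```python
from statistics import mean

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_bar - slope * x_bar}

def impute_column(rows, target, predictor):
    """Fill missing `target` values (None) using a model fit on observed rows.

    Returns the fitted model so it can be saved for the prediction stage.
    """
    observed = [r for r in rows if r[target] is not None]
    model = fit_line([r[predictor] for r in observed],
                     [r[target] for r in observed])
    for r in rows:
        if r[target] is None:
            r[target] = model["slope"] * r[predictor] + model["intercept"]
    return model

# Toy data (made up for illustration): one humidity value is missing.
rows = [
    {"temp": 1.0, "humidity": 3.0},
    {"temp": 2.0, "humidity": 5.0},
    {"temp": 3.0, "humidity": None},
    {"temp": 4.0, "humidity": 9.0},
]
model = impute_column(rows, target="humidity", predictor="temp")
```

Returning the fitted model rather than only the filled data is what lets the prediction stage reuse exactly the same imputation.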

mbsabath commented 5 years ago

Added prediction-stage imputation and put h2o imputation in the main workflow.

mbsabath commented 5 years ago

Added the ability to use YAML files to define which variables are imputed and which inputs are used. A test of the code is currently submitted on Odyssey.
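One way such a specification could look, sketched as a Python dict that mirrors a YAML layout. The keys (`impute`, `inputs`) and the variable names are hypothetical and not taken from airpred; with PyYAML installed, `yaml.safe_load` on an equivalent file would yield the same structure.

```python
# Hypothetical spec; the equivalent YAML file would read:
#   impute:
#     humidity:
#       inputs: [temp, elevation]
#     ndvi:
#       inputs: [temp]
spec = {
    "impute": {
        "humidity": {"inputs": ["temp", "elevation"]},
        "ndvi": {"inputs": ["temp"]},
    }
}

def imputation_plan(spec, columns):
    """Return [(target, inputs), ...], rejecting names absent from the data."""
    plan = []
    for target, options in spec["impute"].items():
        inputs = list(options["inputs"])
        missing = [name for name in [target, *inputs] if name not in columns]
        if missing:
            raise KeyError(f"columns not found in data: {missing}")
        if target in inputs:
            raise ValueError(f"{target} cannot be used to impute itself")
        plan.append((target, inputs))
    return plan

plan = imputation_plan(spec, columns={"temp", "elevation", "humidity", "ndvi"})
```

Validating the spec against the data's column names up front means a typo in the file fails immediately rather than partway through a cluster job.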

mbsabath commented 5 years ago

The test of the initial imputation ran successfully; now testing use of the saved models. If that succeeds, I will merge the imputation update into master, adjust the workflows in the general data cleaning functions, and close the issue.

mbsabath commented 5 years ago

Imputations generated in the training and prediction stages match each other. Updated code merged into test and master. Closing the issue.
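The consistency check described here can be sketched as a round trip: save the fitted imputation model, reload it as the prediction stage would, and confirm both produce identical imputations. This is an illustration only; the dict-based model, the JSON storage, and the function names are assumptions, not how airpred or h2o store models.

```python
import json
import tempfile
from pathlib import Path

def impute(model, rows, target, predictor):
    """Return a copy of `rows` with missing `target` values (None) filled."""
    filled = []
    for r in rows:
        r = dict(r)  # copy so the caller's rows are left untouched
        if r[target] is None:
            r[target] = model["slope"] * r[predictor] + model["intercept"]
        filled.append(r)
    return filled

# Hypothetical parameters standing in for a model fit during training.
trained_model = {"slope": 2.0, "intercept": 1.0}
rows = [{"temp": 3.0, "humidity": None}, {"temp": 4.0, "humidity": 9.0}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "humidity_model.json"
    path.write_text(json.dumps(trained_model))   # training stage: save
    loaded_model = json.loads(path.read_text())  # prediction stage: load

training_imputed = impute(trained_model, rows, "humidity", "temp")
prediction_imputed = impute(loaded_model, rows, "humidity", "temp")
matches = training_imputed == prediction_imputed
```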