paul-english / predictive_imputer

Predictive imputation of missing values with sklearn interface. This is a simple implementation of the idea presented in the MissForest R package.
MIT License

Accuracy Test #109

Open MicheleSergioPozzi opened 6 years ago

MicheleSergioPozzi commented 6 years ago

I have used your algorithm to successfully impute missing values in a dataset. However, I have not been able to find any method to check the accuracy/success of the imputed values. Do you have any available methods or suggestions?

oattah1 commented 6 years ago

Hey MicheleSergioPozzi, how did you impute the missing values in your dataset? Did you call the fit method and then the transform method? As for checking accuracy, I believe most people do the following: first fill the missing values in the dataset with another imputation method, such as kNN, and store that filled dataset. Then artificially remove values at random (there are metabolomics papers that provide algorithms for this), impute the removed values with this algorithm, and use a metric such as NRMSE to check how close the imputed dataset is to the filled dataset. Hope this helps!
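The procedure described above could be sketched roughly as follows. This is only an illustration under several assumptions: the helper names (`mask_random`, `nrmse`, `column_mean_impute`) are hypothetical and not part of this package, the complete matrix is synthetic rather than a kNN-filled dataset, and a simple column-mean imputer stands in for the imputer being evaluated. NRMSE here is normalised by the standard deviation of the true values at the removed positions, which is one common convention among several.

```python
import numpy as np


def mask_random(X, frac, rng):
    """Return a copy of X with roughly `frac` of its entries set to NaN,
    plus the boolean mask of the removed positions."""
    mask = rng.random(X.shape) < frac
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask


def nrmse(X_true, X_imputed, mask):
    """RMSE over the removed entries, divided by the standard deviation
    of the true values at those entries (one common normalisation)."""
    diff = X_true[mask] - X_imputed[mask]
    return np.sqrt(np.mean(diff ** 2)) / np.std(X_true[mask])


def column_mean_impute(X):
    """Stand-in imputer (hypothetical): replace each NaN with its column mean.
    Swap in the imputer you actually want to evaluate."""
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)


rng = np.random.default_rng(0)

# Assumption: X_full plays the role of the complete ("filled") dataset.
# In practice this would be your data after a first imputation pass, e.g. kNN.
X_full = rng.normal(size=(200, 4))

# Artificially remove about 20% of the values at random.
X_missing, mask = mask_random(X_full, 0.2, rng)

# Impute the removed values and score them against the known truth.
X_imputed = column_mean_impute(X_missing)
score = nrmse(X_full, X_imputed, mask)
print(f"NRMSE on removed entries: {score:.3f}")
```

To score a different imputer, replace `column_mean_impute` with any function that takes the matrix with NaNs and returns a filled matrix, for example a wrapper around an sklearn-style object's fit and transform methods. A lower NRMSE means the imputed values are closer to the held-out ones; repeating the masking with several seeds gives a sense of the variance.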

oattah1 commented 6 years ago

Also, there is a folder in the code called tests, and in it there is a file, test_predictive_imputer.py, which I think is supposed to test the algorithm.