CodeSpaceHQ / MENGEL

A framework that applies machine learning algorithms and automates the process of finding the right algorithm for the job.

Data Filler Refactoring #118

Closed ZakeryFyke closed 8 years ago

ZakeryFyke commented 8 years ago

Currently, several of the data fillers remove a column or row if it contains fewer than some fixed number n of observations. Looking at how I want to design the strategy class, I think this is a poor decision. It would be better to use the proportion of missing values instead, e.g. remove any column that is missing more than 50% of its values. This should be more flexible in the long run and require the user to know less about their datasets.
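A minimal sketch of the proportion-based rule, assuming the data is held in a pandas DataFrame. The function name `drop_sparse` and the parameters `max_missing` and `axis` are hypothetical and not taken from the MENGEL codebase; the 0.5 default only mirrors the 50% example above.

```python
import pandas as pd


def drop_sparse(df: pd.DataFrame, max_missing: float = 0.5,
                axis: str = "columns") -> pd.DataFrame:
    """Drop columns (or rows) whose fraction of missing values exceeds max_missing.

    Hypothetical helper for illustration; not part of MENGEL.
    """
    if not 0.0 <= max_missing <= 1.0:
        raise ValueError("max_missing must be between 0 and 1")

    if axis == "columns":
        # Fraction of missing values in each column.
        missing = df.isna().mean(axis=0)
        return df.loc[:, missing <= max_missing]
    if axis == "rows":
        # Fraction of missing values in each row.
        missing = df.isna().mean(axis=1)
        return df.loc[missing <= max_missing]
    raise ValueError("axis must be 'columns' or 'rows'")


if __name__ == "__main__":
    df = pd.DataFrame({
        "a": [1, None, None, None],  # 75% missing
        "b": [1, 2, None, 4],        # 25% missing
        "c": [1, 2, 3, 4],           # 0% missing
    })
    print(drop_sparse(df))               # keeps columns "b" and "c"
    print(drop_sparse(df, axis="rows"))  # drops the row missing 2 of 3 values
```

Because the threshold is a fraction rather than a count, the same setting works regardless of how many rows or columns a dataset has. In this sketch a column that is missing exactly 50% of its values is kept, since only values above the threshold are dropped.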