Several of the data fillers currently remove a column or row if it contains fewer than some fixed number n of observations. Looking at how I want to design the strategy class, I think this is a poor decision. It would be better to threshold on the fraction of missing values, e.g. remove any column that is missing more than 50% of its values. This should be more flexible in the long run and require less knowledge of the datasets on the user's part: a fixed n has to be chosen with each dataset's size in mind, whereas a fraction scales with it.
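As a rough illustration of the ratio-based rule, here is a minimal sketch in plain Python. It is not the existing filler code: the names `is_missing`, `missing_fraction`, `drop_sparse_columns`, and `max_missing` are hypothetical, the table is assumed to be a dict mapping column names to lists, and "missing" is assumed to mean `None` or NaN.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_fraction(values):
    """Return the share of entries in `values` that are missing."""
    if not values:
        return 1.0  # an empty column carries no observations
    return sum(is_missing(v) for v in values) / len(values)


def drop_sparse_columns(table, max_missing=0.5):
    """Keep only columns whose missing fraction is at most `max_missing`."""
    return {
        name: values
        for name, values in table.items()
        if missing_fraction(values) <= max_missing
    }


# Hypothetical example data: column name -> list of observations.
table = {
    "a": [1.0, 2.0, None, 4.0],            # 25% missing
    "b": [None, None, float("nan"), 1.0],  # 75% missing
    "c": [None, 3.0, None, 5.0],           # exactly 50% missing
}
kept = drop_sparse_columns(table, max_missing=0.5)
print(sorted(kept))  # ['a', 'c']
```

Column `c` survives because the rule removes only columns missing more than 50% of their values. The same function could be applied to rows by transposing the table first, and the threshold would presumably become a parameter of the strategy class.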