ContinuumIO / elm

Phase I & part of Phase II of NASA SBIR - Parallel Machine Learning on Satellite Data
http://ensemble-learning-models.readthedocs.io

Ensure steps.DropNaRows() drops NA y when dropping rows from X #170

Closed PeterDSteinberg closed 7 years ago

PeterDSteinberg commented 7 years ago

A bug in elm.pipeline.steps caused DropNaRows(X, y) to drop rows containing NaN from the X matrix without dropping the corresponding rows from y. X and y then had different numbers of rows, which raised a dimension error on fitting. I did not add a test for this fix. I'll make a separate issue to better test:

I am deferring the tests for the items above, rather than adding them to this PR, because we are about to start a 4-to-8-week major refactor of elm.sample_util that moves the Pipeline of filter functions, such as DropNaRows, to earthio.filters (a new subpackage). See #149 for details on that refactor.
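
For reference, the behavior this fix is meant to guarantee can be sketched in plain NumPy. This is only an illustration, not the elm implementation: the function name `drop_na_rows` and its signature are hypothetical, and it assumes X is a 2-D numeric array whose rows line up with a 1-D y.

```python
import numpy as np


def drop_na_rows(X, y=None):
    """Hypothetical helper (not the elm API): drop rows of X that
    contain NaN and drop the same rows from y so they stay aligned."""
    X = np.asarray(X, dtype=float)
    # Boolean mask: True for rows of X with no NaN in any column.
    keep = ~np.isnan(X).any(axis=1)
    if y is None:
        return X[keep], None
    y = np.asarray(y)
    if y.shape[0] != X.shape[0]:
        raise ValueError("X and y must have the same number of rows")
    # Apply the same mask to both arrays so row counts still match.
    return X[keep], y[keep]


X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [4.0, 5.0]])
y = np.array([10, 20, 30])

X_clean, y_clean = drop_na_rows(X, y)
```

Applying one mask to both arrays is what keeps the row counts equal; the bug described above corresponds to filtering X with the mask while returning y unchanged.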