Classifier: remove invariant columns

It is common for the mlbackend to receive training data with columns that do not vary across the rows. Usually they are all zero, in which case they have no effect on training, but they could be (say) all one, in which case the column becomes a duplicate bias vector.

In training this is a waste of resources, but the real trouble comes with prediction. If a row to be predicted has a different value in that column, its effect on the result is arbitrary: the model never saw the column vary, so it learned nothing about how the column should influence the output.
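A toy illustration of the all-zero case, assuming a plain NumPy logistic regression rather than the backend's actual model (the function `train_logistic` and the data are hypothetical). Every gradient term for a zero column is multiplied by zero, so its weight stays at whatever random value it was initialised with:

```python
import numpy as np

def train_logistic(X, y, w0, lr=0.5, steps=200):
    """Plain gradient descent on logistic loss (illustrative only)."""
    w = w0.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (p - y) / len(y)        # gradient of the mean log loss
        w -= lr * grad
    return w

# Second column is all zero in the training data.
X = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [-1.0, 0.0],
              [-2.0, 0.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])

rng = np.random.default_rng(0)
w0 = rng.normal(size=2)          # random initial weights
w = train_logistic(X, y, w0)

# The weight for the zero column never moved from its random start,
# so a non-zero value there at prediction time is scaled by noise.
untouched = w[1] == w0[1]
```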

How can this happen? Well, suppose the column is called 'course_X' and the training data is from last year. Course X was not offered last year, for whatever reason, but this year it is.

The solution is to ignore all columns with no variation, and remember which columns they were. This makes training faster and prediction better.
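The idea can be sketched with NumPy as follows. This is a minimal illustration, not the code in this change: the helper names `find_variable_columns` and `select_columns` are hypothetical, and it assumes the training data is a 2-D array with at least one row.

```python
import numpy as np

def find_variable_columns(X):
    """Return the indexes of columns whose values differ across rows."""
    X = np.asarray(X)
    # A column varies if any row differs from the first row in that column.
    varies = (X != X[0]).any(axis=0)
    return np.flatnonzero(varies)

def select_columns(X, variable_columns):
    """Keep only the remembered columns, in training and in prediction."""
    return np.asarray(X)[:, variable_columns]

# Columns 1 (all zero) and 3 (all one) carry no information.
X_train = np.array([[1, 0, 5, 1],
                    [2, 0, 7, 1],
                    [3, 0, 5, 1]])
variable_columns = find_variable_columns(X_train)
X_train_reduced = select_columns(X_train, variable_columns)

# At prediction time the same indexes are applied, so a new non-zero
# value in an ignored column (column 1 here) cannot affect the result.
X_new = np.array([[4, 1, 6, 1]])
X_new_reduced = select_columns(X_new, variable_columns)
```

The important design point is that the indexes are computed once, from the training data, and then reused unchanged for every later prediction.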

To save the variable columns we need to transfer the indexes to and from the TF object, because that is what we save. Other than the minor hackiness involved there, it is all quite simple.
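The round trip might look roughly like the sketch below. Everything here is a hypothetical stand-in: `PersistedState` plays the role of the object that actually gets saved, `Classifier` is a simplified wrapper, and JSON is used only to keep the example self-contained. The real classes and storage format in the backend may differ.

```python
import io
import json
import numpy as np

class PersistedState:
    """Hypothetical stand-in for the object that is written to disk."""
    def __init__(self, variable_columns=None):
        self.variable_columns = variable_columns

    def dump(self, fileobj):
        json.dump({"variable_columns": self.variable_columns}, fileobj)

    @classmethod
    def load(cls, fileobj):
        return cls(**json.load(fileobj))

class Classifier:
    """Hypothetical wrapper that owns the column indexes while running."""
    def __init__(self):
        self.state = PersistedState()
        self.variable_columns = None

    def save(self, fileobj):
        # Copy the indexes onto the persisted object as plain ints,
        # since NumPy integers are not JSON serialisable.
        self.state.variable_columns = [int(i) for i in self.variable_columns]
        self.state.dump(fileobj)

    def load(self, fileobj):
        # Copy the indexes back off the persisted object after loading.
        self.state = PersistedState.load(fileobj)
        self.variable_columns = np.array(self.state.variable_columns, dtype=int)

trained = Classifier()
trained.variable_columns = np.array([0, 2])
buffer = io.StringIO()
trained.save(buffer)

buffer.seek(0)
fresh = Classifier()
fresh.load(buffer)
restored = fresh.variable_columns
```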

moodlehq / moodle-mlbackend-python

Classifier: remove invariant columns #23