Landmarkers, cross validation, and seeds

All of the landmarking metafeatures use cross validation, which requires a seed to generate the random folds. Currently, each landmarker reuses its own seed for the cross validation. This is a problem for two reasons.

The seed for cross validation should be different from the seed for the landmarker classifier. We do not know what kinds of biases might be created by sharing a seed.
All of the landmarkers should share the same cross validation splits, i.e. use the same seed for cross validation. This would make the landmarkers more comparable within a single dataset.

A solution would be to create a special seed just for the cross validation. Then any metafeature (e.g. a landmarker) that uses cross validation would get the same cross validation seed and thus the same folds. Furthermore, any metafeature that requires both a seed and cross validation would now get two seeds: the seed for the metafeature itself and the seed for cross validation.
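A minimal sketch of the two-seed idea, using only the Python standard library. The names `make_folds`, `run_landmarker`, `cv_seed`, and `landmarker_seed` are hypothetical and are not part of metalearn's API; the "landmarker" here is a stand-in that only records its folds and one random draw.

```python
import random


def make_folds(n_samples, n_folds, cv_seed):
    """Shuffle the sample indices with a dedicated CV seed and deal them into folds."""
    indices = list(range(n_samples))
    random.Random(cv_seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def run_landmarker(name, n_samples, n_folds, landmarker_seed, cv_seed):
    """Hypothetical landmarker: the folds depend only on cv_seed, while
    landmarker_seed drives a separate generator standing in for the classifier."""
    folds = make_folds(n_samples, n_folds, cv_seed)
    model_rng = random.Random(landmarker_seed)  # independent of the CV generator
    return {"name": name, "folds": folds, "model_draw": model_rng.random()}


cv_seed = 0  # shared by every metafeature that uses cross validation
a = run_landmarker("decision_stump", 10, 5, landmarker_seed=1, cv_seed=cv_seed)
b = run_landmarker("naive_bayes", 10, 5, landmarker_seed=2, cv_seed=cv_seed)

# Both landmarkers are evaluated on identical folds.
print(a["folds"] == b["folds"])
```

With this split, changing a landmarker's own seed does not change the folds, and changing the cross validation seed changes the folds for all landmarkers at once.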

byu-dml / metalearn

Landmarkers, cross validation, and seeds #124