Use scikit-learn pipeline in XBT code.

Currently we are manually putting together the pipeline for processing XBT data. Now that the desired pipeline has been decided and described (by the code), it would be good to implement it properly using the scikit-learn Pipeline class. As there is some custom processing going on, this will probably involve writing custom estimator classes. Relevant documentation:

https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
https://scikit-learn.org/stable/modules/compose.html
https://scikit-learn.org/stable/developers/develop.html?highlight=baseestimator
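To give an idea of what a custom step might look like, here is a minimal sketch of a transformer that follows the scikit-learn estimator conventions (`BaseEstimator`, `TransformerMixin`). The class name `FeatureSubsetSelector` and the column names are hypothetical and are not taken from the XBT code; it assumes the data arrives as a pandas DataFrame.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class FeatureSubsetSelector(BaseEstimator, TransformerMixin):
    """Hypothetical custom step: keep only a named subset of columns."""

    def __init__(self, features=None):
        # scikit-learn convention: __init__ only stores parameters unchanged
        self.features = features

    def fit(self, X, y=None):
        # Use all columns when no subset is given
        wanted = list(X.columns) if self.features is None else list(self.features)
        missing = [name for name in wanted if name not in X.columns]
        if missing:
            raise ValueError(f"Columns not found in input: {missing}")
        # Learned state is stored with a trailing underscore
        self.features_ = wanted
        return self

    def transform(self, X):
        return X[self.features_]


# Toy data with made-up column names, purely for illustration
profiles = pd.DataFrame(
    {"max_depth": [460.0, 760.0, 1830.0], "year": [1985, 1992, 2001], "lat": [10.5, -3.2, 45.0]}
)
selector = FeatureSubsetSelector(features=["max_depth", "year"])
selected = selector.fit_transform(profiles)
```

Because the class only stores its parameters in `__init__`, it should work with `clone`, `get_params` and `set_params`, which is what lets it sit inside a `Pipeline` and be tuned by a grid search.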

With this we could encapsulate each processing step, whether custom or a standard scikit-learn object, in a pipeline, which could then be fed into a voting classifier. Steps in the pipeline could include:

select a subset of features (custom)
select a subset and splits of data (custom)
hyperparameter tuning (grid search or random) (standard scikit-learn)
cross validation (outer and inner) (standard scikit-learn, with custom folds)
calculate metrics (for score function) (custom classes using standard classes)
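The steps listed above could fit together roughly as in the sketch below. It is only an illustration under several assumptions: the data is synthetic (`make_classification`), the `groups` array is a made-up stand-in for whatever defines the custom folds, the two classifiers and their parameter grids are arbitrary choices, and macro-averaged F1 stands in for the real score function.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, GroupKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the XBT features and labels
X, y = make_classification(n_samples=200, n_features=6, n_informative=4, random_state=0)
# Hypothetical grouping variable used to build custom outer folds
groups = np.arange(200) % 5

# Score function built from a standard metric
macro_f1 = make_scorer(f1_score, average="macro")


def make_tuned_pipeline(classifier, param_grid):
    """Wrap preprocessing + classifier in a Pipeline, tuned by an inner grid search."""
    pipe = Pipeline([("scale", StandardScaler()), ("clf", classifier)])
    return GridSearchCV(pipe, param_grid, scoring=macro_f1, cv=3)


tuned_rf = make_tuned_pipeline(
    RandomForestClassifier(n_estimators=20, random_state=0), {"clf__max_depth": [3, 5]}
)
tuned_lr = make_tuned_pipeline(LogisticRegression(max_iter=500), {"clf__C": [0.1, 1.0]})

# Each tuned pipeline feeds into the voting classifier
ensemble = VotingClassifier([("rf", tuned_rf), ("lr", tuned_lr)], voting="hard")

# Outer cross-validation with custom folds, passed as explicit (train, test) index pairs
outer_folds = list(GroupKFold(n_splits=3).split(X, y, groups))
scores = cross_val_score(ensemble, X, y, scoring=macro_f1, cv=outer_folds)
print(scores.mean())
```

The inner `GridSearchCV` handles hyperparameter tuning on each outer training fold, so the outer scores are not contaminated by the tuning. A custom feature-selection or data-selection step would slot in as an extra entry at the start of the `Pipeline` step list.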

MetOffice / XBTs_classification #55