MetOffice / XBTs_classification

Project for the classification of eXpendable Bathy Thermographs
BSD 3-Clause "New" or "Revised" License

Use scikit-learn pipeline in XBT code. #55

Open stevehadd opened 3 years ago

stevehadd commented 3 years ago

Currently we are manually putting together the pipeline for processing XBT data. Now that the desired pipeline has been decided and described (by the code), it would be good to implement it properly using the scikit-learn Pipeline class. As there is some custom processing going on, this will probably involve the following:

https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
https://scikit-learn.org/stable/modules/compose.html
https://scikit-learn.org/stable/developers/develop.html?highlight=baseestimator
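As a rough illustration of the custom-estimator route covered in the developer guide linked above, a custom processing step can subclass `BaseEstimator` and `TransformerMixin` and then sit alongside standard scikit-learn objects in a `Pipeline`. This is only a sketch under assumptions: `DepthClipper`, its parameters, the synthetic data, and the choice of `StandardScaler` and `DecisionTreeClassifier` are all hypothetical and are not taken from the XBT code.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


class DepthClipper(BaseEstimator, TransformerMixin):
    """Hypothetical custom step: cap one feature column at a maximum value."""

    def __init__(self, column=0, max_value=1000.0):
        # Store constructor arguments unchanged so get_params/clone work.
        self.column = column
        self.max_value = max_value

    def fit(self, X, y=None):
        # Nothing is learned from the data; return self per the estimator API.
        return self

    def transform(self, X):
        # Work on a float copy so the caller's array is not modified.
        X = np.array(X, dtype=float, copy=True)
        X[:, self.column] = np.minimum(X[:, self.column], self.max_value)
        return X


# Synthetic stand-in data (assumption: not real XBT profiles).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2000.0, size=(100, 3))
y = (X[:, 0] > 700.0).astype(int)

pipeline = Pipeline([
    ("clip", DepthClipper(column=0, max_value=1000.0)),  # custom step
    ("scale", StandardScaler()),                         # standard step
    ("classify", DecisionTreeClassifier(random_state=0)),
])
pipeline.fit(X, y)
predictions = pipeline.predict(X)
```

Keeping the constructor arguments as plain attributes is what lets scikit-learn clone the step during cross-validation or grid search.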

With this we could encapsulate each processing step, whether custom or a standard scikit-learn object, in a pipeline, which can then be fed into a voting classifier. Steps in the pipeline could include

stevehadd commented 3 years ago

This should make use of the voting classifier for ensembles, as described in #48: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html
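One possible way to combine the two ideas is to pass whole pipelines to `VotingClassifier` as its member estimators. The sketch below assumes that arrangement; the member classifiers (`LogisticRegression`, `RandomForestClassifier`), the step names, the soft-voting choice, and the synthetic data are illustrative assumptions, not the ensemble described in #48.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not real XBT profiles).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Each ensemble member is a complete pipeline with its own preprocessing.
logistic_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classify", LogisticRegression()),
])
forest_pipeline = Pipeline([
    ("classify", RandomForestClassifier(n_estimators=20, random_state=0)),
])

# Soft voting averages the predicted class probabilities of the members.
ensemble = VotingClassifier(
    estimators=[("logistic", logistic_pipeline), ("forest", forest_pipeline)],
    voting="soft",
)
ensemble.fit(X, y)
labels = ensemble.predict(X)
```

Soft voting requires every member to expose `predict_proba`; with members that lack it, `voting="hard"` (majority vote on predicted labels) is the alternative.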