Add best prediction to classification output and metrics

MetOffice / XBTs_classification

Project for the classification of eXpendable Bathy Thermographs

BSD 3-Clause "New" or "Revised" License


Add best prediction to classification output and metrics #48

Closed: stevehadd closed this issue 3 years ago

stevehadd commented 4 years ago

Currently, when we produce an ensemble of classifiers, we produce a prediction for each one and combine them into a probability per class, but we don't produce a single best result. There are two ways to do this:

the class with the highest probability
the result from the classifier that scores best overall

We should focus on the max-probability class, then calculate the relevant metrics (recall, precision and F1) for the highest-probability output and compare its performance to that of the individual classifiers.
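As a rough illustration of the max-probability approach, here is a minimal sketch in plain Python. It assumes each classifier yields one row of class probabilities per sample, with columns in the same class order. The function names, the labels "A" and "B", and the numbers are all made up for the example and are not taken from the project code.

```python
def average_probabilities(per_classifier_probs):
    """Average class probabilities over classifiers.

    per_classifier_probs[k][i][c] is classifier k's probability that
    sample i belongs to the class at column c.
    """
    n_classifiers = len(per_classifier_probs)
    return [
        [sum(column) / n_classifiers for column in zip(*sample_rows)]
        for sample_rows in zip(*per_classifier_probs)
    ]


def max_probability_class(probs, classes):
    """Pick, for each sample, the label whose probability is largest."""
    return [
        classes[max(range(len(row)), key=row.__getitem__)]
        for row in probs
    ]


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for one label, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical data: two classifiers, four samples, two classes.
classes = ["A", "B"]
clf1_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.4, 0.6]]
clf2_probs = [[0.7, 0.3], [0.3, 0.7], [0.4, 0.6], [0.7, 0.3]]
y_true = ["A", "A", "B", "B"]

ensemble_probs = average_probabilities([clf1_probs, clf2_probs])
best = max_probability_class(ensemble_probs, classes)
ensemble_scores = precision_recall_f1(y_true, best, "A")

# The same metrics for one individual classifier, for comparison.
clf1_scores = precision_recall_f1(
    y_true, max_probability_class(clf1_probs, classes), "A"
)
```

In practice the metrics would more likely come from `sklearn.metrics` (for example `precision_recall_fscore_support`); the hand-written version is only there to keep the sketch dependency-free.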

stevehadd commented 3 years ago

We should update the code to use the scikit-learn voting classifier: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html

stevehadd commented 3 years ago

It looks like the voting classifier above works slightly differently from what I expected. You can't give it already-trained classifiers and have it calculate the vote; you have to give it the classifier objects and then call fit. This means all the classifiers are trained on the same data, which is different from what we do currently to produce the ensemble. For now we'll stick with the custom implementation and defer proper integration of the code into the scikit-learn framework to issue #55.
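For readers wondering what voting over already-trained classifiers might look like, here is a minimal sketch. It is not the project's custom implementation, which isn't shown in this thread. `PrefitSoftVoter` and `_StubClassifier` are hypothetical names; the only assumption about each member is that it exposes `classes_` and `predict_proba`, as fitted scikit-learn classifiers do.

```python
class PrefitSoftVoter:
    """Average predict_proba over classifiers that are already trained.

    Hypothetical helper: unlike VotingClassifier, nothing is refitted here.
    """

    def __init__(self, fitted_classifiers):
        if not fitted_classifiers:
            raise ValueError("need at least one fitted classifier")
        self.classifiers = list(fitted_classifiers)
        # Union of labels, so members trained on different subsets still align.
        self.classes_ = sorted(
            {label for clf in self.classifiers for label in clf.classes_}
        )

    def predict_proba(self, X):
        column = {label: j for j, label in enumerate(self.classes_)}
        n_members = len(self.classifiers)
        totals = None
        for clf in self.classifiers:
            probs = clf.predict_proba(X)
            if totals is None:
                totals = [[0.0] * len(self.classes_) for _ in probs]
            for row_total, row in zip(totals, probs):
                # Map each member's columns onto the shared label order.
                for label, p in zip(clf.classes_, row):
                    row_total[column[label]] += p / n_members
        return totals

    def predict(self, X):
        return [
            self.classes_[max(range(len(row)), key=row.__getitem__)]
            for row in self.predict_proba(X)
        ]


class _StubClassifier:
    """Stand-in for a trained model that returns fixed probabilities."""

    def __init__(self, classes, probs):
        self.classes_ = classes
        self._probs = probs

    def predict_proba(self, X):
        return self._probs


# Made-up members: the second one has seen a label the first has not.
member_a = _StubClassifier(["A", "B"], [[0.9, 0.1], [0.2, 0.8]])
member_b = _StubClassifier(["A", "B", "C"], [[0.6, 0.3, 0.1], [0.1, 0.7, 0.2]])

voter = PrefitSoftVoter([member_a, member_b])
X = [[0.0], [1.0]]  # placeholder features; the stubs ignore them
predictions = voter.predict(X)
probabilities = voter.predict_proba(X)
```

A label that a member never saw simply contributes zero probability from that member, which is one possible design choice among several.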