The iMeta algorithm for determining the model and manufacturer for a given observation can be described as a decision tree using country, max depth and date of observation. Decision trees are a class of classification algorithm in ML, so an obvious first step beyond the iMeta algorithm is to see whether an ML decision tree algorithm trained on the available XBT data can do better than the iMeta algorithm, which was developed through human analysis of that data.
Things to try:
We should try some of the decision tree classifiers available in scikit-learn with the same features as iMeta: country, year and max_depth.
We should try the current state of the art in DT algorithms, XGBoost (gradient-boosted decision trees), which needs a separate library.
We should then also test whether adding other features (cruise ID, institute, temp quality flag) improves performance.
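As a starting point, the first item could look something like the minimal sketch below. It assumes the XBT data is held in a pandas DataFrame; the data here is synthetic, the column names (`country`, `year`, `max_depth`, `model`) are hypothetical, and the labelling rule and the `probe_A` to `probe_D` labels are placeholders rather than real iMeta rules or probe types.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the XBT data (assumption: real data has similar columns).
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "country": rng.choice(["US", "JP", "AU"], size=n),
    "year": rng.integers(1970, 2010, size=n),
    "max_depth": rng.uniform(100.0, 1900.0, size=n),  # deepest measurement, in metres
})

# Placeholder rule-based labels, loosely imitating an iMeta-style decision tree.
deep = df["max_depth"] > 1000
mid_recent = (df["max_depth"] > 500) & (df["year"] >= 1990)
shallow_jp = (df["max_depth"] <= 500) & (df["country"] == "JP")
df["model"] = np.select(
    [deep, mid_recent, shallow_jp],
    ["probe_C", "probe_B", "probe_D"],
    default="probe_A",
)

features = ["country", "year", "max_depth"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["model"], test_size=0.25, random_state=0, stratify=df["model"]
)

# One-hot encode the categorical country column; pass numeric columns through.
preprocess = ColumnTransformer(
    [("country", OneHotEncoder(handle_unknown="ignore"), ["country"])],
    remainder="passthrough",
)

# Note: the "max_depth" feature is the observation depth, unrelated to the
# tree's own max_depth hyperparameter, which is left at its default here.
classifier = Pipeline([
    ("preprocess", preprocess),
    ("tree", DecisionTreeClassifier(random_state=0)),
])
classifier.fit(X_train, y_train)

accuracy = classifier.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

Keeping the encoder and the classifier in one pipeline means the final step can be swapped for another tree-based estimator, and extra columns such as cruise ID or institute can be added to `features`, without changing the rest of the code. On real data, the held-out score would need to be compared against the iMeta output for the same observations.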