nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

Use a large ensemble of single-label classifiers (so, treat all of the labels independently, ignore the hierarchy, and we have a hundred separate yes/no tasks) and see if this works better than MATCH #74

Open bruffridge opened 3 years ago

bruffridge commented 3 years ago

We are considering ensemble methods as a way to improve the precision and recall of our machine learning component. The motivation is the wisdom of crowds, applied to classifiers: many different classifiers are trained to detect different signals in the same data, and their verdicts are then aggregated through a voting scheme into a single ensemble prediction. One advantage of this approach is that the component classifiers do not each need to be accurate predictors. An ensemble can learn which of its components are more reliable and assign them more weight, with those weights learned in the same way as any other parameters in a machine learning model. For PeTaL, it may be fruitful to explore an ensemble of single-label classifiers, treating each label as an independent yes/no task and ignoring the hierarchy, and to compare the results against MATCH. Each classifier would specialize in predicting one biomimicry function, although a given function may have several classifiers assigned to it.
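To make the idea concrete, here is a minimal sketch of a per-label weighted-vote ensemble, using only the Python standard library. Everything in it is an assumption for illustration and not PeTaL code: the label names (`attach`, `move`), the toy sentences, the helper names, and the keyword-based component classifiers, which stand in for real trained models. The weights are not learned by gradient descent. As a simpler stand-in, each classifier is weighted by its accuracy on a small held-out set.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A component classifier gives a yes/no verdict for one label.
Classifier = Callable[[str], bool]
Example = Tuple[str, bool]  # (text, does the label apply?)


def keyword_classifier(keywords: Sequence[str]) -> Classifier:
    """Hypothetical weak classifier: votes yes if any keyword is a word in the text."""
    wanted = {k.lower() for k in keywords}

    def classify(text: str) -> bool:
        return bool(wanted & set(text.lower().split()))

    return classify


def reliability_weights(classifiers: Sequence[Classifier],
                        examples: Sequence[Example]) -> List[float]:
    """Weight each classifier by its accuracy on held-out examples."""
    return [
        sum(clf(text) == truth for text, truth in examples) / len(examples)
        for clf in classifiers
    ]


def weighted_vote(classifiers: Sequence[Classifier], weights: Sequence[float],
                  text: str, threshold: float = 0.5) -> bool:
    """Say yes when the weighted share of yes votes reaches the threshold."""
    total = sum(weights)
    yes = sum(w for clf, w in zip(classifiers, weights) if clf(text))
    return total > 0 and yes / total >= threshold


def predict_labels(ensembles: Dict[str, Tuple[List[Classifier], List[float]]],
                   text: str) -> List[str]:
    """Run every label's ensemble independently, ignoring any label hierarchy."""
    return sorted(label for label, (clfs, ws) in ensembles.items()
                  if weighted_vote(clfs, ws, text))


# Toy data (assumed). The third "attach" classifier is deliberately unreliable.
attach_clfs = [keyword_classifier(["adhesive", "grip"]),
               keyword_classifier(["gecko"]),
               keyword_classifier(["the"])]
attach_val = [("Gecko feet grip smooth glass", True),
              ("Adhesive pads in tree frogs", True),
              ("The wing shape reduces drag", False),
              ("Fins propel the fish", False)]

move_clfs = [keyword_classifier(["propel", "swim"]),
             keyword_classifier(["wing", "fins"])]
move_val = [("Fins propel the fish", True),
            ("Wing beats let bats swim through air", True),
            ("Gecko feet grip smooth glass", False),
            ("Adhesive pads in tree frogs", False)]

attach_weights = reliability_weights(attach_clfs, attach_val)
move_weights = reliability_weights(move_clfs, move_val)

ensembles = {"attach": (attach_clfs, attach_weights),
             "move": (move_clfs, move_weights)}

print(attach_weights)
print(predict_labels(ensembles, "Gecko toes grip walls while fins propel fish"))
```

In this toy setup the unreliable classifier ends up with zero weight, so its votes no longer affect the `attach` prediction. A real experiment would replace the keyword rules with trained single-label models and could learn the weights jointly, for example by stacking.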