Wikidata / soweego

Link Wikidata items to large catalogs
https://meta.wikimedia.org/wiki/Grants:Project/Hjfocs/soweego_2
GNU General Public License v3.0

Super-confident predictions #305

Closed marfox closed 4 years ago

marfox commented 5 years ago

Different classifiers may capture different relations in the data. We can combine what each classifier learns by building ensembles of them.

Results will then be presented in a separate task.

tupini07 commented 5 years ago

It seems that ensembles are commonly used in record linkage (RL) pipelines to predict whether two entities match. Ensembles are usually composed of diverse models (e.g., SVM + decision trees). A couple of examples:

In [1], a final ensemble is used to self-learn on a partially labelled dataset. In [2], multiple classifiers are used to predict a match, and the final classifier is selected based on its score and on how interpretable the model is.

  1. A novel ensemble learning approach to unsupervised record linkage (2017)
  2. Magellan: Toward Building Entity Matching Management Systems (2016)
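
To make the idea concrete, here is a minimal sketch of two common ways to combine classifier outputs for a single candidate pair: soft voting (average the match probabilities) and hard voting (majority of per-classifier decisions). It is not soweego code and not the method of [1] or [2]; the function names, the 0.5 threshold, and the example scores are all assumptions made for illustration.

```python
from statistics import mean


def soft_vote(match_probabilities, threshold=0.5):
    """Average the match probabilities from several classifiers.

    Returns the averaged score and whether it reaches the threshold.
    The threshold value is an assumption, not a tuned setting.
    """
    score = mean(match_probabilities)
    return score, score >= threshold


def hard_vote(match_probabilities, threshold=0.5):
    """Let each classifier cast a yes/no vote, then take the strict majority."""
    votes = [p >= threshold for p in match_probabilities]
    return sum(votes) > len(votes) / 2


# Hypothetical scores from three classifiers for the same candidate pair.
pair_scores = {"svm": 0.9, "decision_tree": 0.4, "naive_bayes": 0.8}
probabilities = list(pair_scores.values())

score, is_match = soft_vote(probabilities)
majority_match = hard_vote(probabilities)
print(f"soft vote: score={score:.2f}, match={is_match}")
print(f"hard vote: match={majority_match}")
```

The two strategies can disagree: with scores like 0.99, 0.45, 0.45, one very confident classifier pushes the soft-vote average above the threshold, while the majority vote still says no match. That difference may matter when a single model produces overly confident predictions.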