dalab / pboh-entity-linking

Source code for the paper "Probabilistic Bag-Of-Hyperlinks Model for Entity Linking", http://dl.acm.org/citation.cfm?id=2882988

pboh-entity-linking

PBoH Entity Linking system.

Code: beta version

Paper: "Probabilistic Bag-Of-Hyperlinks Model for Entity Linking", Ganea O-E et al. (Proc. WWW 2016), http://dl.acm.org/citation.cfm?id=2882988

Slides, poster, online system and comparison with existing systems: http://people.inf.ethz.ch/ganeao

Newest GERBIL results:

Indexes download link: https://polybox.ethz.ch/index.php/s/IOWjGrU3mjyzDSV . The indexes are required in various places (i.e. wherever a file path contains the prefix '/media/hofmann-scratch/').

Files whose names end in a _part suffix must be concatenated into a single file, named without that suffix, before they can be used. For example, the file anchorsListFromEachWikiPage.txt_dev_index is produced by merging all of its part files (the files whose names begin with anchorsListFromEachWikiPage.txt_dev_index and carry a _part suffix).

The provided indexes are already in the format expected by the index loaders in https://github.com/dalab/pboh-entity-linking/tree/master/src/main/scala/index .

For the evaluation datasets, the comment at the beginning of each file in https://github.com/dalab/pboh-entity-linking/tree/master/src/main/scala/eval/datasets indicates where to get the data. For the AIDA dataset, a sample of the format is shown in https://github.com/dalab/pboh-entity-linking/issues/3 .

Please contact octavian.ganea at inf dot ethz dot ch to receive the required password and for any other questions you might have.
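The concatenation of index part files described above could be scripted roughly as follows. This is only a minimal sketch and is not part of the repository: the function name `merge_parts`, the `part_marker` parameter, and the assumption that sorting the part file names alphabetically yields the correct order are all illustrative guesses. Check the actual file names in the download before relying on it.

```python
import shutil
from pathlib import Path


def merge_parts(directory, base_name, part_marker="_part"):
    """Concatenate all files named base_name + part_marker + <anything> into base_name.

    Hypothetical helper: the marker and the ordering rule are assumptions.
    """
    folder = Path(directory)
    prefix = base_name + part_marker
    # Assumption: alphabetical order of the part names is the intended order.
    parts = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
    if not parts:
        raise FileNotFoundError(f"no files starting with {prefix!r} in {folder}")
    target = folder / base_name
    # Copy in binary mode so the index content is not re-encoded.
    with target.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
    return target
```

For instance, `merge_parts("/path/to/indexes", "anchorsListFromEachWikiPage.txt_dev_index")` would write the merged index next to its parts, where `/path/to/indexes` is a placeholder for the local download directory.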

How to run the code: