Pipeline NER and ED modules

Hello,

I have been trying to build an entity linking pipeline for the CoNLL-2003 dataset using LUKE: first run the NER module to get the entity spans, then run the ED module on those spans to get the corresponding Wikipedia links. In short, I want to feed the output of LUKE's NER model into the ED module.

However, I cannot work out how the dataset file used on the ED example page (https://github.com/studio-ousia/luke/tree/master/examples/entity_disambiguation) was created, and in particular how the candidates were generated. I don't see any code for that in the repository.
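
To make the question concrete, here is a minimal sketch of what I assume a candidate generation step looks like: a lookup from each mention's surface form to Wikipedia titles, ranked by a prior. Everything in it (the `ALIAS_TABLE` contents and counts, the `generate_candidates` name, the span format) is hypothetical and made up for illustration; none of it comes from the LUKE repository.

```python
from collections import Counter

# Hypothetical alias table: lowercased surface form -> Counter of Wikipedia titles.
# In practice this would presumably be built from Wikipedia anchor-text
# statistics; the entries and counts below are invented for illustration.
ALIAS_TABLE = {
    "germany": Counter({"Germany": 90, "Germany national football team": 10}),
    "japan": Counter({"Japan": 95, "Japan national football team": 5}),
}


def generate_candidates(mention, top_k=30):
    """Return up to top_k (title, prior) pairs for a mention, highest prior first."""
    counts = ALIAS_TABLE.get(mention.lower().strip())
    if not counts:
        return []  # unknown mention: no candidates
    total = sum(counts.values())
    return [(title, count / total) for title, count in counts.most_common(top_k)]


# Character-offset spans, standing in for what an NER step might return.
text = "Germany beat Japan in the final."
spans = [(0, 7), (13, 18)]

# Attach a candidate list to every detected span, ready for an ED step.
mentions = [
    {"span": (s, e), "text": text[s:e], "candidates": generate_candidates(text[s:e])}
    for s, e in spans
]
```

Each entry in `mentions` then carries the span, its text, and a ranked candidate list, which is the kind of input I would expect the ED module to need.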

Is there some step that I am missing here? Or do I need some external code to generate the candidates for ED?

Pipeline NER and ED modules #158