openai / deeptype

Code for the paper "DeepType: Multilingual Entity Linking by Neural Type System Evolution"
https://arxiv.org/abs/1802.01021

How to retrieve the wikidata Q-ID of an item using the marisa trie and the offset/value numpy arrays? #40

Closed heisenbugfix closed 6 years ago

heisenbugfix commented 6 years ago

The pre-processed data output consists of a trie and a bunch of numpy arrays containing values and offsets.

For 'human' I get the number 592252, which is not its Q-ID in Wikidata. I tried playing around with the offset and value numpy arrays to retrieve the Q-ID, but wasn't able to. Please let me know how to do this. @JonathanRaiman (Sorry for bugging you again :) )

heisenbugfix commented 6 years ago

I found the solution. The indices obtained from those arrays map to line numbers in wikidata_ids.txt.
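
For anyone landing here later, here is a minimal sketch of that lookup in Python. It assumes that the indices are 0-based line numbers and that wikidata_ids.txt holds one ID per line; neither is confirmed in this thread, so check against an entity you already know. The helper names `load_wikidata_ids` and `index_to_qid` are made up for illustration and are not part of the deeptype code.

```python
from pathlib import Path
from typing import List


def load_wikidata_ids(path: str) -> List[str]:
    """Read the ID file into a list so that list position equals line number.

    Assumption: one ID per line, and indices count lines starting from 0.
    """
    with Path(path).open("r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def index_to_qid(ids: List[str], index: int) -> str:
    """Return the ID stored on the line that `index` points to."""
    if not 0 <= index < len(ids):
        raise IndexError(f"index {index} is outside the {len(ids)} loaded lines")
    return ids[index]


# Hypothetical usage; the path depends on where your pre-processed data lives:
#   ids = load_wikidata_ids("wikidata_ids.txt")
#   print(index_to_qid(ids, 592252))
```

If the result looks off by one for a known entity, the indices are probably 1-based in your copy of the data, in which case use `ids[index - 1]` instead.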