haitian-sun / GraftNet

BSD 2-Clause "Simplified" License

Mapping from KB entity id to the text #7

Closed: shmsw25 closed this issue 5 years ago

shmsw25 commented 5 years ago

Hello, thanks a lot for releasing the code for the paper. It is very helpful. I am wondering where I can find the mapping from a KB entity ID (e.g. "m.16jpgj") to its textual name. I believe it is needed to obtain the GloVe embeddings of KB entities, which are part of the released preprocessed data, but I don't see that code in the preprocessing folder.

I have looked at several different sources, and it looks like people have been using the Freebase dump from the FastRDFStore package. I have downloaded the Freebase dump from that repo, but it looks like 17% of the KB entities in your preprocessed data are missing from it.
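For reference, a coverage check like the one described above could be sketched as follows. This is only a minimal illustration: the function name `missing_fraction` and the entity IDs are made up, and it assumes the needed IDs and the dump's ID-to-name mapping are already loaded into memory.

```python
def missing_fraction(needed_ids, known_ids):
    """Return the fraction of needed_ids that do not appear in known_ids."""
    needed = set(needed_ids)
    if not needed:
        return 0.0
    missing = needed - set(known_ids)
    return len(missing) / len(needed)


# Toy, made-up IDs purely for illustration (not real Freebase data).
needed = ["m.aaa", "m.bbb", "m.ccc", "m.ddd"]
known = {"m.aaa": "Name A", "m.bbb": "Name B", "m.ccc": "Name C"}

print(missing_fraction(needed, known))  # 0.25: one of four IDs has no name
```

Comparing sets rather than lists keeps duplicate IDs in the preprocessed data from inflating the count.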

I would really appreciate it if you could provide a pointer to public data with the mapping, or release that mapping for WebQSP. Thanks!

haitian-sun commented 5 years ago

The preprocessing details are here: http://curtis.ml.cmu.edu/kbir/

Thanks, Haitian

shmsw25 commented 5 years ago

Hi @OceanskySun, thank you for your answer, but I don't think the textual forms of the entities are in the preprocessed data. I see dictionaries with the keys entity_id and text, but text is always identical to entity_id, so the textual forms of the entities are still missing.
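A quick way to confirm this kind of observation might look like the sketch below. It assumes the entries have already been loaded as a list of dictionaries with the keys entity_id and text, as described above; the helper name `count_unnamed` and the sample records are hypothetical.

```python
def count_unnamed(entities):
    """Count entries whose text field just repeats the entity ID."""
    return sum(1 for e in entities if e["text"] == e["entity_id"])


# Made-up records for illustration; the second one has a real name.
entities = [
    {"entity_id": "m.aaa", "text": "m.aaa"},
    {"entity_id": "m.bbb", "text": "Name B"},
    {"entity_id": "m.ccc", "text": "m.ccc"},
]

unnamed = count_unnamed(entities)
print(f"{unnamed} of {len(entities)} entries have no textual name")
```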

haitian-sun commented 5 years ago

Okay. Can you try this one? http://curtis.ml.cmu.edu/datasets/graftnet/entity_names.txt
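Once downloaded, a file like this could be read into a lookup table roughly as follows. The thread does not state the file's format, so this sketch assumes one entity per line with the ID and the name separated by a tab; the function name `load_entity_names` and the sample lines are made up, and the separator may need adjusting for the real file.

```python
import io


def load_entity_names(lines, sep="\t"):
    """Build an ID -> name dict from lines of the ASSUMED form 'id<sep>name'."""
    names = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        entity_id, _, name = line.partition(sep)
        if name:  # skip lines without a separator or without a name
            names[entity_id] = name
    return names


# Made-up sample standing in for the real file contents.
sample = io.StringIO("m.aaa\tName A\nm.bbb\tName B\n\nmalformed line\n")
names = load_entity_names(sample)

# Fall back to the raw ID when no name is available.
print(names.get("m.aaa", "m.aaa"))
print(names.get("m.zzz", "m.zzz"))
```

Passing an open file object instead of the `io.StringIO` sample works the same way, since the function only iterates over lines.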