haseebs / OWE

Pytorch code for An Open-World Extension to Knowledge Graph Completion Models (AAAI 2019)
https://aaai.org/ojs/index.php/AAAI/article/view/4162

entity2wikidata.json file queries #7

Closed mdabedr closed 3 years ago

mdabedr commented 3 years ago

Hi, I was wondering about the entity2wikidata.json file. If I were to use Wikidata as the knowledge graph, what would this file look like? Also, are there any scripts that can be used to generate it?

haseebs commented 3 years ago

The entity2wikidata.json file would have the same structure regardless of the knowledge graph it is based on. For Wikidata, you could use the identifier from the URL (https://www.wikidata.org/wiki/<identifier>) as the unique key for an entity, and populate its label and description from the other appropriate fields.
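The mapping described above might be built like this. Note this is a minimal sketch: the field names ("label", "description") follow the description in this thread, and the exact schema used by the repo may include additional fields.

```python
import json

# Hypothetical example entry: the key "Q42" is the identifier taken from
# https://www.wikidata.org/wiki/Q42; field names are assumptions.
entity2wikidata = {
    "Q42": {
        "label": "Douglas Adams",
        "description": "English author and humourist",
    },
}

with open("entity2wikidata.json", "w") as f:
    json.dump(entity2wikidata, f, indent=2)
```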

mdabedr commented 3 years ago

Thank you for your reply. Do you always assume that a Wikipedia entry exists for an entity, or do some of them have blank descriptions? Also, the current wikipedia2vec binary embeddings are all shared as pkl files, which give the following error when I attempt to load them: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

mdabedr commented 3 years ago

How did you fetch the summaries for an entity that has a Wikipedia page? Does your work assume that every entity has a Wikipedia page?

haseebs commented 3 years ago

Not all entities have a description. In that case, we only use the words in the entity's name.
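The fallback described above could be sketched as follows. The function and field names here are hypothetical, not taken from the repo:

```python
def entity_words(entry):
    """Return the words representing an entity: the words of its name,
    plus the words of its description when one exists."""
    words = entry["label"].split()
    if entry.get("description"):
        words += entry["description"].split()
    return words

# An entity with no description falls back to its name alone.
entity_words({"label": "Douglas Adams", "description": None})
```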

> Also, the current wikipedia2vec binary embeddings are all shared as pkl files which upon attempting to load gives the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Do you still have this problem? If so, try using Python v3.8.3 and opening the file in binary mode ("rb") before passing it to pickle. The pkl files work for me.
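For reference, here is a minimal round-trip sketch of binary-mode loading. Pickle files are binary, so they must be opened with "rb"; opening them in text mode raises exactly the UnicodeDecodeError quoted above. The filename and contents here are placeholders, not the actual wikipedia2vec files:

```python
import pickle

# Write a placeholder embeddings dict, then load it back.
embeddings = {"Q42": [0.1, 0.2, 0.3]}
with open("embeddings.pkl", "wb") as f:
    pickle.dump(embeddings, f)

# Binary mode ("rb") is required for pickle files.
with open("embeddings.pkl", "rb") as f:
    loaded = pickle.load(f)
```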