facebookresearch / kbc

Tools for state of the art Knowledge Base Completion.

Triple Indices to Triple Labels #17

Closed gutihernandez closed 4 years ago

gutihernandez commented 4 years ago

Hi! I have a question about the datasets. After running python kbc/process_datasets.py and downloading the datasets, I noticed that the triples are stored in "index" format, for example: 2431 89 5452.

Where can I find a mapping from each of the indices this repository uses to its true label (e.g. /m/07l450 /film/film/genre /m/082gq)?

timlacroix commented 4 years ago

Hi,

Yeah, sorry: I uploaded these datasets at a time when the original datasets were unavailable, so all I had was my processed version of them. Since I didn't need the original mappings, that was enough for me.

You can download the original train / valid / test files from https://everest.hds.utc.fr/lib/exe/fetch.php?media=en:fb15k.tgz and put them in the src_data/FB15K folder before running process_datasets.

This will yield the correct rel_id / ent_id files.
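Once those files exist, turning an index triple back into labels could look roughly like the sketch below. It assumes each line of ent_id / rel_id holds a label and its integer index separated by a tab, and that triples are ordered (head, relation, tail); both are assumptions to check against the files you actually get. The helper names load_id_to_label and decode_triple are made up for this example and are not part of kbc.

```python
from pathlib import Path


def load_id_to_label(path):
    """Read a two-column, tab-separated file and return {index: label}.

    Assumed line layout: <label><TAB><integer index>.
    """
    id_to_label = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so labels containing tabs still parse.
        label, idx = line.rsplit("\t", 1)
        id_to_label[int(idx)] = label
    return id_to_label


def decode_triple(triple, ent, rel):
    """Map an index triple (head, relation, tail) back to its labels."""
    head, relation, tail = triple
    return ent[head], rel[relation], ent[tail]
```

With ent = load_id_to_label(".../ent_id") and rel = load_id_to_label(".../rel_id"), a call such as decode_triple((2431, 89, 5452), ent, rel) would return the three labels for that triple.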

If you've already run process_datasets before, you'll have to remove the folder pkg_resources.resource_filename('kbc', 'data/FB15K') to force the dataset to be re-processed (or just use another name).
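Removing that folder could be scripted along these lines. This is only a sketch: kbc_data_dir and remove_processed are illustrative names, and kbc_data_dir only works where setuptools (for pkg_resources) and the kbc package are installed.

```python
import shutil
from pathlib import Path


def kbc_data_dir(name):
    """Locate the processed-data folder for a dataset, e.g. name="FB15K"."""
    # Imported lazily: needs setuptools and an installed kbc package.
    import pkg_resources
    return Path(pkg_resources.resource_filename("kbc", f"data/{name}"))


def remove_processed(folder):
    """Delete a processed-dataset folder; return True if it was removed."""
    folder = Path(folder)
    if not folder.is_dir():
        return False  # nothing to remove, so processing was never cached here
    shutil.rmtree(folder)
    return True
```

Calling remove_processed(kbc_data_dir("FB15K")) and then running process_datasets again should rebuild the folder from the files in src_data/FB15K.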

gutihernandez commented 4 years ago

Perfect! Thank you for quickly responding :)

Could you also share the mappings for the other datasets as well, please? I tried the same link with fb15k.tgz changed to wn18.tgz, but apparently the file is named something else or is hosted somewhere else.

timlacroix commented 4 years ago

wordnet: https://everest.hds.utc.fr/lib/exe/fetch.php?media=en:wordnet-mlj12.tar.gz
fb237: https://www.microsoft.com/en-us/download/details.aspx?id=52312
wn18rr: https://github.com/TimDettmers/ConvE/blob/master/WN18RR.tar.gz
yago3-10: https://github.com/TimDettmers/ConvE/blob/master/YAGO3-10.tar.gz

gutihernandez commented 4 years ago

Thank you very much @timlacroix!