thunlp / KB2E

Knowledge Graph Embeddings including TransE, TransH, TransR and PTransE
MIT License

PTransE how are paths constructed and how can you re-create them? #63

Closed darvid7 closed 6 years ago

darvid7 commented 6 years ago

Hi! I ran PCRA.py and got this as part of the output of train_pra_sample.txt

/m/015qsq /m/02bjrlw 935
4 2 748 1327 0.375 2 748 59 0.1875 1 935 0.25 2 748 2525 0.1875

My understanding is that this record has the entities /m/015qsq and /m/02bjrlw and a single relation id, 935. The first item on the second line is the number of relation paths joining /m/015qsq and /m/02bjrlw, which is 4.

Each following section describes one relation path joining /m/015qsq and /m/02bjrlw: the number of relations on the path, then the relation ids, then a confidence value. For example, 2 748 1327 0.375 means that this relation path has two relations (748, 1327) and a confidence of 0.375.
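To make that reading of the second line concrete, here is a minimal Python sketch that splits it into (relation ids, confidence) pairs. It assumes the layout described above (path count, then for each path its length, its relation ids and a confidence); `parse_path_line` is a hypothetical helper name and is not part of PCRA.py.

```python
from typing import List, Tuple


def parse_path_line(line: str) -> List[Tuple[List[int], float]]:
    """Parse a path line, assuming the layout:
    <num_paths> then, per path, <length> <relation ids...> <confidence>.
    This layout is inferred from the sample above, not from PCRA.py itself.
    """
    tokens = line.split()
    num_paths = int(tokens[0])
    paths = []
    pos = 1
    for _ in range(num_paths):
        length = int(tokens[pos])
        relations = [int(t) for t in tokens[pos + 1:pos + 1 + length]]
        confidence = float(tokens[pos + 1 + length])
        paths.append((relations, confidence))
        # Skip the length field, the relation ids and the confidence.
        pos += length + 2
    return paths


sample = "4 2 748 1327 0.375 2 748 59 0.1875 1 935 0.25 2 748 2525 0.1875"
for relations, confidence in parse_path_line(sample):
    print(relations, confidence)
```

Under this reading the sample line yields four paths, one of which is the direct relation 935 on its own.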

This would mean that the following path exists in the FB15K dataset:

/m/015qsq -relation with id 748-> missing entity -relation with id 1327-> /m/02bjrlw, which is the same as /m/015qsq -/film/film/country-> missing entity -/location/country/official_language-> /m/02bjrlw

from relation2id.txt

/film/film/country  748
/location/country/official_language 1327

I am trying to construct the path, i.e. find the missing entity. However, when investigating the test.txt, train.txt and valid.txt triple files, I could not find a pair of triples of the form

/m/015qsq missing entity /film/film/country and missing entity /m/02bjrlw /location/country/official_language

Note: in test.txt, train.txt and valid.txt the format is head \t tail \t relation.

I searched using the command grep -e "/m/015qsq\t/m/.*\t/film/film/language" test.txt train.txt valid.txt, which outputs the following:

train.txt:/m/015qsq /m/02bjrlw  /film/film/language
train.txt:/m/015qsq /m/02h40lc  /film/film/language
train.txt:/m/015qsq /m/06nm1    /film/film/language

So our candidates for the missing entity are /m/02bjrlw, /m/02h40lc and /m/06nm1.

To find which one is the bridging entity, I ran the following, but there were no matches.

grep -e "/m/02bjrlw\t/m/02bjrlw\t/location/country/official_language" test.txt train.txt valid.txt
grep -e "/m/02bjrlw\t/m/02h40lc\t/location/country/official_language" test.txt train.txt valid.txt
grep -e "/m/02bjrlw\t/m/06nm1\t/location/country/official_language" test.txt train.txt valid.txt

Am I missing something? How can I construct the paths based on the entities and relation paths produced by PCRA?

Thanks for your time.

Mrlyk423 commented 6 years ago

You should use grep -e "/m/015qsq\t/m/.*\t/film/film/country", not grep -e "/m/015qsq\t/m/.*\t/film/film/language". The path (748, 1327) is in train.txt as "/m/015qsq-->/film/film/country-->/m/03rjj-->/location/country/official_language-->/m/02bjrlw".
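The same lookup can be sketched in Python instead of chained grep calls. This is only an illustration under stated assumptions: it takes the head \t tail \t relation line format mentioned earlier in the thread, it follows forward edges only, and `index_triples` and `bridging_entities` are hypothetical helper names, not functions from this repository. The demo triples are the ones quoted in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def index_triples(lines: Iterable[str]) -> Dict[Tuple[str, str], Set[str]]:
    """Map (head, relation) -> set of tails for 'head\\ttail\\trelation' lines."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        head, tail, relation = parts
        index[(head, relation)].add(tail)
    return index


def bridging_entities(index, head: str, tail: str,
                      relations: List[str]) -> List[List[str]]:
    """Return every entity chain from head to tail that follows `relations` in order."""
    chains = [[head]]
    for relation in relations:
        # Extend each partial chain by every tail reachable via this relation.
        chains = [chain + [nxt]
                  for chain in chains
                  for nxt in sorted(index.get((chain[-1], relation), ()))]
    return [chain for chain in chains if chain[-1] == tail]


# Triples quoted in this thread; in practice, pass open("train.txt") instead.
demo = [
    "/m/015qsq\t/m/03rjj\t/film/film/country",
    "/m/03rjj\t/m/02bjrlw\t/location/country/official_language",
    "/m/015qsq\t/m/02bjrlw\t/film/film/language",
]
index = index_triples(demo)
found = bridging_entities(
    index, "/m/015qsq", "/m/02bjrlw",
    ["/film/film/country", "/location/country/official_language"],
)
print(found)
```

Indexing by (head, relation) keeps each hop a dictionary lookup, so longer paths only need more entries in the relations list.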

darvid7 commented 6 years ago

Ah, thank you so much! Thank you for your time! Silly mistake on my part.