@posenhuang I think the biggest difference between the results in Tables 3 & 4 and the previously published results is that most of the previous evaluations take both subject and object rankings into account, while MINERVA is evaluated only on object rankings.
This seems fine to me, but it also explains why they had to re-evaluate most of the existing models themselves.
Thanks @todpole3 for the answer. Do you know how I can change the setup to run both subject and object rankings? Is it done by including both directions in train.txt and test.txt?
Hi Po-Sen,
Given an (e1, r), MINERVA predicts a possible e2. Most embedding models calculate a score for a triple (e1, r, e2). We compare these methods only on the task of tail prediction, i.e., predicting e2 given (e1, r). The baselines (DistMult, ComplEx, ConvE, et al.) perform both head prediction (predict e1 given r, e2) and tail prediction and average the scores, so we re-ran their experiments for only their tail-prediction scores.
A simple hack to run MINERVA for head prediction would be to just reorder the columns in the dataset. The current data format is e1, r, e2. If reordered to e2, r, e1, MINERVA would perform head prediction.
I hope this helps! -S
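Concretely, the reordering could be done with something like this (just a sketch, assuming the usual tab-separated triple files; the file names are placeholders):

```python
# Sketch: turn a tail-prediction file into a head-prediction one by swapping
# the subject and object columns. Assumes tab-separated lines "e1<TAB>r<TAB>e2";
# "test.txt" / "test_head.txt" are placeholder file names.
with open("test.txt") as fin, open("test_head.txt", "w") as fout:
    for line in fin:
        e1, r, e2 = line.strip().split("\t")
        # Reorder e1, r, e2 -> e2, r, e1 so MINERVA starts from e2
        # and searches for e1 (head prediction).
        fout.write(f"{e2}\t{r}\t{e1}\n")
```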
Hi @shehzaadzd, the previous baselines train on both head prediction and tail prediction. When they run evaluations, they average both results on the test set.
To have a fair comparison, should I train on graph.txt (containing both directions) and test on both head prediction and tail prediction (then average the results)?
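To be concrete, by "average the results" I mean the usual protocol, roughly like this (just a sketch with placeholder ranks; the metrics follow the standard MRR / Hits@10 definitions):

```python
# Sketch of the usual averaging: compute MRR and Hits@10 separately over the
# head-prediction and tail-prediction test queries, then average the two.
def mrr(ranks):
    """Mean reciprocal rank over a list of (filtered) ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, k=10):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Placeholder ranks; in practice these come from the model's evaluation run.
head_ranks = [1, 4, 120]  # rank of the true e1 for each (?, r, e2) query
tail_ranks = [2, 1, 15]   # rank of the true e2 for each (e1, r, ?) query

avg_mrr = (mrr(head_ranks) + mrr(tail_ranks)) / 2
avg_hits10 = (hits_at(head_ranks, 10) + hits_at(tail_ranks, 10)) / 2
```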
Hi @posenhuang,
Yes, as @todpole3 pointed out, MINERVA starts from the entity in the question and searches to find the correct answer. This is akin to doing tail prediction. Yes, previous work reports the average of head and tail prediction, but the results we report in our paper are tail-prediction results, and hence they are comparable. Would that work for you?
Also, I am not sure head prediction always makes complete sense. For example, a query of the form (person X, lives_in, city Y) wouldn't make sense if it is inverted, since asking which person lives in city Y has many valid answers. Essentially, I am not sure the notion of finding a "reasoning path" for the inverted triple holds anymore. I would like to know what you think.
Regarding your last point, I am not sure training on graph.txt (which contains edges in both directions) would make it comparable, because of the above-mentioned reason. Previous models define a score function between an entity pair and a relation, but MINERVA is kind of different that way. I am unsure whether augmenting the training data with inverted edges would make the training easier and also comparable.
Thanks for answering the questions. I think there are also 1-to-many relations in tail prediction (such as company -> employs -> person), though.
The current data format is e1, r, e2. If reordered to e2, r, e1, MINERVA would perform head prediction.
You mean e2, r^-1, e1?
Hi @rajarshd, @shehzaadzd,
For all the baseline methods, do you train on train.txt or on both train.txt and graph.txt? Do you have the settings to reproduce the results? (e.g., the Neural LP result is different from the one in their paper for FB15k-237.) Thanks!