thunlp / Fast-TransX

An Efficient implementation of TransE and its extended models for Knowledge Representation Learning
MIT License

Validation file #40

niladri-paul closed 6 years ago

niladri-paul commented 6 years ago

I am confused about how to generate a validation file from the given training and test files. Is there any documentation or code for that? Also, is test_transR.cpp the code that can predict the missing entity in a triplet?

ProKil commented 6 years ago
  1. To generate a validation file, you can randomly split the training data into a training part and a validation part. We commonly tune parameters with k-fold cross-validation or with a few random splits.

  2. Yes.

-Hao
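The random split described above can be sketched as follows. This is only a minimal illustration, not code from the repository: the `Triple` struct, the `splitTrainValid` name, the (head, tail, relation) field order, and the seed parameter are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// One id triple. The (head, tail, relation) field order is an assumption.
struct Triple { long h, t, r; };

// Hypothetical helper: shuffle a copy of the triples and hold out the last
// `validFraction` of them. Returns {training part, validation part}.
std::pair<std::vector<Triple>, std::vector<Triple>>
splitTrainValid(std::vector<Triple> all, double validFraction, unsigned seed) {
    assert(validFraction >= 0.0 && validFraction <= 1.0);
    std::mt19937 rng(seed);                      // fixed seed makes the split repeatable
    std::shuffle(all.begin(), all.end(), rng);
    std::size_t validCount = static_cast<std::size_t>(all.size() * validFraction);
    std::size_t trainCount = all.size() - validCount;
    std::vector<Triple> train(all.begin(), all.begin() + trainCount);
    std::vector<Triple> valid(all.begin() + trainCount, all.end());
    return {train, valid};
}
```

Calling `splitTrainValid(triples, 0.1, 42u)` would hold out roughly 10% of the triples for validation. For k-fold cross-validation, the same shuffled list could instead be cut into k equal slices, each used once as the validation part.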

niladri-paul commented 6 years ago

Dear Hao, thanks for the comment. I am still confused about part 2 of my question. The test_transR.cpp file runs fine and gives me some output that looks like the mean rank. What I want is the following: I have a set of triplets in my test2id.txt file, and I just want to know which of them are most probable (that is, "valid" triplets) and which are not. Kindly let me know how to achieve this using test_transR.cpp.

THUCSTHanxu13 commented 6 years ago

You can use the following function to compute scores for your triples. The score function is |h+r-t|, so a lower score means a more plausible triple. You can use these scores to decide which triples are most probable or valid.

    float calc_sum(long e1, long e2, long rel) {
        float res = 0;
        // Start offsets of head, tail and relation in the flat arrays
        long last1 = e1 * relationTotal * dimensionR + rel * dimensionR;
        long last2 = e2 * relationTotal * dimensionR + rel * dimensionR;
        long lastr = rel * dimensionR;
        // Accumulate the L1 distance |h + r - t|
        for (long i = 0; i < dimensionR; i++)
            res += fabs(entityRelVec[last1 + i] + relationVec[lastr + i] - entityRelVec[last2 + i]);
        return res;
    }

THUCSTHanxu13 commented 6 years ago

This function has been included in the test code.
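To turn such scores into a "most probable first" ordering, one option is to score every test triple and sort in ascending order. The sketch below is self-contained and does not use the repository's globals: `ScoredTriple`, `l1Score`, and `rankByScore` are hypothetical names, and the vectors passed to `l1Score` are assumed to already be in the relation's space.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A triple together with its score (hypothetical type for this example).
struct ScoredTriple { long h, t, r; float score; };

// L1 distance |h + r - t|. All three vectors must have the same length and
// are assumed to be expressed in the same (relation) space.
float l1Score(const std::vector<float>& h,
              const std::vector<float>& r,
              const std::vector<float>& t) {
    assert(h.size() == r.size() && r.size() == t.size());
    float res = 0;
    for (std::size_t i = 0; i < h.size(); ++i)
        res += std::fabs(h[i] + r[i] - t[i]);
    return res;
}

// Sort so that the lowest score (most plausible triple) comes first.
void rankByScore(std::vector<ScoredTriple>& triples) {
    std::sort(triples.begin(), triples.end(),
              [](const ScoredTriple& a, const ScoredTriple& b) {
                  return a.score < b.score;
              });
}
```

If a hard valid/invalid decision is needed rather than a ranking, a common approach is to pick a score threshold on the validation set and accept triples whose score falls below it. That threshold is not something the thread specifies.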