mana-ysh / knowledge-graph-embeddings

Implementations of Embedding-based methods for Knowledge Base Completion tasks
Apache License 2.0

Filtered metrics are all zero #1

Closed CicoZhang closed 6 years ago

CicoZhang commented 6 years ago

I am running the model using the data in this repo, and the filtered metrics in the results are all zero. Is that normal? Did I make any mistake in running the script? Thanks in advance.

The code to train the model

%run train.py --mode single \
--ent ../dat/wordnet-mlj12/ent.txt \
--rel ../dat/wordnet-mlj12/rel.txt \
--train ../dat/wordnet-mlj12/wordnet-mlj12-train-corrected.txt \
--valid ../dat/wordnet-mlj12/wordnet-mlj12-valid-corrected.txt \
--method distmult --mode pairwise --epoch 300 --batch 10 --lr 0.1 \
--l2_reg 0.0001 --dim 100 --opt adagrad \
--log wordnet-distmult/

The code to run the model over the test dataset.

%run test.py --ent ../dat/wordnet-mlj12/ent.txt \
--rel ../dat/wordnet-mlj12/rel.txt \
--data ../dat/wordnet-mlj12/wordnet-mlj12-test-corrected.txt \
--method distmult \
--model "wordnet-distmult/DistMult.best" 

The output:

loading model...
Hits@1              : 0.3976
Hits@1(filter)      : 0.0
Hits@10             : 0.8054
Hits@10(filter)     : 0.0
Hits@3              : 0.6436
Hits@3(filter)      : 0.0
MRR                 : 0.5419057892992648
MRR(filter)         : 0.0
mana-ysh commented 6 years ago

Please add the arguments --graphall and --filtered for filtered evaluation at test time (sorry for not describing this).

%run test.py --ent ../dat/wordnet-mlj12/ent.txt \
--rel ../dat/wordnet-mlj12/rel.txt \
--data ../dat/wordnet-mlj12/wordnet-mlj12-test-corrected.txt \
--method distmult \
--model "wordnet-distmult/DistMult.best" \
--graphall ../dat/wordnet-mlj12/whole.txt \
--filtered

This command should work. whole.txt contains all the triplet data (it is just the concatenation of the train/dev/test files), so please build it from your own data.
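For reference, a minimal sketch of building whole.txt as described above (the helper `build_whole` is hypothetical, not part of this repo; it only concatenates the triplet files):

```python
def build_whole(triple_paths, out_path):
    """Concatenate triplet files into a single file for --graphall."""
    with open(out_path, "w") as out:
        for path in triple_paths:
            with open(path) as f:
                out.write(f.read())

# For the WordNet data used in this thread, that would be:
# build_whole(
#     ["../dat/wordnet-mlj12/wordnet-mlj12-train-corrected.txt",
#      "../dat/wordnet-mlj12/wordnet-mlj12-valid-corrected.txt",
#      "../dat/wordnet-mlj12/wordnet-mlj12-test-corrected.txt"],
#     "../dat/wordnet-mlj12/whole.txt",
# )
```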

CicoZhang commented 6 years ago

Thank you for pointing it out. It now works as expected!

zhouhoo commented 6 years ago

I trained on my own data with the following hyperparameters:

mode : single
epoch : 500
batch : 128
lr : 0.05
dim : 200
negative : 10
l2_reg : 0.0001
gradclip : 5

A sample of my training data looks like this:

36401 AtLocation 17936
53973 AtLocation 117291
29419 AtLocation 78437
17776 AtLocation 133677
57301 AtLocation 14850

But the Hits metrics in the test results are all zero:

Hits@1              : 0
Hits@1(filter)      : 0
Hits@10             : 0
Hits@10(filter)     : 0
Hits@3              : 0
Hits@3(filter)      : 0
MRR                 : 0.00111741447482
MRR(filter)         : 0.00163314423242

I also noticed that loading the vocabulary takes very long, and loading the whole graph at test time is also very slow. Could that be related to the all-zero problem? Could you help me out? Thanks.

mana-ysh commented 6 years ago

Hi @zhouhoo

First, thank you for pointing out the long loading time; I will fix it in the future. To answer your question, could you tell me the commands you ran for training and testing?

zhouhoo commented 6 years ago

@mana-ysh sorry, I was in too much of a rush. I ran it as the readme describes.

For training (with the default hyperparameters listed above; I did not pass them explicitly):

python train.py --ent ../dat/iel/nodes_id --rel ../dat/iel/rels --train ../dat/iel/train --valid ../dat/iel/validation

For testing:

python test.py --ent ../dat/FB15k/train.entlist --rel ../dat/FB15k/train.rellist --data ../dat/iel/test --filtered --graphall ../dat/iel/whole --method complex --model 20171219_21\:29/ComplEx.best

mana-ysh commented 6 years ago

@zhouhoo

Thank you for sharing them! At test time, the --ent and --rel arguments must point to the same entity/relation lists used in training. Can you try this?

python test.py --ent ../dat/iel/nodes_id --rel ../dat/iel/rels --data ../dat/iel/test --filtered --graphall ../dat/iel/whole --method complex --model 20171219_21:29/ComplEx.best
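As a side note, a quick sanity check for this kind of mismatch might look like the sketch below (the helper `missing_vocab` is hypothetical, not part of this repo; it assumes one identifier per line in the --ent/--rel lists and whitespace-separated head/relation/tail triplets, as in the samples in this thread):

```python
def missing_vocab(triple_lines, ent_lines, rel_lines):
    """Return identifiers used in the triplets but absent from the lists."""
    ents = {l.strip() for l in ent_lines if l.strip()}
    rels = {l.strip() for l in rel_lines if l.strip()}
    missing = set()
    for line in triple_lines:
        head, rel, tail = line.split()
        missing.update({head, tail} - ents)
        if rel not in rels:
            missing.add(rel)
    return missing
```

If this returns a non-empty set for the test file against the training lists, the evaluation is being scored against the wrong vocabulary.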
zhouhoo commented 6 years ago

@mana-ysh oh, I posted the wrong command. I trained on both Freebase and my own data, and I am sure the --ent and --rel at test time are the same as in training. For training:

python train.py --ent ../dat/iel/nodes_id --rel ../dat/iel/rels --train ../dat/iel/train --valid ../dat/iel/validation

For testing:

python test.py --ent ../dat/iel/nodes_id --rel ../dat/iel/rels --data ../dat/iel/validation --filtered --graphall ../dat/iel/whole --method complex --model 20171219_21\:29/ComplEx.best

mana-ysh commented 6 years ago

@zhouhoo

OK. One possible reason is that the optimization failed; please try Adagrad as the optimizer (I'm sorry I forgot to write this in the readme). Still, I think the results are too poor for that to be the only cause. Could I check the output log file? I want to see the values of the evaluation metric on the validation data.

zhouhoo commented 6 years ago

@mana-ysh thanks, I will try Adagrad. But they should not all be zero, right? A sample of the log file:

2017-12-19 21:29:05,062 INFO Arguments...
2017-12-19 21:29:05,062 INFO metric -----> mrr
2017-12-19 21:29:05,062 INFO negative -----> 10
2017-12-19 21:29:05,063 INFO l2_reg -----> 0.0001
2017-12-19 21:29:05,063 INFO log -----> /home/june/IdeaProjects/knowledge-graph-embeddings/src/20171219_21:29
2017-12-19 21:29:05,063 INFO graphall -----> None
2017-12-19 21:29:05,063 INFO epoch -----> 500
2017-12-19 21:29:05,063 INFO valid -----> ../dat/iel/validation
2017-12-19 21:29:05,064 INFO ent -----> ../dat/iel/nodes_id
2017-12-19 21:29:05,064 INFO rel -----> ../dat/iel/rels
2017-12-19 21:29:05,064 INFO filtered -----> False
2017-12-19 21:29:05,064 INFO method -----> complex
2017-12-19 21:29:05,064 INFO save_step -----> 100
2017-12-19 21:29:05,064 INFO opt -----> sgd
2017-12-19 21:29:05,065 INFO gradclip -----> -5
2017-12-19 21:29:05,065 INFO train -----> ../dat/iel/train
2017-12-19 21:29:05,065 INFO nbest -----> None
2017-12-19 21:29:05,065 INFO dim -----> 200
2017-12-19 21:29:05,065 INFO cp_ratio -----> 0.5
2017-12-19 21:29:05,065 INFO batch -----> 128
2017-12-19 21:29:05,065 INFO lr -----> 0.002
2017-12-19 21:29:05,065 INFO mode -----> single
2017-12-19 21:29:05,065 INFO margin -----> 1.0
2017-12-19 21:31:36,578 INFO preparing data...
2017-12-19 21:31:37,351 INFO building model...
2017-12-19 21:31:38,066 INFO setup trainer...
2017-12-19 21:31:38,066 INFO start 1 epoch
2017-12-19 21:35:09,526 INFO evaluation metric in 1 epoch: 0.0
2017-12-19 21:35:09,526 INFO evaluation time in 1 epoch: 46.6910860538
2017-12-19 21:35:09,526 INFO < Current Best metric: 0.0 (1 epoch) >
2017-12-19 21:35:09,623 INFO training loss in 1 epoch: 1895.80181681
2017-12-19 21:35:09,624 INFO training time in 1 epoch: 211.557569027
2017-12-19 21:35:09,624 INFO start 2 epoch
...

2017-12-21 01:33:41,113 INFO training time in 496 epoch: 201.107445955
2017-12-21 01:33:41,113 INFO start 497 epoch
2017-12-21 01:37:02,495 INFO evaluation metric in 497 epoch: 0.0003
2017-12-21 01:37:02,495 INFO evaluation time in 497 epoch: 46.8026819229
2017-12-21 01:37:02,495 INFO < Current Best metric: 0.0013 (77 epoch) >
2017-12-21 01:37:02,495 INFO training loss in 497 epoch: 12.7899311581
2017-12-21 01:37:02,495 INFO training time in 497 epoch: 201.382302999
2017-12-21 01:37:02,495 INFO start 498 epoch
2017-12-21 01:40:23,089 INFO evaluation metric in 498 epoch: 0.0004
2017-12-21 01:40:23,089 INFO evaluation time in 498 epoch: 45.9419050217
2017-12-21 01:40:23,090 INFO < Current Best metric: 0.0013 (77 epoch) >
2017-12-21 01:40:23,090 INFO training loss in 498 epoch: 12.8672754978
2017-12-21 01:40:23,090 INFO training time in 498 epoch: 200.594673872
2017-12-21 01:40:23,090 INFO start 499 epoch
2017-12-21 01:43:43,266 INFO evaluation metric in 499 epoch: 0.0003
2017-12-21 01:43:43,266 INFO evaluation time in 499 epoch: 45.6540539265
2017-12-21 01:43:43,266 INFO < Current Best metric: 0.0013 (77 epoch) >
2017-12-21 01:43:43,266 INFO training loss in 499 epoch: 12.9704255703
2017-12-21 01:43:43,266 INFO training time in 499 epoch: 200.176346064
2017-12-21 01:43:43,266 INFO start 500 epoch
2017-12-21 01:47:04,486 INFO evaluation metric in 500 epoch: 0.0004
2017-12-21 01:47:04,487 INFO evaluation time in 500 epoch: 45.9023690224
2017-12-21 01:47:04,487 INFO < Current Best metric: 0.0013 (77 epoch) >
2017-12-21 01:47:10,554 INFO training loss in 500 epoch: 12.5791623812
2017-12-21 01:47:10,554 INFO training time in 500 epoch: 207.287909985
2017-12-21 01:47:12,624 INFO ===== Best metric: 0.0013 (77 epoch) =====
2017-12-21 01:47:12,624 INFO done all

The training loss does actually go down.

log.txt

mana-ysh commented 6 years ago

@zhouhoo

Thanks. When the MRR is around 0.001, Hits@{1, 3, 10} can all be zero, because an MRR of 0.001 means the correct triplets are ranked around 1000 on average. So if you evaluate with Hits@1000 or so, you should get a non-zero value.
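A toy sketch of that arithmetic (the ranks below are made up for illustration): with correct triplets ranked around 1000, the reciprocal ranks average to about 0.001, while none of them fall within the top 10.

```python
# Hypothetical ranks of the correct triplets among all candidates.
ranks = [1000, 900, 1100, 950, 1050]

mrr = sum(1.0 / r for r in ranks) / len(ranks)
hits_at_10 = sum(r <= 10 for r in ranks) / len(ranks)
hits_at_1000 = sum(r <= 1000 for r in ranks) / len(ranks)

print(mrr)          # roughly 0.001
print(hits_at_10)   # 0.0 -- no correct triplet in the top 10
print(hits_at_1000) # non-zero once k reaches the typical rank
```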

zhouhoo commented 6 years ago

@mana-ysh thank you so much. So the ComplEx results are simply bad; maybe I should try another model and tune the hyperparameters.