uclnlp / cqd

Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs
MIT License

2u/up queries reproduction with CQD @ KGReasoning #7

Closed migalkin closed 3 years ago

migalkin commented 3 years ago

Since there is no issue board at https://github.com/pminervini/KGReasoning I thought I could write it here and tag @pminervini 😃

I'm trying to run CQD CO and Beam on the BetaE version of FB15k-237 and NELL-995 datasets using that repo, but for some reason, the numbers for union queries are very low.

After downloading the pre-trained models (fb15k-237-betae and nell-betae, respectively), I'm using the following commands:

python main.py -cuda --do_test --data_path FB15k-237-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/fb15k-237-betae -d 1000
python main.py -cuda --do_test --data_path NELL-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/nell-betae -d 1000

The other hyperparameters are left at their defaults (there is no info on when to use --cqd-sigmoid-scores or --cqd-normalize-scores, so I presume they should be turned off).

The numbers for 2u/up FB15k-237:

Test 2u-DNF MRR at step 99999: 0.005257
Test 2u-DNF HITS1 at step 99999: 0.001895
Test 2u-DNF HITS3 at step 99999: 0.004898
Test 2u-DNF HITS10 at step 99999: 0.010378
Test 2u-DNF num_queries at step 99999: 5000.000000
Test up-DNF MRR at step 99999: 0.016857
Test up-DNF HITS1 at step 99999: 0.005590
Test up-DNF HITS3 at step 99999: 0.014344
Test up-DNF HITS10 at step 99999: 0.033338
Test up-DNF num_queries at step 99999: 5000.000000

And for NELL:

Test 2u-DNF MRR at step 99999: 0.007676
Test 2u-DNF HITS1 at step 99999: 0.004144
Test 2u-DNF HITS3 at step 99999: 0.006924
Test 2u-DNF HITS10 at step 99999: 0.014262
Test 2u-DNF num_queries at step 99999: 4000.000000
Test up-DNF MRR at step 99999: 0.023296
Test up-DNF HITS1 at step 99999: 0.010295
Test up-DNF HITS3 at step 99999: 0.022723
Test up-DNF HITS10 at step 99999: 0.045247
Test up-DNF num_queries at step 99999: 4000.000000
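
For reference, MRR and HITS@k in these logs are rank-based metrics. Below is a minimal sketch of the standard definitions, assuming each query answer has already been assigned a 1-based rank; it is not the repository's evaluation code, and the function names are made up for illustration.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all answers (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of answers ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Illustrative ranks only, not taken from any dataset.
ranks = [1, 2, 4]
print(mean_reciprocal_rank(ranks))
print(hits_at_k(ranks, 1), hits_at_k(ranks, 3), hits_at_k(ranks, 10))
```

An MRR of about 0.005, as in the 2u-DNF numbers above, would correspond to correct answers sitting around rank 200 on average in reciprocal terms, which is why these values look suspicious.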

Is there anything missing, or are those the expected numbers for the BetaE datasets?

P.S. Would be good to have an example of how to properly run CQD with KGReasoning in the example.sh :)

pminervini commented 3 years ago

Hello @migalkin, yes indeed!! I will set up a document on how to get the best possible results by Monday!

Here's how to get nicer numbers on FB15k-237 and FB15k with the beam setting -- @dfdazac, could you comment about the co setting?

Here's for up:

PYTHONPATH=. python3 main.py --do_valid --do_test --data_path data/FB15k-237-betae -n 1 -b 1000 -d 1000 --cpu_num 0 --geo cqd --tasks up --print_on_screen --test_batch_size 1 --checkpoint_path models/fb15k-237-betae --cqd discrete --cqd-t-norm min --cqd-k 16

And here's for 2u:

PYTHONPATH=. python3 main.py --do_valid --do_test --data_path data/FB15k-237-betae -n 1 -b 1000 -d 1000 -lr 0.1 --cpu_num 0 --geo cqd --tasks 2u --print_on_screen --test_batch_size 1 --checkpoint_path models/fb15k-237-betae --cqd discrete --cqd-t-norm min --cqd-normalize --cqd-k 16

You should get some nicer numbers, e.g.

$ PYTHONPATH=. python3 main.py --do_valid --do_test --data_path data/FB15k-betae -d 1000 --cpu_num 0 --geo cqd --tasks up --print_on_screen --test_batch_size 1 --checkpoint_path models/fb15k-betae --cqd discrete --cqd-t-norm min --cqd-sigmoid --cqd-k 8192 --cuda

[..]

2021-09-16 10:47:13,976 INFO     Test up-DNF MRR at step 99999: 0.593980
2021-09-16 10:47:13,976 INFO     Test up-DNF HITS1 at step 99999: 0.524867
2021-09-16 10:47:13,976 INFO     Test up-DNF HITS3 at step 99999: 0.629156
2021-09-16 10:47:13,976 INFO     Test up-DNF HITS10 at step 99999: 0.721590
2021-09-16 10:47:13,976 INFO     Test up-DNF num_queries at step 99999: 8000.000000
2021-09-16 10:47:14,620 INFO     Test average MRR at step 99999: 0.593980
2021-09-16 10:47:14,620 INFO     Test average HITS1 at step 99999: 0.524867
2021-09-16 10:47:14,620 INFO     Test average HITS3 at step 99999: 0.629156
2021-09-16 10:47:14,621 INFO     Test average HITS10 at step 99999: 0.721590
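
When comparing several runs, it can help to pull these numbers out of the log programmatically. Here is a minimal sketch, assuming the metric lines keep the format shown above (with or without the timestamp prefix); `parse_metrics` is a hypothetical helper, not part of KGReasoning, and it only looks at lines containing "Test".

```python
import re

# Matches e.g. "... Test up-DNF MRR at step 99999: 0.593980".
LINE_RE = re.compile(
    r"Test (\S+) (MRR|HITS\d+|num_queries) at step (\d+): ([0-9.]+)"
)


def parse_metrics(log_text):
    """Return {query_type: {metric_name: value}} for every metric line found."""
    metrics = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # Not a metric line.
        query_type, name, _step, value = match.groups()
        metrics.setdefault(query_type, {})[name] = float(value)
    return metrics


# Lines copied from the logs in this thread.
sample = """\
2021-09-16 10:47:13,976 INFO     Test up-DNF MRR at step 99999: 0.593980
2021-09-16 10:47:13,976 INFO     Test up-DNF HITS10 at step 99999: 0.721590
Test 2u-DNF MRR at step 99999: 0.005257
"""
print(parse_metrics(sample))
```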

> Since there is no issue board at https://github.com/pminervini/KGReasoning I thought I could write it here and tag @pminervini 😃

Thank you, I had no idea there was no issue board for https://github.com/pminervini/KGReasoning. Maybe the project has to be owned by an org to have one -- will look into it!

pminervini commented 3 years ago

But in general, choices such as whether to use --cqd-sigmoid-scores or --cqd-normalize-scores, and the value of --cqd-k, are hyperparameters: we select the best choice for each query type based on the results on the validation set, and report the results on the test set using the best configuration. Trying out one configuration takes a few seconds, so it's not a costly process. The choice of --cqd-k pretty much depends on how much GPU memory you can use! :)
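
The per-query-type selection described here can be sketched as follows. This is only an illustration of the procedure, not code from the repository: `best_config_per_query` is a hypothetical helper, the configuration labels are free-form strings, and the validation MRR values are made up.

```python
def best_config_per_query(valid_mrr):
    """Pick, for each query type, the configuration with the highest validation MRR.

    valid_mrr maps a configuration label to {query type: validation MRR}.
    Returns {query type: (best configuration label, its validation MRR)}.
    """
    best = {}
    for config, per_query in valid_mrr.items():
        for query_type, mrr in per_query.items():
            if query_type not in best or mrr > best[query_type][1]:
                best[query_type] = (config, mrr)
    return best


# Made-up validation numbers, for illustration only.
valid_mrr = {
    "t-norm=min, k=16": {"2u": 0.30, "up": 0.25},
    "t-norm=prod, k=16": {"2u": 0.28, "up": 0.27},
}
print(best_config_per_query(valid_mrr))
```

The test-set numbers would then be reported only for the configuration chosen for each query type, so the test set never influences the choice.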

dfdazac commented 3 years ago

Hi @migalkin! For CQD-CO the main hyperparameters we experimented with were the choice of t-norm and the regularization coefficient. These were the best based on the validation set performance:

FB15k

python main.py --cuda --do_valid --do_test \
--data_path data/FB15k-betae -n 1 -b 1000 -d 1000 \
--geo cqd --tasks '1p.2p.3p.2i.3i.ip.pi.2u.up' \
--print_on_screen --test_batch_size 1000 \
--checkpoint_path 'models/fb15k-betae' \
--cqd continuous --cqd-t-norm prod --reg_weight 0.001 

Results:

Test 2u-DNF MRR at step 99999: 0.417489
Test 2u-DNF HITS1 at step 99999: 0.307748
Test 2u-DNF HITS3 at step 99999: 0.468769
Test 2u-DNF HITS10 at step 99999: 0.633633
Test 2u-DNF num_queries at step 99999: 8000.000000
Test up-DNF MRR at step 99999: 0.220525
Test up-DNF HITS1 at step 99999: 0.154713
Test up-DNF HITS3 at step 99999: 0.236309
Test up-DNF HITS10 at step 99999: 0.345788
Test up-DNF num_queries at step 99999: 8000.000000

FB15k-237

python main.py --cuda --do_valid --do_test \
--data_path data/FB15k-237-betae -n 1 -b 1000 -d 1000 \
--geo cqd --tasks '1p.2p.3p.2i.3i.ip.pi.2u.up' \
--print_on_screen --test_batch_size 1000 \
--checkpoint_path 'models/fb15k-237-betae' \
--cqd continuous --cqd-t-norm prod --reg_weight 0.001 

Results:

Test 2u-DNF MRR at step 99999: 0.145097
Test 2u-DNF HITS1 at step 99999: 0.085846
Test 2u-DNF HITS3 at step 99999: 0.149461
Test 2u-DNF HITS10 at step 99999: 0.259410
Test 2u-DNF num_queries at step 99999: 5000.000000
Test up-DNF MRR at step 99999: 0.082233
Test up-DNF HITS1 at step 99999: 0.040649
Test up-DNF HITS3 at step 99999: 0.080860
Test up-DNF HITS10 at step 99999: 0.161372
Test up-DNF num_queries at step 99999: 5000.000000

NELL

python main.py --cuda --do_valid --do_test \
--data_path data/NELL-betae -n 1 -b 1000 -d 1000 \
--geo cqd --tasks '1p.2p.3p.2i.3i.ip.pi.2u.up' \
--print_on_screen --test_batch_size 1000 \
--checkpoint_path 'models/nell-betae' \
--cqd continuous --cqd-t-norm prod --reg_weight 0.001 

Results:

Test 2u-DNF MRR at step 99999: 0.172942
Test 2u-DNF HITS1 at step 99999: 0.099039
Test 2u-DNF HITS3 at step 99999: 0.186822
Test 2u-DNF HITS10 at step 99999: 0.320322
Test 2u-DNF num_queries at step 99999: 4000.000000
Test up-DNF MRR at step 99999: 0.131610
Test up-DNF HITS1 at step 99999: 0.074683
Test up-DNF HITS3 at step 99999: 0.139782
Test up-DNF HITS10 at step 99999: 0.243873
Test up-DNF num_queries at step 99999: 4000.000000

So essentially all three use the product t-norm and a regularization weight of 0.001.
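
For readers unfamiliar with the two options of --cqd-t-norm used in this thread, here is a minimal sketch of the textbook definitions of the min and product t-norms and their dual t-conorms (the latter being what a union query needs). It assumes scores already lie in [0, 1] and is not the repository's implementation; the function names are made up.

```python
def t_norm(a, b, kind):
    """Conjunction (AND) of two scores in [0, 1]."""
    if kind == "min":
        return min(a, b)
    if kind == "prod":
        return a * b
    raise ValueError(f"unknown t-norm: {kind}")


def t_conorm(a, b, kind):
    """Disjunction (OR) dual to the t-norm: 1 - t_norm(1 - a, 1 - b)."""
    if kind == "min":
        return max(a, b)
    if kind == "prod":
        return a + b - a * b
    raise ValueError(f"unknown t-norm: {kind}")


for kind in ("min", "prod"):
    print(kind, t_norm(0.5, 0.5, kind), t_conorm(0.5, 0.5, kind))
```

Both definitions only behave like AND/OR when the inputs are in [0, 1], which is presumably where the score normalization and sigmoid options come in.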

migalkin commented 3 years ago

Thank you guys 😊
I managed to get much better numbers with those settings! The key was to use --cqd-t-norm min for union queries with Beam, while the results for the default CO settings correspond well to @dfdazac's numbers.

pminervini commented 3 years ago

@migalkin btw I just managed to enable Issues on https://github.com/pminervini/KGReasoning/ ! :)

migalkin commented 3 years ago

Perfect, thank you! Could you please transfer this issue to that repo's issue board?

pminervini commented 3 years ago

@migalkin wish I knew how to do that 😅

migalkin commented 3 years ago

@pminervini there is a "Transfer issue" button in the right-hand side panel under the Participants section; repo admins should see it and have the rights :) No probs if it remains here though