uclnlp / inferbeddings

Injecting Background Knowledge in Neural Models via Adversarial Set Regularisation

Results (08/02/2017) #13

Closed pminervini closed 7 years ago

pminervini commented 7 years ago

Some early results are available here:

http://data.neuralnoise.com/inferbeddings/logs_08022017.tar.gz

Just decompress the file in the inferbeddings directory.

Those results were generated by jobs on the UCLCS cluster; the scripts that generate the jobs have a UCL_ prefix and are available here:

https://github.com/uclmr/inferbeddings/tree/master/scripts/wn18
https://github.com/uclmr/inferbeddings/tree/master/scripts/fb15k

To check the results, I wrote a script that scans a set of log files and, for each metric, prints the best filtered test value together with the log file (and hence the hyperparameter configuration) that achieved it.
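The actual tool is tools/parse_results_filtered.sh; purely as an illustration, here is a rough Python sketch of the idea (the "Test - Best Filt <metric>: <value>" line pattern is an assumption based on the example output below, and the real script may work differently):

```python
# Rough sketch (illustration only) of the idea behind tools/parse_results_filtered.sh:
# scan a set of log files and, for each metric, keep the run with the best
# filtered test score; the log-line pattern below is an assumption.
import re
import sys

def best_runs(log_paths):
    best = {}  # metric name -> (value, log file path)
    for path in log_paths:
        with open(path) as f:
            for line in f:
                match = re.search(r'Test - Best Filt (\S+): ([\d.]+)', line)
                if match is None:
                    continue
                metric, value = match.group(1), float(match.group(2))
                lower_is_better = (metric == 'MR')  # MR is minimised, the other metrics are maximised
                if (metric not in best
                        or (lower_is_better and value < best[metric][0])
                        or (not lower_is_better and value > best[metric][0])):
                    best[metric] = (value, path)
    return best

if __name__ == '__main__':
    for metric, (value, path) in best_runs(sys.argv[1:]).items():
        print('Best {}, Filt: {}'.format(metric, path))
        print('Test - Best Filt {}: {}'.format(metric, value))
```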

For example, here are the results on WN18 with and without the rules:

$ ./tools/parse_results_filtered.sh logs/ucl_wn18_adv_v1/*.log
1080
Best MR, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=1_adv_epochs=1_adv_lr=0.1_adv_weight=100_batches=10_disc_epochs=10_embedding_size=200_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l2.log
Test - Best Filt MR: 140.9154

Best MRR, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=1_adv_epochs=10_adv_lr=0.1_adv_weight=10000_batches=10_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt MRR: 0.493

Best H@1, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=1_adv_epochs=10_adv_lr=0.1_adv_weight=10000_batches=10_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@1: 32.78%

Best H@3, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=10_adv_epochs=10_adv_lr=0.1_adv_weight=100_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@3: 84.57%

Best H@5, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=10_adv_epochs=10_adv_lr=0.1_adv_weight=100_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@5: 90.78%

Best H@10, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=10_adv_epochs=10_adv_lr=0.1_adv_weight=100_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@10: 93.06%

Without rules:

$ ./tools/parse_results_filtered.sh logs/ucl_wn18_adv_v1/*_adv_weight=0_*.log
180
Best MR, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=1_adv_epochs=0_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=10_embedding_size=200_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l2.log
Test - Best Filt MR: 146.8016

Best MRR, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=100_adv_epochs=1_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt MRR: 0.372

Best H@1, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=100_adv_epochs=1_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=1_embedding_size=20_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l2.log
Test - Best Filt Hits@1: 16.62%

Best H@3, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=100_adv_epochs=1_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@3: 60.31%

Best H@5, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=100_adv_epochs=1_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@5: 70.16%

Best H@10, Filt: logs/ucl_wn18_adv_v1/ucl_wn18_adv_v1.adv_batch_size=1_adv_epochs=1_adv_lr=0.1_adv_weight=0_batches=10_disc_epochs=10_embedding_size=50_epochs=100_lr=0.1_margin=1_model=TransE_optimizer=adagrad_similarity=l1.log
Test - Best Filt Hits@10: 79.39%
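
For reference, the "Filt" numbers above are computed under the standard filtered ranking setting: when ranking the correct entity for a test triple, all other triples known to be true are removed from the candidate list before the rank is computed. A minimal sketch (illustration only, not code from this repository) of how MR, MRR and Hits@k follow from a list of such filtered ranks:

```python
# Standard filtered link-prediction metrics, given the filtered rank of the
# correct entity for each test triple: MR is the mean rank, MRR the mean
# reciprocal rank, and Hits@k the fraction of ranks that are at most k
# (reported above as a percentage).
def mean_rank(ranks):
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy example (not real data from the runs above):
ranks = [1, 3, 2, 120, 7]
print(mean_rank(ranks), mean_reciprocal_rank(ranks), hits_at_k(ranks, 10))
```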

Please note that the experiments in logs/ucl_fb15k_adv_v?.2 are still running (most log files are incomplete). Those experiments use a new rule set I'm trying for FB15k, built from clauses with higher support (minimum support 1000 instead of 100); this is related to https://github.com/uclmr/inferbeddings/issues/11
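
Purely as an illustration of that filtering step, here is a minimal Python sketch, assuming that the support of a clause body(X, Y) => head(X, Y) is the number of entity pairs for which both the body and the head hold in the training triples; the actual rule-extraction pipeline may define support differently.

```python
# Hypothetical sketch: filter clauses r_body(X, Y) => r_head(X, Y) by minimum
# support, where support is assumed to be the number of (X, Y) entity pairs
# for which both the body and the head relations hold in the training triples.
from collections import defaultdict

def clause_support(triples, body, head):
    pairs_by_relation = defaultdict(set)  # relation -> set of (subject, object) pairs
    for s, p, o in triples:
        pairs_by_relation[p].add((s, o))
    return len(pairs_by_relation[body] & pairs_by_relation[head])

def filter_clauses(triples, clauses, min_support=1000):
    # keep only (body, head) clauses whose support reaches the threshold
    return [(body, head) for body, head in clauses
            if clause_support(triples, body, head) >= min_support]
```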