malllabiisc / EmbedKGQA

ACL 2020: Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings
Apache License 2.0
417 stars 95 forks

KG embedding #11

Closed: Wangyinquan closed this issue 4 years ago

Wangyinquan commented 4 years ago

Could you provide the settings for the KG embedding module? i.e., how can I get E.npy and R.npy with your train_embedding code? I cannot train it properly with the default settings.

apoorvumang commented 4 years ago

The train_embedding code is for training on large KGs such as Freebase. For MetaQA, we recommend using code such as https://github.com/ibalazevic/TuckER to get the embeddings and saving them in dictionary format as E.npy and R.npy (this is how we did it). Also, since the TuckER code uses batch normalization, you need to save the batch normalization parameters as well.
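
For illustration, an export step along these lines could produce the .npy files (a rough sketch only: the attribute names E, R and bn0 are assumptions, not the exact names in the TuckER code, and the issue mentions a dictionary format whose exact layout may differ from the plain matrices saved here):

    import numpy as np

    # 'model' is a trained TuckER/ComplEx-style PyTorch model; 'E', 'R' and 'bn0'
    # are assumed attribute names (entity nn.Embedding, relation nn.Embedding,
    # and a BatchNorm1d layer) and may differ in the actual training code.
    np.save('E.npy', model.E.weight.detach().cpu().numpy())   # entity embedding matrix
    np.save('R.npy', model.R.weight.detach().cpu().numpy())   # relation embedding matrix

    # The batch-norm parameters and statistics must be exported too, otherwise the
    # scoring function used at QA time differs from the one used during training.
    bn = model.bn0
    np.save('bn0.npy', {
        'weight': bn.weight.detach().cpu().numpy(),
        'bias': bn.bias.detach().cpu().numpy(),
        'running_mean': bn.running_mean.cpu().numpy(),
        'running_var': bn.running_var.cpu().numpy(),
    })
    # np.save pickles the dict; load it with np.load('bn0.npy', allow_pickle=True).item()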

Wangyinquan commented 4 years ago

Why is the trained model from train_embedding so weak when I set do_batch_norm to 0? Best valid: [0.05677566687091254, 4651.813471502591, 0.10436713545521836, 0.06439674315321983, 0.03195164075993091]

It works well if I set do_batch_norm to 1.

apoorvumang commented 4 years ago

This is something that took us a while to figure out as well. Apparently https://github.com/ibalazevic/TuckER uses batch normalization in its implementation, and batch normalization carries parameters of its own, namely the learned weight and bias plus the running mean and running variance (https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html). If batch normalization is used while training the embeddings, we must use it in our model as well, because otherwise the scoring function changes. I believe that with an embedding-training implementation that does not use batch norm, we can get away without it.

(You may have seen in the code that we load some bn parameters as well, along with E.npy)

Here, however, it is necessary if we want to keep the same scoring function that was used while training the embeddings (the TuckER code).
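
To make the "scoring function changes" point concrete, here is a rough ComplEx-style scorer (not the repository's actual code) with an optional, frozen batch-norm step on the head embedding; embeddings trained with that step but scored without it are evaluated under a different function:

    import torch

    def complex_score(head, rel, E, bn0=None):
        # head, rel: (d,) vectors; E: (num_entities, d) matrix of candidate tails.
        # The first/second halves of each vector are treated as real/imaginary parts.
        if bn0 is not None:
            # bn0 is a BatchNorm1d in eval mode: it rescales and shifts the head
            # embedding using its learned weight/bias and running statistics,
            # so dropping it changes every score.
            head = bn0(head.unsqueeze(0)).squeeze(0)
        re_h, im_h = head.chunk(2)
        re_r, im_r = rel.chunk(2)
        re_E, im_E = E.chunk(2, dim=1)
        re_hr = re_h * re_r - im_h * im_r
        im_hr = re_h * im_r + im_h * re_r
        # Re(<h * r, conj(t)>) for every candidate tail t
        return re_E @ re_hr + im_E @ im_hr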

apoorvumang commented 4 years ago

You could also try setting do_batch_norm to 1 and then freezing the batch norm parameters. That way no real batch normalization is performed (the parameters are fixed and don't depend on the incoming data), but the code should still work fairly well.
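
A rough sketch of what freezing the batch-norm layers could look like in PyTorch (assuming the QA model contains BatchNorm1d layers loaded with the parameters saved during embedding training):

    import torch

    def freeze_batch_norm(model):
        for module in model.modules():
            if isinstance(module, torch.nn.BatchNorm1d):
                module.eval()  # use the stored running_mean/running_var, not batch statistics
                if module.affine:
                    module.weight.requires_grad_(False)
                    module.bias.requires_grad_(False)

    # Note: model.train() switches BatchNorm layers back to training mode,
    # so freeze_batch_norm(model) has to be called again after every model.train().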

Wangyinquan commented 4 years ago

If I set do_batch_norm to 0, the KGE module (the train_embedding code) should be the same as the original ComplEx, right? This result is from the KGE module rather than from QA. I trained for 500 epochs with do_batch_norm=0 and got this:
Best valid: [0.05677566687091254, 4651.813471502591, 0.10436713545521836, 0.06439674315321983, 0.03195164075993091]

apoorvumang commented 4 years ago

Oh got it, sorry for the misunderstanding. Try the following command:

 CUDA_VISIBLE_DEVICES=3 python main.py --dataset MetaQA --num_iterations 500 --batch_size 256 \
                                       --lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.2 \
                                       --hidden_dropout1 0.3 --hidden_dropout2 0.3 --label_smoothing 0.1 \
                                       --valid_steps 10 --model ComplEx \
                                       --loss_type BCE --do_batch_norm 0 --l3_reg 0.001

I got the following results: [screenshot of intermediate results attached to the original comment]

(Not yet converged, just showing midway result)

Wangyinquan commented 4 years ago

Thanks, it works well

sharon-gao commented 4 years ago

> Oh got it, sorry for the misunderstanding. Try the following command: (the ComplEx command quoted above)

Hi! I got quite different results when setting the model to 'SimplE' on the 'MetaQA' and 'MetaQA_half' datasets respectively.

With 'MetaQA':
Best valid: [0.8958811624412727, 288.4449790278806, 0.9332593140883296, 0.918085368862571, 0.8696027633851469]
Best Test: [0.8883614038125973, 337.9813764183522, 0.9311790823877651, 0.9124321657622102, 0.8595214602861372]
Dataset: MetaQA
Model: SimplE

With 'MetaQA_half':
Best valid: [0.09818482275952602, 9011.535875, 0.1655, 0.107125, 0.06425]
Best Test: [0.0959381995657658, 8872.50025, 0.159125, 0.10625, 0.062125]
Dataset: MetaQA_half
Model: SimplE

The hyperparameters are the same in both runs. Have you ever come across this problem?

maxinsi commented 5 months ago

Hello, may I ask what the problem might be: after training with the hyperparameters you provide, the MRR is only about 0.5, the mean rank is quite high, and Hits@10 is not very high either.

+--------------------+--------------------+
| Metric             | Result             |
+--------------------+--------------------+
| Hits@10            | 0.7760236803157375 |
| Hits@3             | 0.6213616181549088 |
| Hits@1             | 0.4037987173162309 |
| MeanRank           | 95.99296990626542  |
| MeanReciprocalRank | 0.5335250755543975 |
+--------------------+--------------------+

+------------+------------------------------------------------------------------------------------------------------+
| ARTIFACT   | VALUE                                                                                                |
+------------+------------------------------------------------------------------------------------------------------+
| Best valid | [0.537612634856118, 91.54453491241055, 0.7783123612139157, 0.6292869479397977, 0.4066123858869973]  |
| Best test  | [0.5332891002897144, 95.93339911198817, 0.7763936852491367, 0.620004933399112, 0.40281203749383326] |
| Dataset    | MetaQA                                                                                               |
| Model      | ComplEx                                                                                              |
+------------+------------------------------------------------------------------------------------------------------+

Training-time: 8.57

+-----------------+--------+
| Parameter       | Value  |
+-----------------+--------+
| Learning rate   | 0.0005 |
| Decay           | 1.0    |
| Dim             | 200    |
| Input drop      | 0.2    |
| Hidden drop 2   | 0.3    |
| Label Smoothing | 0.1    |
| Batch size      | 256    |
| Loss type       | BCE    |
| L3 reg          | 0.001  |
+-----------------+--------+