TimDettmers / ConvE

Convolutional 2D Knowledge Graph Embeddings resources
MIT License
675 stars 163 forks

About the activation function #83

Open HammerWang98 opened 2 years ago

HammerWang98 commented 2 years ago

Hello, Tim. Have you experimented with replacing the sigmoid with a softmax in the logits layer? I ran your code and got a lower MRR than the result reported in your paper with the sigmoid. When I changed it to a softmax, I got a higher MRR than yours. I want to cite your paper in our experiments. Could you tell me how to address this discrepancy and use your result as our baseline? Thank you, looking forward to your reply.

TimDettmers commented 2 years ago

Hi! Thanks for raising this issue. While mathematically the logistic sigmoid should be the right thing to do, I have heard before that a softmax actually performs better in practice, and some authors use it. The focal loss or the enhanced version suggested by @saeedizade might be even better. For the experiments in your paper, I would suggest using a better framework with more baselines across different models. I would recommend PyKEEN, which is actively developed and has a ConvE baseline. It should give you very robust baselines that make it easy to compare against.
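
For readers following the thread, here is a minimal sketch of the two output-layer choices being compared. It is not code from this repository: the function names (`softplus`, `sigmoid_bce_loss`, `softmax_ce_loss`) and the toy scores are hypothetical, it uses plain Python rather than PyTorch, and it leaves out details such as label smoothing. The sigmoid variant treats each candidate entity as an independent binary decision, while the softmax variant makes all candidates compete for a single probability mass.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid_bce_loss(scores, targets):
    """Mean binary cross-entropy with an independent sigmoid per candidate.

    scores: raw logits, one per candidate entity.
    targets: 0/1 labels, one per candidate (several may be 1).
    """
    total = 0.0
    for s, t in zip(scores, targets):
        log_p = -softplus(-s)      # log(sigmoid(s))
        log_not_p = -softplus(s)   # log(1 - sigmoid(s))
        total -= t * log_p + (1 - t) * log_not_p
    return total / len(scores)


def softmax_ce_loss(scores, target_index):
    """Cross-entropy with a softmax over all candidates.

    Assumes exactly one correct entity, given by target_index.
    """
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target_index]


# Toy example (hypothetical logits): candidate 0 is the correct entity.
scores = [2.0, -1.0, 0.5]
bce = sigmoid_bce_loss(scores, [1, 0, 0])
ce = softmax_ce_loss(scores, 0)
```

One practical difference: the sigmoid loss handles several correct tails for the same (head, relation) pair in one pass, whereas the softmax loss as sketched assumes a single correct entity per example, so the two setups are not directly interchangeable without deciding how multi-label targets are handled.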