About the activation function

TimDettmers / ConvE

Convolutional 2D Knowledge Graph Embeddings resources

MIT License

675 stars 163 forks

Hi! Thanks for raising this issue. While the logistic sigmoid should be the mathematically correct choice, I have heard before that a softmax actually performs better in practice, and some authors use it for that reason. The focal loss, or the enhanced version suggested by @saeedizade, might be even better. For the experiments in your paper, I would suggest using a better framework with more baselines across different models. I would recommend PyKEEN, which is actively developed and has a ConvE baseline. It should give you very robust baselines that are easy to compare against.
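To make the three options concrete, here is a minimal pure-Python sketch of how they differ on one query scored against a handful of candidate entities. It is only an illustration, not the code from this repository or from PyKEEN: the function names, the toy `logits` and `labels`, and `gamma=2.0` are all assumptions, and the focal loss shown is the plain binary form, not the enhanced variant mentioned above.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def bce_loss(logits, labels):
    # Sigmoid + binary cross-entropy: each candidate is an independent
    # yes/no decision; the loss is averaged over all candidates.
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)

def softmax_ce_loss(logits, labels):
    # Softmax + cross-entropy: candidates compete for probability mass.
    # Assumption: the target is the label vector normalized to sum to 1,
    # so at least one label must be positive.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    n_pos = sum(labels)
    return -sum(y / n_pos * (z - log_z) for z, y in zip(logits, labels))

def focal_loss(logits, labels, gamma=2.0):
    # Plain binary focal loss: each cross-entropy term is scaled by
    # (1 - p_t)^gamma, which down-weights easy, well-classified candidates.
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        p_t = p if y == 1 else 1 - p
        total -= (1 - p_t) ** gamma * math.log(p_t + eps)
    return total / len(logits)

# Hypothetical scores for four candidate tail entities; only the first is true.
logits = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 0, 0]

print("sigmoid + BCE:", bce_loss(logits, labels))
print("softmax + CE: ", softmax_ce_loss(logits, labels))
print("focal loss:   ", focal_loss(logits, labels))
```

With `gamma=0` the focal loss reduces to the sigmoid + BCE loss, so it can be read as a reweighted version of the mathematically motivated choice, while the softmax variant changes the assumption from independent candidates to competing ones.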

About the activation function #83