Hi,
I am looking at the code for the attention mechanism and wondering what kind of attention is used in the model. In those screenshots you define attention with a dense layer. Is it soft attention or hard attention, or is the dense layer used to simulate attention here?
Thanks.
This is soft attention, based on a learnable score produced by att_dense. The sigmoid activation maps the score to a value between 0 and 1, which indicates how strongly each rule is selected.
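A minimal sketch of that scoring pattern, assuming a TensorFlow/Keras setup; the shapes and all names other than att_dense are illustrative assumptions, not the actual model code:

```python
import tensorflow as tf

# Toy shapes, purely illustrative (assumptions, not taken from the real model).
batch, num_rules, rule_dim = 2, 5, 8
rule_features = tf.random.normal((batch, num_rules, rule_dim))

# att_dense: one learnable score per rule, squashed into (0, 1) by the sigmoid.
att_dense = tf.keras.layers.Dense(1, activation="sigmoid", name="att_dense")
scores = att_dense(rule_features)          # shape (batch, num_rules, 1)

# Soft attention: each rule is weighted by its continuous score instead of
# being kept or discarded outright (which would be hard attention).
weighted_rules = rule_features * scores    # broadcasts over the feature dim

print(scores.shape, weighted_rules.shape)  # (2, 5, 1) (2, 5, 8)
```

Because the scores stay continuous, the selection remains differentiable and can be trained end-to-end with gradient descent, unlike a hard 0/1 selection.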